caraml is a compiler for a sweet and simple language inspired by Standard ML, written in Haskell.
Try it out online: playground.caraml.valerianclerc.com

Playground source: github.com/ValerianClerc/caraml-playground
- Types: Strong, static typing with type inference. Supports int and bool.
- Functions: Recursive functions, multi-argument support.
- Control Flow: if-then-else expressions.
- Bindings: let expressions for variable scoping.
- Compilation: Compiles to LLVM IR for efficient execution.
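To make the typing feature concrete, here is a minimal Haskell sketch of bottom-up type inference for a language with only int and bool, covering if-then-else and let. The AST, the constructor names, and the `infer`/`expect` functions are hypothetical and written for illustration; caraml's actual implementation may use a different representation and algorithm.

```haskell
-- Hypothetical mini-AST for illustration; not caraml's real AST.
data Type = TInt | TBool
  deriving (Eq, Show)

data Expr
  = IntLit Int
  | BoolLit Bool
  | Var String
  | Add Expr Expr
  | Leq Expr Expr
  | If Expr Expr Expr
  | Let String Expr Expr
  deriving (Show)

-- Typing environment: variable names paired with their inferred types.
type Env = [(String, Type)]

-- Infer the type of an expression, or report the first error found.
infer :: Env -> Expr -> Either String Type
infer _ (IntLit _) = Right TInt
infer _ (BoolLit _) = Right TBool
infer env (Var x) =
  maybe (Left ("unbound variable: " ++ x)) Right (lookup x env)
infer env (Add a b) = do
  expect env TInt a
  expect env TInt b
  Right TInt
infer env (Leq a b) = do
  expect env TInt a
  expect env TInt b
  Right TBool
infer env (If c t e) = do
  -- The condition must be a bool and both branches must agree.
  expect env TBool c
  tt <- infer env t
  expect env tt e
  Right tt
infer env (Let x rhs body) = do
  -- The bound variable takes the type inferred for its right-hand side.
  tr <- infer env rhs
  infer ((x, tr) : env) body

-- Check that an expression has the expected type.
expect :: Env -> Type -> Expr -> Either String ()
expect env want e = do
  got <- infer env e
  if got == want
    then Right ()
    else Left ("expected " ++ show want ++ " but got " ++ show got)

main :: IO ()
main = do
  -- let x = 3 in if x <= 1 then x else x + 1
  print (infer [] (Let "x" (IntLit 3)
    (If (Leq (Var "x") (IntLit 1)) (Var "x") (Add (Var "x") (IntLit 1)))))
  -- if 1 then 2 else 3 is rejected because the condition is not a bool
  print (infer [] (If (IntLit 1) (IntLit 2) (IntLit 3)))
```

The first expression is accepted as an int without any annotation on x; the second is rejected before any code would be generated, which is the guarantee static typing is meant to give.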
caraml leverages the power of LLVM to achieve cross-platform compatibility. By compiling to LLVM Intermediate Representation (IR), caraml programs can be compiled to native machine code for virtually any architecture supported by LLVM, including:
- x86_64 (Linux, macOS, Windows)
- ARM64 (Apple Silicon, Mobile, Raspberry Pi)
- WebAssembly (Run in the browser, like in the online playground)
Decoupling the language frontend from machine code generation keeps caraml portable and lets it rely on LLVM's existing optimization passes and backends.
caraml follows a classic compiler pipeline architecture:
- Lexer (src/Lexer.hs): Tokenizes the source code.
- Parser (src/Parser.hs): Constructs an Abstract Syntax Tree (AST) from tokens.
- Type Inference (src/TypeInfer.hs): Annotates the AST with types, ensuring type safety.
- LLVM Codegen (src/ToLlvm.hs): Translates the typed AST into LLVM IR.
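The four stages above can be sketched as a chain of small functions. The example below is a toy, assuming a language of integer literals and `+` only; every name in it (`tokenize`, `parse`, `annotate`, `codegen`, `compile`) is hypothetical and does not mirror the actual modules. The emitted text is only LLVM-flavoured and is not guaranteed to be accepted by LLVM tools as written.

```haskell
import Data.Char (isDigit, isSpace)

-- Stage 1: lexer. Tokens for a toy language of integers and '+'.
data Token = TNum Int | TPlus
  deriving (Eq, Show)

tokenize :: String -> Either String [Token]
tokenize [] = Right []
tokenize (c : cs)
  | isSpace c = tokenize cs
  | c == '+' = (TPlus :) <$> tokenize cs
  | isDigit c =
      let (digits, rest) = span isDigit (c : cs)
       in (TNum (read digits) :) <$> tokenize rest
  | otherwise = Left ("unexpected character: " ++ [c])

-- Stage 2: parser. Left-associative AST for: expr ::= num ('+' num)*
data Expr = Num Int | Add Expr Expr
  deriving (Eq, Show)

parse :: [Token] -> Either String Expr
parse (TNum n : rest) = go (Num n) rest
  where
    go acc [] = Right acc
    go acc (TPlus : TNum m : more) = go (Add acc (Num m)) more
    go _ toks = Left ("unexpected tokens: " ++ show toks)
parse toks = Left ("expected a number, got: " ++ show toks)

-- Stage 3: type annotation. A placeholder: everything in this toy is an int.
data Type = TInt
  deriving (Eq, Show)

data Typed = Typed Type Expr
  deriving (Eq, Show)

annotate :: Expr -> Either String Typed
annotate e = Right (Typed TInt e)

-- Stage 4: code generation.
-- emit returns (operand holding the result, next free temp index, lines).
emit :: Int -> Expr -> (String, Int, [String])
emit n (Num k) = (show k, n, [])
emit n (Add a b) =
  let (oa, n1, la) = emit n a
      (ob, n2, lb) = emit n1 b
      tmp = "%t" ++ show n2
   in (tmp, n2 + 1, la ++ lb ++ [tmp ++ " = add i64 " ++ oa ++ ", " ++ ob])

codegen :: Typed -> [String]
codegen (Typed _ e) =
  let (result, _, body) = emit 0 e
   in body ++ ["ret i64 " ++ result]

-- The whole pipeline: each stage feeds the next, and the first error wins.
compile :: String -> Either String [String]
compile src = codegen <$> (tokenize src >>= parse >>= annotate)

main :: IO ()
main = either putStrLn (mapM_ putStrLn) (compile "1 + 2 + 3")
```

Running each stage in `Either String` means a failure in any stage short-circuits the rest, which is one common way to keep the stages independent of each other.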
The language grammar is defined in Extended Backus-Naur Form (EBNF). You can find the full specification in design/mini_ebnf.txt.
fun fib(n: int) =
  if n <= 1 then n
  else fib(n-1) + fib(n-2);
let result = fib(10);

fun fact(n: int) =
  if n = 0 then 1
  else n * fact(n - 1);

Demonstrates multiple arguments and arithmetic logic (using integer division for modulo).
fun gcd(a: int, b: int) =
  if b = 0 then a
  else gcd(b, a - (a / b) * b);
let result = gcd(1071, 462);

To run the test suite, run:

cabal test

To compile a code file at the path test/file.cml, run:

cabal run caraml-exe -- test/file.cml

To run a specific step of the compiler, specify it with a flag:

cabal run caraml-exe -- --lexer test/file.cml

The generated LLVM IR calls small runtime helpers (printint, printbool). When you install the caraml executable, the C runtime source file (runtime.c) is shipped as a data file. You can retrieve a copy at any time with:

cabal run caraml-exe -- --emit-runtime runtime.c

Then compile your program like so:
# Generate LLVM IR
cabal run caraml-exe -- --llvm test/file.cml # produces test/file.ll
# Build native binary (clang will compile both IR and runtime C)
clang -Wno-override-module -lm test/file.ll runtime.c -o a.out

Or run directly via the built-in driver (which automatically locates the installed runtime):

cabal run caraml-exe -- --compile-and-run test/file.cml

.github/workflows/   # GitHub Actions config
app/                 # executable file
design/              # grammar definition of the language
src/                 # majority of source code
test/                # test suite
caraml.cabal         # config file for the cabal package
- clang
- LLVM, version 15? (possibly only needed at build time)
- Currently I'm using GHC 8.10.7, because the llvm-hs-pretty package has some breaking components. In the future, I'd like to move back to GHC 9. Here are the things that I'd need to do:
  - Switch the llvm-hs-pretty branch to the llvm-12 branch
  - Ensure that I'm also using the llvm-12 branch of llvm-hs-pure
- Add make script:
  clang -Wno-override-module -lm test.ll test.c -o a.out
- Compile runtime:
docker run --rm -v $(pwd):/src -u $(id -u):$(id -g) emscripten/emsdk emcc -O2 runtime/runtime.c -c -o runtime/runtime.o
- Compile code:
docker run --rm -v $(pwd):/src -u $(id -u):$(id -g) emscripten/emsdk emcc -O2 --emrun runtime/runtime.o test/file.ll -o fib.html
Unconfirmed: Compile code to be called in React:
docker run --rm -v "$(pwd)":/src -u "$(id -u)":"$(id -g)" emscripten/emsdk emcc test/file.ll -O3 -s WASM=1 -s MODULARIZE=1 -s EXPORT_NAME="createExec" -s INVOKE_RUN=0 -s EXIT_RUNTIME=1 -s ALLOW_MEMORY_GROWTH=1 runtime/runtime.o -o exec.js