218 changes: 218 additions & 0 deletions compiler/TRANSPILER_DEMO.md
@@ -0,0 +1,218 @@
# W Language Transpiler

A transpiler that converts W (a language with Wolfram-like syntax) into Rust source code and then compiles that code with `rustc`, so W programs get Rust's compile-time checks and native performance.

## Implementation Strategy

Following the approach outlined in `REVISED_STRATEGY.md`, this transpiler:
- Converts W code directly to Rust source code
- Maps W types to Rust stdlib types (`List` → `Vec`, `Map` → `HashMap`)
- Relies on `rustc` to type-check and borrow-check the generated code, so the output carries Rust's compile-time safety guarantees
- Generates idiomatic Rust code with proper naming conventions

## Features Implemented

### 1. Function Definitions
W syntax with type annotations transpiles to Rust functions:

**W Code:**
```wolfram
Square[x: int] := x * x
```

**Generated Rust:**
```rust
fn square(x: i32) -> i32 {
    (x * x)
}
```

### 2. Function Calls
Built-in `Print` function maps to Rust's `println!`:

**W Code:**
```wolfram
Print["Hello, World!"]
```

**Generated Rust:**
```rust
fn main() {
    println!("{}", "Hello, World!".to_string());
}
```

### 3. Lists → Vec
Lists automatically convert to Rust vectors:

**W Code:**
```wolfram
Print[[1, 2, 3, 4, 5]]
```

**Generated Rust:**
```rust
fn main() {
    println!("{:?}", vec![1, 2, 3, 4, 5]);
}
```

### 4. Arithmetic Operations
Binary operations with proper precedence:

**W Code:**
```wolfram
Print[2 + 3 * 4]
```

**Generated Rust:**
```rust
fn main() {
    println!("{}", (2 + (3 * 4)));
}
```
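
Because `*` binds tighter than `+`, `2 + 3 * 4` groups as `2 + (3 * 4)` and evaluates to 14. A minimal sketch of how a code generator can preserve that grouping is to parenthesize every binary node it emits. The `Expr` enum and `emit` function below are hypothetical stand-ins, not the types in `ast.rs` or `rust_codegen.rs`, and the tree is built by hand as a precedence-aware parser might build it:

```rust
/// Hypothetical, reduced expression tree (the real AST has more variants).
enum Expr {
    Number(i32),
    Binary(Box<Expr>, char, Box<Expr>),
}

/// Emit Rust source, wrapping every binary operation in parentheses so the
/// grouping chosen by the parser is explicit in the generated code.
fn emit(expr: &Expr) -> String {
    match expr {
        Expr::Number(n) => n.to_string(),
        Expr::Binary(lhs, op, rhs) => format!("({} {} {})", emit(lhs), op, emit(rhs)),
    }
}

/// The tree for `2 + 3 * 4`: the multiplication is the right child of the
/// addition, because `*` has higher precedence than `+`.
fn two_plus_three_times_four() -> Expr {
    let product = Expr::Binary(Box::new(Expr::Number(3)), '*', Box::new(Expr::Number(4)));
    Expr::Binary(Box::new(Expr::Number(2)), '+', Box::new(product))
}
```

Calling `emit(&two_plus_three_times_four())` yields the string `(2 + (3 * 4))`. The extra parentheses are redundant for `rustc` but keep the emitter independent of Rust's own precedence table.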

### 5. Multiple Arguments
`Print` accepts any number of arguments and separates them with spaces in the output:

**W Code:**
```wolfram
Print["The", "answer", "is", 42]
```

**Generated Rust:**
```rust
fn main() {
    println!("{} {} {} {}", "The".to_string(), "answer".to_string(), "is".to_string(), 42);
}
```
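
The format strings in the `Print` examples above use one placeholder per argument, joined by spaces, with `{:?}` for a list and `{}` otherwise. A rough sketch of that selection follows; `ArgKind` and `format_string` are illustrative names that are assumed here, not taken from the transpiler's source:

```rust
/// Hypothetical classification of a `Print` argument by how it can be formatted.
enum ArgKind {
    /// The generated expression implements `Display` (strings, numbers).
    Display,
    /// The generated expression only implements `Debug` (for example `Vec`).
    DebugOnly,
}

/// Build the `println!` format string: one placeholder per argument,
/// separated by single spaces.
fn format_string(args: &[ArgKind]) -> String {
    args.iter()
        .map(|kind| match kind {
            ArgKind::Display => "{}",
            ArgKind::DebugOnly => "{:?}",
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Four `Display` arguments produce `{} {} {} {}`, and a single list argument produces `{:?}`.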

### 6. Nested Function Calls
Functions can be composed:

**W Code:**
```wolfram
Print[Square[5]]
```

**Generated Rust:**
```rust
fn main() {
    println!("{}", square(5));
}
```

## Type System

### Rust-like Defaults
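The defaults follow Rust's conventions:
- Integer literals default to `i32` (not `i64`)
- Float literals default to `f64`
- Lowercase aliases remain for backward compatibility: `int` → `i32`, `float` → `f64`
- Type names are case-sensitive: lowercase `int` maps to `i32`, while capitalized `Int` maps to `isize`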

### Complete Type Mapping

| W Type | Rust Type | Description |
|--------|-----------|-------------|
| **Signed Integers** | | |
| `Int8` | `i8` | 8-bit signed integer |
| `Int16` | `i16` | 16-bit signed integer |
| `Int32` / `int` | `i32` | 32-bit signed (default) |
| `Int64` | `i64` | 64-bit signed integer |
| `Int128` | `i128` | 128-bit signed integer |
| `Int` | `isize` | Pointer-sized signed |
| **Unsigned Integers** | | |
| `UInt8` | `u8` | 8-bit unsigned integer |
| `UInt16` | `u16` | 16-bit unsigned integer |
| `UInt32` | `u32` | 32-bit unsigned integer |
| `UInt64` | `u64` | 64-bit unsigned integer |
| `UInt128` | `u128` | 128-bit unsigned integer |
| `UInt` | `usize` | Pointer-sized unsigned |
| **Floating Point** | | |
| `Float32` | `f32` | 32-bit float |
| `Float64` / `float` | `f64` | 64-bit float (default) |
| **Other Primitives** | | |
| `Bool` / `bool` | `bool` | Boolean |
| `Char` / `char` | `char` | Unicode scalar |
| `String` / `string` | `String` | Owned string |
| **Container Types** | | |
| `List[T]` | `Vec<T>` | Dynamic array |
| `Array[T, N]` | `[T; N]` | Fixed-size array |
| `Slice[T]` | `&[T]` | Slice reference |
| `Map[K,V]` | `HashMap<K, V>` | Hash map |
| `HashSet[T]` | `HashSet<T>` | Hash set |
| `BTreeMap[K,V]` | `BTreeMap<K, V>` | Sorted map |
| `BTreeSet[T]` | `BTreeSet<T>` | Sorted set |
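
The table can be read as a recursive function from a W type to a Rust type string. The sketch below covers only a subset of the variants; the reduced `Type` enum and the `rust_type` function are assumptions for illustration. The fully qualified `std::collections::` paths follow the style seen in the committed `generated.rs`; emitting `Vec` without a path is an assumption.

```rust
/// Reduced, hypothetical version of the `Type` enum (subset of variants).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int32,
    Int64,
    UInt8,
    Float64,
    Bool,
    String,
    List(Box<Type>),
    Array(Box<Type>, usize),
    Slice(Box<Type>),
    Map(Box<Type>, Box<Type>),
    BTreeSet(Box<Type>),
}

/// Render a W type as Rust source text, recursing into element types.
fn rust_type(ty: &Type) -> String {
    match ty {
        Type::Int32 => "i32".to_string(),
        Type::Int64 => "i64".to_string(),
        Type::UInt8 => "u8".to_string(),
        Type::Float64 => "f64".to_string(),
        Type::Bool => "bool".to_string(),
        Type::String => "String".to_string(),
        Type::List(t) => format!("Vec<{}>", rust_type(t)),
        Type::Array(t, n) => format!("[{}; {}]", rust_type(t), n),
        Type::Slice(t) => format!("&[{}]", rust_type(t)),
        Type::Map(k, v) => {
            format!("std::collections::HashMap<{}, {}>", rust_type(k), rust_type(v))
        }
        Type::BTreeSet(t) => format!("std::collections::BTreeSet<{}>", rust_type(t)),
    }
}
```

For example, `Array[UInt8, 256]` becomes `[u8; 256]`, and nesting such as `List[List[Int32]]` falls out of the recursion as `Vec<Vec<i32>>`.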

### Examples

**Primitive Types:**
```wolfram
AddBytes[a: UInt8, b: UInt8] := a + b
BigNum[x: Int64] := x * 2
Precision[x: Float32] := x + 1.5
```

**Container Types:**
```wolfram
ProcessList[items: List[Int32]] := items (* Vec<i32> *)
FixedBuffer[arr: Array[UInt8, 256]] := arr (* [u8; 256] *)
ReadSlice[data: Slice[UInt8]] := data (* &[u8] *)
UniqueWords[words: HashSet[String]] := words (* HashSet<String> *)
SortedIndex[idx: BTreeMap[Int32, String]] := idx (* BTreeMap<i32, String> *)
OrderedSet[nums: BTreeSet[Int64]] := nums (* BTreeSet<i64> *)
```

**Backward Compatible (lowercase):**
```wolfram
Square[x: int] := x * x (* int → i32 *)
Average[x: float] := x / 2.0 (* float → f64 *)
```

## Naming Conventions

The transpiler follows Rust conventions:
- PascalCase function names → snake_case (e.g., `Square` → `square`)
- Type annotations preserved and mapped to Rust types
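
One way to implement the name conversion is sketched below. The rule used here (insert an underscore before an uppercase letter only when the previous character is lowercase or a digit) is an assumption chosen because it reproduces the name in the committed `generated.rs`, where `UseBTreeSet` becomes `use_btree_set`. The transpiler's actual rule may differ.

```rust
/// Convert a PascalCase W function name to a snake_case Rust name.
/// Runs of consecutive capitals stay together (`BTree` becomes `btree`).
fn to_snake_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    let mut prev_lower_or_digit = false;
    for ch in name.chars() {
        if ch.is_uppercase() {
            // Start a new word only at a lowercase-to-uppercase boundary.
            if prev_lower_or_digit {
                out.push('_');
            }
            out.extend(ch.to_lowercase());
            prev_lower_or_digit = false;
        } else {
            out.push(ch);
            prev_lower_or_digit = ch.is_lowercase() || ch.is_ascii_digit();
        }
    }
    out
}
```

With this rule, `Square` maps to `square` and `ProcessList` maps to `process_list`.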

## Usage

```bash
cargo build
./target/debug/w <input_file.w>
# Generates generated.rs and compiles to ./output
./output
```

## Examples

See the `examples/` directory for more demonstrations:
- `hello_world.w` - Basic printing
- `arithmetic.w` - Mathematical operations
- `list_example.w` - List/Vec usage
- `function_def.w` - Function definitions
- `multiple_args.w` - Multiple function arguments

## Architecture

1. **Lexer** (`lexer.rs`): Tokenizes W source code
2. **Parser** (`parser.rs`): Builds AST from tokens with lookahead for disambiguation
3. **Code Generator** (`rust_codegen.rs`): Translates AST to Rust source
4. **Compiler** (`main.rs`): Coordinates the pipeline and invokes `rustc`
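
The last stage can be sketched with `std::process::Command`. The file names `generated.rs` and `output` come from the Usage section; the exact `rustc` arguments and the helper names `rustc_command` and `compile` are assumptions, not a copy of `main.rs`:

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a `rustc` invocation that compiles `source`
/// into the executable `output`.
fn rustc_command(source: &Path, output: &Path) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg(source).arg("-o").arg(output);
    cmd
}

/// Run the compiler and report whether it exited successfully.
/// Returns an I/O error if `rustc` could not be started at all.
#[allow(dead_code)]
fn compile(source: &Path, output: &Path) -> std::io::Result<bool> {
    let status = rustc_command(source, output).status()?;
    Ok(status.success())
}
```

Keeping command construction separate from execution makes the invocation easy to inspect or log before anything is run.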

## Key Implementation Details

- **Parser lookahead**: Uses `peek_token()` to distinguish function calls from identifiers
- **Type inference**: Infers return types from function bodies
- **Debug formatting**: Automatically uses `{:?}` for types that do not implement the `Display` trait, such as `Vec`
- **Expression vs Statement**: Properly generates Rust expressions without trailing semicolons
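
The lookahead point can be illustrated with a one-token peek: after reading an identifier, a following `[` means a call, and anything else means a plain identifier. The sketch below uses the standard library's `Peekable::peek` as a stand-in for the parser's `peek_token()`; the `Token` and `Node` types and `parse_primary` are hypothetical and much smaller than the real lexer and AST.

```rust
use std::iter::Peekable;
use std::slice::Iter;

/// Hypothetical token set, reduced to what the example needs.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String),
    Number(i32),
    LeftBracket,
    RightBracket,
    Comma,
}

/// Hypothetical AST nodes.
#[derive(Debug, PartialEq)]
enum Node {
    Number(i32),
    Identifier(String),
    Call(String, Vec<Node>),
}

/// Parse a number, an identifier, or a call such as `Square[5]`.
/// Returns `None` on malformed or truncated input.
fn parse_primary(tokens: &mut Peekable<Iter<'_, Token>>) -> Option<Node> {
    match tokens.next()? {
        Token::Number(n) => Some(Node::Number(*n)),
        Token::Identifier(name) => {
            // One-token lookahead: `[` directly after a name marks a call.
            if matches!(tokens.peek(), Some(Token::LeftBracket)) {
                tokens.next(); // consume `[`
                let mut args = Vec::new();
                while !matches!(tokens.peek(), Some(Token::RightBracket)) {
                    args.push(parse_primary(tokens)?);
                    if matches!(tokens.peek(), Some(Token::Comma)) {
                        tokens.next(); // consume `,`
                    }
                }
                tokens.next(); // consume `]`
                Some(Node::Call(name.clone(), args))
            } else {
                Some(Node::Identifier(name.clone()))
            }
        }
        _ => None,
    }
}
```

Peeking rather than consuming means the token after a bare identifier is left in place for whichever rule handles it next.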

## Future Enhancements

Potential additions following the `REVISED_STRATEGY.md` vision:
- Pattern matching (`Match` expressions)
- Module system
- Ownership annotations (borrows, moves)
- Iterator/map operations
- Result/Option types for error handling
- Async/await support
25 changes: 25 additions & 0 deletions compiler/examples/all_containers.w
@@ -0,0 +1,25 @@
(*
This file demonstrates all container type annotations.
The function signatures show the W→Rust type mappings.
*)

(* Vec<i32> *)
UseList[items: List[Int32]] := items

(* [i32; 10] - fixed-size array *)
UseArray[buffer: Array[Int32, 10]] := buffer

(* &[u8] - slice reference *)
UseSlice[data: Slice[UInt8]] := data

(* HashMap<String, i32> *)
UseHashMap[mapping: Map[String, Int32]] := mapping

(* HashSet<String> *)
UseHashSet[unique: HashSet[String]] := unique

(* BTreeMap<i32, String> - sorted map *)
UseBTreeMap[sorted: BTreeMap[Int32, String]] := sorted

(* BTreeSet<i64> - sorted set *)
UseBTreeSet[ordered: BTreeSet[Int64]] := ordered
1 change: 1 addition & 0 deletions compiler/examples/arithmetic.w
@@ -0,0 +1 @@
Print[2 + 3 * 4]
1 change: 1 addition & 0 deletions compiler/examples/comprehensive_types.w
@@ -0,0 +1 @@
Print["Testing Int8:", 127]
1 change: 1 addition & 0 deletions compiler/examples/containers_array.w
@@ -0,0 +1 @@
FixedArray[arr: Array[Int32, 5]] := arr
1 change: 1 addition & 0 deletions compiler/examples/containers_btreemap.w
@@ -0,0 +1 @@
SortedMap[data: BTreeMap[Int32, String]] := data
1 change: 1 addition & 0 deletions compiler/examples/containers_btreeset.w
@@ -0,0 +1 @@
SortedSet[items: BTreeSet[Int64]] := items
1 change: 1 addition & 0 deletions compiler/examples/containers_demo.w
@@ -0,0 +1 @@
Print["Testing container type annotations"]
1 change: 1 addition & 0 deletions compiler/examples/containers_hashset.w
@@ -0,0 +1 @@
UniqueItems[items: HashSet[String]] := items
1 change: 1 addition & 0 deletions compiler/examples/containers_list.w
@@ -0,0 +1 @@
ProcessList[items: List[Int32]] := items
1 change: 1 addition & 0 deletions compiler/examples/containers_slice.w
@@ -0,0 +1 @@
ProcessSlice[data: Slice[UInt8]] := data
1 change: 1 addition & 0 deletions compiler/examples/function_def.w
@@ -0,0 +1 @@
Square[x: int] := x * x
1 change: 1 addition & 0 deletions compiler/examples/function_with_call.w
@@ -0,0 +1 @@
Print[Square[5]]
1 change: 1 addition & 0 deletions compiler/examples/list_example.w
@@ -0,0 +1 @@
Print[[1, 2, 3, 4, 5]]
1 change: 1 addition & 0 deletions compiler/examples/multiple_args.w
@@ -0,0 +1 @@
Print["The", "answer", "is", 42]
1 change: 1 addition & 0 deletions compiler/examples/types_demo.w
@@ -0,0 +1 @@
AddBytes[a: UInt8, b: UInt8] := a + b
1 change: 1 addition & 0 deletions compiler/examples/types_float32.w
@@ -0,0 +1 @@
FloatCalc[x: Float32, y: Float32] := x + y
1 change: 1 addition & 0 deletions compiler/examples/types_i64.w
@@ -0,0 +1 @@
BigNumber[x: Int64] := x * 2
3 changes: 3 additions & 0 deletions compiler/generated.rs
@@ -0,0 +1,3 @@
fn use_btree_set(ordered: std::collections::BTreeSet<i64>) -> std::collections::BTreeSet<i64> {
    ordered
}
Binary file added compiler/output
Binary file not shown.
48 changes: 38 additions & 10 deletions compiler/src/ast.rs
@@ -7,18 +7,46 @@ pub enum LogLevel {
     Error,
 }

-#[repr(u8)]
 #[derive(Debug, Clone, PartialEq)]
 #[allow(dead_code)]
 pub enum Type {
-    Int = 0,
-    Float = 1,
-    String = 2,
-    Bool = 3,
-    List(Box<Type>) = 4,
-    Map(Box<Type>, Box<Type>) = 5,
-    Function(Vec<Type>, Box<Type>) = 6,
-    LogLevel = 7,
+    // Signed integers
+    Int8,
+    Int16,
+    Int32,
+    Int64,
+    Int128,
+    Int, // isize
+
+    // Unsigned integers
+    UInt8,
+    UInt16,
+    UInt32,
+    UInt64,
+    UInt128,
+    UInt, // usize
+
+    // Floating point
+    Float32,
+    Float64,
+
+    // Other primitives
+    Bool,
+    Char,
+    String,
+
+    // Complex types
+    List(Box<Type>), // Vec<T>
+    Array(Box<Type>, usize), // [T; N] - fixed size
+    Slice(Box<Type>), // &[T]
+    Map(Box<Type>, Box<Type>), // HashMap<K, V>
+    HashSet(Box<Type>), // HashSet<T>
+    BTreeMap(Box<Type>, Box<Type>), // BTreeMap<K, V>
+    BTreeSet(Box<Type>), // BTreeSet<T>
+    Function(Vec<Type>, Box<Type>),
+
+    // Special types
+    LogLevel,
 }

 #[allow(dead_code)]
@@ -31,7 +59,7 @@ pub struct TypeAnnotation {
 #[allow(dead_code)]
 #[derive(Debug)]
 pub enum Expression {
-    Number(i64),
+    Number(i32), // Default to i32 like Rust
     Float(f64),
     String(String),
     Boolean(bool),