A modern C implementation of Microsoft's GW-BASIC interpreter (1983), converted from the original x86 assembly language source code. This project serves as both a functional BASIC interpreter and an educational resource for understanding assembly-to-C conversion principles.
This project demonstrates how to systematically convert legacy x86 assembly code to portable, maintainable C code while preserving original functionality. It's designed for:
- Students learning interpreter implementation
- Developers working with legacy codebases
- Educators teaching assembly-to-C conversion
- Retro computing enthusiasts exploring 1980s programming languages
A complete BASIC interpreter written in C with comprehensive educational documentation:
- Tokenizer - Lexical analysis of BASIC source code
- Parser - Syntactic analysis and program structure
- Evaluator - Expression evaluation with operator precedence
- Interpreter - Main execution engine with control flow
- Statements - BASIC statement implementations (PRINT, FOR, IF, etc.)
- Utilities - Type system and memory management
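
As a rough illustration of what "expression evaluation with operator precedence" involves, here is a minimal precedence-climbing sketch. It is not taken from `evaluator.c`: the names `ExprCursor`, `parse_expr`, and `eval_expression` are hypothetical, and it handles only numbers, parentheses, and the four arithmetic operators.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser state: a cursor into the expression text. */
typedef struct {
    const char *pos;
} ExprCursor;

static void skip_spaces(ExprCursor *c) {
    while (isspace((unsigned char)*c->pos)) c->pos++;
}

/* Binding strength of a binary operator; 0 means "not an operator". */
static int precedence(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

static double parse_expr(ExprCursor *c, int min_prec);

/* A primary is a number or a parenthesized sub-expression. */
static double parse_primary(ExprCursor *c) {
    skip_spaces(c);
    if (*c->pos == '(') {
        c->pos++;
        double inner = parse_expr(c, 1);
        skip_spaces(c);
        if (*c->pos == ')') c->pos++;
        return inner;
    }
    char *end;
    double value = strtod(c->pos, &end);
    c->pos = end;
    return value;
}

/* Precedence climbing: keep consuming operators that bind at least
 * as tightly as min_prec; tighter operators are handled by recursion. */
static double parse_expr(ExprCursor *c, int min_prec) {
    double lhs = parse_primary(c);
    for (;;) {
        skip_spaces(c);
        char op = *c->pos;
        int prec = precedence(op);
        if (prec == 0 || prec < min_prec) return lhs;
        c->pos++;
        double rhs = parse_expr(c, prec + 1); /* prec + 1 gives left associativity */
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        }
    }
}

/* Evaluate a whole expression string, e.g. "2 + 3 * 4". */
double eval_expression(const char *text) {
    assert(text != NULL);
    ExprCursor c = { text };
    return parse_expr(&c, 1);
}
```

Calling `eval_expression("2 + 3 * 4")` multiplies before adding, and `eval_expression("10 - 4 - 3")` groups from the left. A real BASIC evaluator would also need variables, string operands, comparison operators, and error reporting.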
The complete original Microsoft GW-BASIC assembly source code (1983) for reference and comparison.
Educational BASIC programs demonstrating interpreter features:
- `fibonacci.bas` - Fibonacci sequence generation
- `math.bas` - Arithmetic operations
- `loops.bas` - FOR/NEXT loop constructs
- And more...
- macOS with Xcode command line tools
- GCC or Clang compiler
- Make
cd gwbasic-c
make

# Run a sample program
./gwbasic gw-basic-programs/fibonacci.bas

# Interactive mode
./gwbasic
Ok
10 PRINT "Hello, World!"
20 END
RUN

# Test all sample programs
for file in gw-basic-programs/*.bas; do
    echo "Testing $file"
    ./gwbasic "$file"
done

This implementation demonstrates key principles for converting x86 assembly to C:
- Data Structure Evolution
  - ASM: Raw memory segments, manual offset calculations
  - C: Structured types with automatic memory layout
- Control Flow Translation
  - ASM: JMP/CALL instructions with manual stack management
  - C: Structured functions with automatic stack handling
- Memory Management Modernization
  - ASM: Manual allocation with no safety checks
  - C: Structured malloc/free with ownership rules
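
The "Data Structure Evolution" point can be sketched as follows. The record layout here (a one-byte type tag followed by a 16-bit little-endian line number) is invented purely for illustration and does not describe any actual GW-BASIC structure; `Record`, `raw_get_line`, and `record_from_raw` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASM-style view: a record is a run of bytes, and every field is reached
 * through a hand-maintained offset. The layout is an assumption made up
 * for this example. */
enum {
    REC_TYPE_OFFSET = 0,  /* 1 byte:  type tag                  */
    REC_LINE_OFFSET = 1,  /* 2 bytes: line number, low byte first */
    REC_SIZE        = 3
};

/* Roughly what a 16-bit load such as MOV AX,[BX+1] does on an 8086. */
static uint16_t raw_get_line(const uint8_t *record) {
    return (uint16_t)(record[REC_LINE_OFFSET] |
                      (record[REC_LINE_OFFSET + 1] << 8));
}

/* C-style view: the compiler owns the layout, fields have names and types. */
typedef struct {
    uint8_t  type;
    uint16_t line_number;
} Record;

/* Convert once at the boundary, then use named fields everywhere else. */
Record record_from_raw(const uint8_t *raw) {
    assert(raw != NULL);
    Record r;
    r.type = raw[REC_TYPE_OFFSET];
    r.line_number = raw_get_line(raw);
    return r;
}
```

Decoding byte by byte, rather than casting the buffer to `Record *`, keeps the code independent of struct padding and host byte order, which is usually what "portable" means in a conversion like this.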
Every C module maps directly to original assembly files:
| Original ASM | C Implementation | Purpose |
|---|---|---|
| `GWMAIN.asm` | `main.c`, `interpreter.c` | Main execution loop |
| `GWEVAL.asm` | `evaluator.c` | Expression evaluation |
| `GWSTS.asm` | `statements.c` | Statement execution |
| `GWDATA.asm` | `utilities.c` | Data type handling |
| `BINTRP.h` | `gwbasic.h` | Core data structures |
The implementation showcases critical C memory management patterns:
- Deep vs. Shallow Copying - Preventing double-free errors
- Resource Ownership - Clear memory lifecycle management
- Defensive Programming - NULL checks and bounds validation
- Structured Cleanup - Systematic resource deallocation
- `PRINT` - Output with formatting control
- `LET` - Variable assignment
- `FOR/NEXT` - Loop constructs with STEP support
- `IF/THEN` - Conditional execution
- `GOTO/GOSUB/RETURN` - Control flow statements
- `END` - Program termination
- INTEGER - 32-bit signed integers
- SINGLE - 32-bit floating point
- DOUBLE - 64-bit floating point
- STRING - Variable-length character strings
- `RUN` - Execute the current program
- `LIST` - Display program listing
- `NEW` - Clear current program
- `HELP` - Show available commands
- Start with `gwbasic-c/TEACHING_GUIDE.md`
- Examine the sample programs in `gw-basic-programs/`
- Read the comprehensive header documentation in `gwbasic.h`
- Compare C implementations with original ASM files
- Study memory management patterns in `utilities.c`
- Analyze interpreter architecture in `interpreter.c`
- Trace Execution - Follow a simple program through tokenization → parsing → execution
- Memory Analysis - Study the FOR loop stack management
- Add Features - Implement new BASIC statements or data types
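
For the FOR loop stack exercise, it can help to have a small model in mind before reading the real code. The sketch below is an assumption about how such a stack might look, not a copy of the project's implementation; `ForFrame`, `ForStack`, `for_push`, and `for_next` are hypothetical names, and the fixed depth of 16 is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FOR_STACK_MAX 16   /* arbitrary nesting limit for this sketch */
#define FOR_VAR_MAX    8   /* arbitrary name length limit             */

/* One active FOR loop: which variable it drives and where it stops. */
typedef struct {
    char   var[FOR_VAR_MAX];
    double limit;
    double step;
    int    body_line;      /* line to jump back to on NEXT */
} ForFrame;

typedef struct {
    ForFrame frames[FOR_STACK_MAX];
    int      depth;
} ForStack;

/* FOR: push a frame. Returns false instead of overflowing the array. */
bool for_push(ForStack *s, const char *var, double limit, double step,
              int body_line) {
    assert(s != NULL && var != NULL);
    if (s->depth >= FOR_STACK_MAX) return false;
    ForFrame *f = &s->frames[s->depth++];
    strncpy(f->var, var, FOR_VAR_MAX - 1);
    f->var[FOR_VAR_MAX - 1] = '\0';
    f->limit = limit;
    f->step = step;
    f->body_line = body_line;
    return true;
}

/* NEXT: advance the counter by STEP. Returns true if the loop body should
 * run again; otherwise pops the frame and returns false. Also returns
 * false for NEXT without FOR (empty stack). */
bool for_next(ForStack *s, double *counter) {
    assert(s != NULL && counter != NULL);
    if (s->depth == 0) return false;
    ForFrame *f = &s->frames[s->depth - 1];
    *counter += f->step;
    bool done = (f->step >= 0) ? (*counter > f->limit)
                               : (*counter < f->limit);
    if (done) {
        s->depth--;        /* loop finished: release the frame */
        return false;
    }
    return true;
}
```

Because the frames live in a fixed array, there is nothing to free when a loop ends; popping is just a decrement. A heap-based stack would need an explicit cleanup path for programs that exit while loops are still active.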
This implementation solves several critical memory safety issues common in ASM-to-C conversion:
- Double-Free Prevention - Using `copy_value()` for safe value sharing
- Ownership Clarity - Clear rules about who owns and frees memory
- Stack Safety - Proper cleanup of control flow stacks
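
To make the double-free problem concrete, here is a minimal sketch of deep copying for a tagged value type. The project's actual `copy_value()` signature and `Value` layout are not shown in this README, so everything below (`Value`, `make_string`, `copy_value_sketch`, `free_value`) is an assumed stand-in for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value representation; the real one is defined in gwbasic.h. */
typedef enum { VAL_INTEGER, VAL_DOUBLE, VAL_STRING } ValueType;

typedef struct {
    ValueType type;
    union {
        int    i;
        double d;
        char  *s;   /* heap-allocated, owned by this Value */
    } as;
} Value;

/* Duplicate a string on the heap; NULL on NULL input or allocation failure. */
static char *dup_string(const char *text) {
    if (text == NULL) return NULL;
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, text, n);
    return copy;
}

Value make_string(const char *text) {
    Value v;
    v.type = VAL_STRING;
    v.as.s = dup_string(text);
    return v;
}

/* Deep copy: the result owns its own string buffer, so source and copy
 * can be freed independently. A plain struct assignment would leave two
 * Values pointing at one buffer, and freeing both would be a double free. */
Value copy_value_sketch(const Value *src) {
    assert(src != NULL);
    Value dst = *src;                 /* sufficient for numeric types */
    if (src->type == VAL_STRING) {
        dst.as.s = dup_string(src->as.s);
    }
    return dst;
}

/* Release owned memory and clear the pointer so a repeated call is harmless. */
void free_value(Value *v) {
    assert(v != NULL);
    if (v->type == VAL_STRING) {
        free(v->as.s);
        v->as.s = NULL;
    }
}
```

The ownership rule this sketch follows is simple: every `Value` of type `VAL_STRING` owns exactly one buffer, and whoever holds the `Value` is responsible for calling `free_value` on it once.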
While prioritizing educational clarity over raw performance, the implementation includes:
- Efficient tokenization with minimal memory allocation
- Structured variable lookup (easily upgradeable to hash tables)
- Minimal copying in the critical execution path
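
The remark about variable lookup being "easily upgradeable to hash tables" can be illustrated with a small table whose callers only ever go through two functions. This is a sketch under assumed names (`VarTable`, `var_find`, `var_get_or_create`) and arbitrary size limits, not the project's actual data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VAR_TABLE_MAX 64   /* arbitrary capacity for this sketch */
#define VAR_NAME_LEN  16   /* arbitrary name length limit         */

typedef struct {
    char   name[VAR_NAME_LEN];
    double value;
} Variable;

typedef struct {
    Variable vars[VAR_TABLE_MAX];
    size_t   count;
} VarTable;

/* Linear scan: O(n), but trivial to read and step through in a debugger.
 * Returns NULL if the variable does not exist. */
Variable *var_find(VarTable *t, const char *name) {
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->vars[i].name, name) == 0) return &t->vars[i];
    }
    return NULL;
}

/* Look a variable up, creating it with value 0 on first use.
 * Returns NULL if the table is full or the name is too long. */
Variable *var_get_or_create(VarTable *t, const char *name) {
    Variable *v = var_find(t, name);
    if (v != NULL) return v;
    if (t->count >= VAR_TABLE_MAX || strlen(name) >= VAR_NAME_LEN) return NULL;
    v = &t->vars[t->count++];
    strcpy(v->name, name);   /* safe: length was checked above */
    v->value = 0.0;
    return v;
}
```

Since the rest of an interpreter would call only `var_find` and `var_get_or_create`, the array scan could later be replaced by a hash table without touching any call site.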
All code includes comprehensive documentation focusing on:
- Original ASM correspondence - How each C function relates to assembly
- Teaching concepts - Interpreter theory and implementation patterns
- Memory safety - Critical patterns for preventing common C pitfalls
- Type system design - How dynamic languages handle multiple data types
This is an educational project focused on demonstrating assembly-to-C conversion principles. Contributions that enhance the educational value are welcome:
- Improved documentation and teaching examples
- Additional sample BASIC programs
- Enhanced error messages and debugging features
- Performance optimizations with educational explanations
This project contains:
- Original GW-BASIC source: Microsoft's original assembly code (see LICENSE)
- C implementation: Educational conversion with comprehensive documentation
- Microsoft for releasing the original GW-BASIC source code
- Bill Gates, Paul Allen, Rich Murphey, Dan Illowsky, and others who worked on the original implementations
- The retrocomputing community for preserving computing history
- Original GW-BASIC announcement
- Microsoft's GW-BASIC source release
- Assembly Language Programming tutorials
- C Programming best practices
"The best way to understand how computers work is to build one yourself." - This project lets you build an interpreter and understand the evolution from assembly to modern C programming.