34 changes: 34 additions & 0 deletions .github/classroom/autograding.json
@@ -0,0 +1,34 @@
{
  "tests": [
    {
      "name": "Task 3.1 - CPU Parallel Operations",
      "setup": "pip install -e .",
      "run": "python -m pytest -m task3_1 --tb=no -q",
      "input": "",
      "output": "",
      "comparison": "included",
      "timeout": 10,
      "points": 25
    },
    {
      "name": "Task 3.2 - CPU Matrix Multiplication",
      "setup": "",
      "run": "python -m pytest -m task3_2 --tb=no -q",
      "input": "",
      "output": "",
      "comparison": "included",
      "timeout": 10,
      "points": 25
    },
    {
      "name": "Style Check",
      "setup": "",
      "run": "python -m ruff check . && python -m pyright",
      "input": "",
      "output": "",
      "comparison": "included",
      "timeout": 10,
      "points": 10
    }
  ]
}
16 changes: 16 additions & 0 deletions .github/workflows/classroom.yaml
@@ -0,0 +1,16 @@
name: GitHub Classroom Workflow

on: [push]

permissions:
  checks: write
  actions: read
  contents: read

jobs:
  build:
    name: Autograding
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: education/autograding@v1
160 changes: 146 additions & 14 deletions README.md
@@ -1,32 +1,164 @@
# MiniTorch Module 3 - Parallel and GPU Acceleration

<img src="https://minitorch.github.io/minitorch.svg" width="50%">

**Documentation:** https://minitorch.github.io/

**Overview (Required reading):** https://minitorch.github.io/module3.html

## Overview

Module 3 focuses on **optimizing tensor operations** through parallel computing and GPU acceleration. You will implement parallel CPU operations with Numba and GPU kernels with CUDA, targeting substantial speedups over the sequential tensor backend from Module 2.

### Key Learning Goals
- **CPU Parallelization**: Implement parallel tensor operations with Numba
- **GPU Programming**: Write CUDA kernels for tensor operations
- **Performance Optimization**: Achieve significant speedup through hardware acceleration
- **Matrix Multiplication**: Optimize the most computationally intensive operations
- **Backend Architecture**: Build multiple computational backends for flexible performance
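All of the backends build on the strided tensor layout from Module 2. As a refresher, here is a minimal standalone sketch of how a multidimensional index maps to a position in flat storage (the function names mirror the idea behind `minitorch/tensor_data.py`, but this is an illustrative version, not the assignment code):

```python
# Illustrative sketch of strided indexing; not the actual
# minitorch/tensor_data.py implementation.

def strides_from_shape(shape):
    """Row-major (contiguous) strides for a given shape."""
    strides = [1]
    for dim in reversed(shape[1:]):
        strides.insert(0, strides[0] * dim)
    return tuple(strides)

def index_to_position(index, strides):
    """Dot product of index and strides gives the flat storage offset."""
    return sum(i * s for i, s in zip(index, strides))

# A (2, 3) tensor stored row-major has strides (3, 1):
# element [1, 2] lives at flat position 1*3 + 2*1 = 5.
```

Every map, zip, reduce, and matmul kernel in this module is ultimately a loop over such flat positions, which is what makes them easy to parallelize.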

## Tasks Overview

| Task | Description |
|------|-------------|
| **3.1** | CPU Parallel Operations (`fast_ops.py`) |
| **3.2** | CPU Matrix Multiplication (`fast_ops.py`) |
| **3.3** | GPU Operations (`cuda_ops.py`) |
| **3.4** | GPU Matrix Multiplication (`cuda_ops.py`) |
| **3.5** | Performance Evaluation (`run_fast_tensor.py`) |

## Documentation

- **[Installation Guide](installation.md)** - Setup instructions including GPU configuration
- **[Testing Guide](testing.md)** - How to run tests locally and handle GPU requirements

## Quick Start

### 1. Environment Setup
```bash
# Clone and navigate to your assignment
git clone <your-assignment-repo>
cd <assignment-directory>

# Create virtual environment (recommended)
conda create --name minitorch python
conda activate minitorch

# Install dependencies
pip install -e ".[dev,extra]"
```

### 2. Sync Previous Module Files
```bash
# Sync required files from your Module 2 solution
python sync_previous_module.py <path-to-module-2> .

# Example:
python sync_previous_module.py ../Module-2 .
```

### 3. Run Tests
```bash
# CPU tasks (run anywhere)
pytest -m task3_1 # CPU parallel operations
pytest -m task3_2 # CPU matrix multiplication

# GPU tasks (require CUDA-compatible GPU)
pytest -m task3_3 # GPU operations
pytest -m task3_4 # GPU matrix multiplication

# Style checks
pre-commit run --all-files
```

## GPU Setup

### Option 1: Google Colab (Recommended)
Most students should use Google Colab for GPU tasks:

1. Upload assignment files to Colab
2. Change runtime to GPU (Runtime → Change runtime type → GPU)
3. Install packages:
   ```python
   !pip install -e ".[dev,extra]"
   !python -c "import numba.cuda; print('CUDA available:', numba.cuda.is_available())"
   ```

### Option 2: Local GPU (If you have NVIDIA GPU)
For students with NVIDIA GPUs and CUDA-compatible hardware:

```bash
# Install CUDA toolkit
# Visit: https://developer.nvidia.com/cuda-downloads

# Install GPU packages
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install "numba[cuda]"

# Verify GPU support
python -c "import numba.cuda; print('CUDA available:', numba.cuda.is_available())"
```

## Testing Strategy

### CI/CD (GitHub Actions)
- **Task 3.1**: CPU parallel operations
- **Task 3.2**: CPU matrix multiplication
- **Style Check**: Code quality and formatting

### GPU Testing (Colab/Local GPU)
- **Task 3.3**: GPU operations (use Colab or local NVIDIA GPU)
- **Task 3.4**: GPU matrix multiplication (use Colab or local NVIDIA GPU)

### Performance Validation
```bash
# Compare backend performance
python project/run_fast_tensor.py # Optimized backends
python project/run_tensor.py # Basic tensor backend
python project/run_scalar.py # Scalar baseline
```
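When comparing backends, time the same workload several times and keep the fastest run, so JIT warm-up and background load don't skew the comparison. A minimal timing helper you might use (this is a hypothetical sketch, not part of the assignment scripts):

```python
import time

def best_time(fn, repeats=5):
    """Run fn several times and return the fastest wall-clock time.

    Taking the minimum over repeats is a common way to reduce noise
    from JIT compilation on the first call and from background load.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Example: time a simple reduction as a stand-in for a backend call.
xs = list(range(100_000))
t_loop = best_time(lambda: sum(xs))
print(f"python sum: {t_loop:.6f}s")
```

The first call to a Numba-compiled function includes compilation time, so always discard or amortize the warm-up run before quoting speedups.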

## Development Tools

### Code Quality
```bash
# Automatic style checking
pre-commit install
git commit -m "your changes" # Runs style checks automatically

# Manual style checks
ruff check . # Linting
ruff format . # Formatting
pyright . # Type checking
```

### Debugging
```bash
# Debug Numba JIT issues
NUMBA_DISABLE_JIT=1 pytest -m task3_1 -v

# Debug CUDA kernels with the CUDA simulator (runs kernels on the CPU)
NUMBA_ENABLE_CUDASIM=1 pytest -m task3_3 -v

# Monitor GPU usage
nvidia-smi -l 1 # Update every second
```

## Implementation Focus

### Task 3.1 & 3.2 (CPU Optimization)
- Implement `tensor_map`, `tensor_zip`, `tensor_reduce` with Numba parallel loops
- Optimize matrix multiplication with efficient loop ordering
- Focus on cache locality and parallel execution patterns

### Task 3.3 & 3.4 (GPU Acceleration)
- Write CUDA kernels for element-wise operations
- Implement efficient GPU matrix multiplication with shared memory
- Optimize thread block organization and memory coalescing
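The shared-memory strategy in 3.4 is tiling: each thread block stages a small tile of A and a matching tile of B in fast shared memory, accumulates a partial product, then slides to the next tile along the shared dimension. The arithmetic behind tiling can be sketched in plain NumPy (the function name and tile size are illustrative; the real kernel does this per thread block in CUDA shared memory):

```python
import numpy as np

def tiled_matmul(a, b, tile=2):
    """Block the shared K dimension into tiles and accumulate
    partial products, mirroring how a CUDA kernel stages tiles
    through shared memory."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n))
    for k0 in range(0, k, tile):
        a_tile = a[:, k0:k0 + tile]   # tile of A ("shared memory" copy)
        b_tile = b[k0:k0 + tile, :]   # matching tile of B
        out += a_tile @ b_tile        # accumulate the partial product
    return out

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
assert np.allclose(tiled_matmul(a, b), a @ b)
```

Each element of A and B is read from global memory once per tile pass instead of once per output element, which is where the GPU speedup comes from.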

## Important Notes

- **GPU Limitations**: Tasks 3.3 and 3.4 cannot run in GitHub CI due to hardware requirements
- **GPU Testing**: Use Google Colab (recommended) or local NVIDIA GPU for GPU tasks
- **Performance Critical**: Implementations must show measurable speedup over sequential versions
- **Memory Management**: Be careful with GPU memory allocation and deallocation
142 changes: 142 additions & 0 deletions installation.md
@@ -0,0 +1,142 @@
# MiniTorch Module 3 Installation

MiniTorch requires Python 3.8 or higher. To check your version of Python, run:

```bash
>>> python --version
```

We recommend creating a global MiniTorch workspace directory that you will use
for all modules:

```bash
>>> mkdir workspace; cd workspace
```

## Environment Setup

We highly recommend setting up a *virtual environment*. The virtual environment lets you install packages that are only used for your assignments and do not impact the rest of the system.

**Option 1: Anaconda (Recommended)**
```bash
>>> conda create --name minitorch python # Run only once
>>> conda activate minitorch
>>> conda install llvmlite # For optimization
```

**Option 2: Venv**
```bash
>>> python -m venv venv # Run only once
>>> source venv/bin/activate
```

In either option, the first command needs to be run only once, whereas the `activate` command must be rerun in every new terminal you open for the class. You can tell it worked if your prompt starts with `(minitorch)` or `(venv)`.

## Getting the Code

Each assignment is distributed through a Git repo. Once you accept the assignment from GitHub Classroom, a personal repository under Cornell-Tech-ML will be created for you. You can then clone this repository to start working on your assignment.

```bash
>>> git clone {{ASSIGNMENT}}
>>> cd {{ASSIGNMENT}}
```

## Syncing Previous Module Files

Module 3 requires files from Module 0, Module 1, and Module 2. Sync them using:

```bash
>>> python sync_previous_module.py <path-to-module-2> <path-to-current-module>
```

Example:
```bash
>>> python sync_previous_module.py ../Module-2 .
```

Replace `<path-to-module-2>` with the path to your Module 2 directory and `<path-to-current-module>` with `.` for the current directory.

This will copy the following required files:
- `minitorch/tensor_data.py`
- `minitorch/tensor_functions.py`
- `minitorch/tensor_ops.py`
- `minitorch/operators.py`
- `minitorch/scalar.py`
- `minitorch/scalar_functions.py`
- `minitorch/module.py`
- `minitorch/autodiff.py`
- `minitorch/tensor.py`
- `minitorch/datasets.py`
- `minitorch/testing.py`
- `minitorch/optim.py`
- `project/run_manual.py`
- `project/run_scalar.py`
- `project/run_tensor.py`

## Installation

Install all packages in your virtual environment:

```bash
>>> python -m pip install -e ".[dev,extra]"
```

## GPU Setup (Required for Tasks 3.3 and 3.4)

Tasks 3.3 and 3.4 require GPU support and won't run on GitHub CI.

### Option 1: Google Colab (Recommended)

Most students should use Google Colab as it provides free GPU access:

1. Upload your assignment files to Colab
2. Change runtime to GPU (Runtime → Change runtime type → GPU)
3. Install packages in Colab:
   ```python
   !pip install -e ".[dev,extra]"
   !python -c "import numba.cuda; print('CUDA available:', numba.cuda.is_available())"
   ```

### Option 2: Local GPU Setup (If you have NVIDIA GPU)

For students with NVIDIA GPUs and CUDA-compatible hardware:

1. **Install CUDA Toolkit**
   ```bash
   # Visit: https://developer.nvidia.com/cuda-downloads
   # Follow the instructions for your OS
   ```

2. **Verify CUDA Installation**
   ```bash
   >>> nvcc --version
   >>> nvidia-smi
   ```

3. **Install GPU-compatible packages**
   ```bash
   >>> pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   >>> pip install "numba[cuda]"
   ```

## Verification

Make sure everything is installed by running:

```bash
>>> python -c "import minitorch; print('Success!')"
```

Verify that the tensor functionality is available:

```bash
>>> python -c "from minitorch import tensor; print('Module 3 ready!')"
```

Check if CUDA support is available (for GPU tasks):

```bash
>>> python -c "import numba.cuda; print('CUDA available:', numba.cuda.is_available())"
```

You're ready to start Module 3!