diff --git a/INSTALL.md b/INSTALL.md
new file mode 100644
index 0000000..1a6d369
--- /dev/null
+++ b/INSTALL.md
@@ -0,0 +1,166 @@
+# HyperGen Installation Guide
+
+## Quick Install
+
+### From PyPI (Recommended)
+
+```bash
+pip install hypergen
+```
+
+This installs both the Python library and the `hypergen` CLI command.
+
+### From Source
+
+```bash
+git clone https://github.com/ntegrals/hypergen.git
+cd hypergen
+pip install -e .
+```
+
+## Verify Installation
+
+After installation, verify everything works:
+
+```bash
+# Check CLI is available
+hypergen --version
+hypergen --help
+
+# Test Python import
+python -c "from hypergen import model, dataset; print('āœ“ HyperGen installed successfully')"
+```
+
+## CLI Commands
+
+### Serve Command
+
+Start an OpenAI-compatible API server:
+
+```bash
+# Basic usage
+hypergen serve stabilityai/stable-diffusion-xl-base-1.0
+
+# With options
+hypergen serve stabilityai/sdxl-turbo \
+  --port 8000 \
+  --api-key your-secret-key \
+  --dtype float16
+```
+
+**All options:**
+```bash
+hypergen serve --help
+```
+
+## For Developers
+
+### Building from Source
+
+```bash
+# Install build tools
+pip install build twine hatchling
+
+# Build the package
+python -m build
+
+# Install locally
+pip install dist/*.whl
+
+# Or install in editable mode
+pip install -e .
+```
+
+### Publishing to PyPI
+
+```bash
+# Build
+python -m build
+
+# Upload to PyPI
+python -m twine upload dist/*
+```
+
+## Troubleshooting
+
+### CLI Command Not Found
+
+If the `hypergen` command is not found after installation:
+
+1. **Check that your Python scripts directory is on PATH:**
+   ```bash
+   python -m site --user-base
+   ```
+   Add the `bin` subdirectory of that path to your `PATH`.
+
+2. **Reinstall:**
+   ```bash
+   pip install --force-reinstall hypergen
+   ```
+
+3. **For editable installs:**
+   ```bash
+   # Make sure you're in the project root
+   cd hypergen
+   pip install -e .
+   ```
+
+4. **Verify the entry point:**
+   ```bash
+   python -m hypergen.cli.main --help
+   ```
+
+### Import Errors
+
+If you get import errors:
+
+```bash
+# Make sure you're not in the source directory
+cd ~
+
+# Try importing
+python -c "from hypergen import model"
+```
+
+### Dependency Issues
+
+If automatic dependency resolution fails, install the core dependencies manually:
+
+```bash
+pip install torch diffusers transformers peft accelerate \
+    fastapi uvicorn pillow numpy safetensors
+```
+
+## Requirements
+
+- Python 3.10 or higher
+- CUDA-capable GPU (for training/inference)
+- 12GB+ VRAM recommended for training
+
+## Optional Dependencies
+
+### Flash Attention
+```bash
+pip install hypergen[flash]
+```
+
+### xFormers
+```bash
+pip install hypergen[xformers]
+```
+
+### DeepSpeed
+```bash
+pip install hypergen[deepspeed]
+```
+
+### All Optional Dependencies
+```bash
+pip install hypergen[all]
+```
+
+## Getting Help
+
+- [GitHub Issues](https://github.com/ntegrals/hypergen/issues)
+- [Documentation](https://github.com/ntegrals/hypergen)
+- [Examples](examples/)
diff --git a/README.md b/README.md
index ccd1081..8d17134 100644
--- a/README.md
+++ b/README.md
@@ -45,17 +45,26 @@ Try HyperGen in interactive Jupyter notebooks:
 
 ## ⚔ Quickstart
 
+### Install from PyPI
 ```bash
 pip install hypergen
 ```
 
-### From Source
+**Note:** This will install both the Python library and the `hypergen` CLI command.
+
+### Install from Source
 ```bash
 git clone https://github.com/ntegrals/hypergen.git
 cd hypergen
 pip install -e .
 ```
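+
+Optional accelerator backends are published as extras; [INSTALL.md](INSTALL.md) documents each one:
+
+```bash
+pip install hypergen[flash]      # Flash Attention
+pip install hypergen[xformers]   # xFormers
+pip install hypergen[all]        # every optional dependency
+```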
+
+**After installation, verify the CLI is available:**
+```bash
+hypergen --version
+hypergen --help
+```
+
 ## šŸŽÆ Supported Models
 
 | Model Family | Model ID | Type |
@@ -136,7 +145,9 @@ images = m.generate(
 
 HyperGen provides a production-ready API server with request queuing, similar to vLLM.
 
-### Start the Server
+### CLI Command
+
+After installing HyperGen, the `hypergen` CLI command is available on your PATH:
 
 ```bash
 # Basic serving
@@ -156,6 +167,11 @@ hypergen serve black-forest-labs/FLUX.1-dev \
   --max-batch-size 4
 ```
 
+**Available CLI Options:**
+```bash
+hypergen serve --help
+```
+
 ### Use with OpenAI Client
 
 ```python
@@ -183,6 +199,17 @@ response = client.images.generate(
 
 - Optional API key authentication
 - Production-ready (FastAPI + uvicorn)
 
+**Test the API:**
+```bash
+# Start the server
+hypergen serve stabilityai/sdxl-turbo --port 8000
+
+# Run the test script (in another terminal)
+python examples/test_endpoint.py
+```
+
+See [test_endpoint.py](examples/test_endpoint.py) for comprehensive endpoint testing.
+
 ## ⭐ Key Features
 
 - **Dead Simple API**: Train LoRAs in 5 lines of code - simple for beginners, powerful for experts
@@ -204,7 +231,10 @@ Code samples in the [examples/](examples/) directory:
 
 - [quickstart.py](examples/quickstart.py) - Minimal 5-line training example
 - [complete_example.py](examples/complete_example.py) - All features demonstrated
-- [serve_client.py](examples/serve_client.py) - API client usage examples
+- [serve_client.py](examples/serve_client.py) - API client usage with the OpenAI SDK
+- [test_endpoint.py](examples/test_endpoint.py) - Comprehensive API endpoint testing
+
+See [examples/README.md](examples/README.md) for detailed documentation.
 
 ## šŸ›£ļø Roadmap
 
@@ -243,12 +273,26 @@ hypergen/
 
 ## šŸ’¾ Installation
 
-### Basic Installation
+### Install from PyPI (Recommended)
 ```bash
 pip install hypergen
 ```
 
-### From Source
+This installs:
+- āœ… The `hypergen` Python library
+- āœ… The `hypergen` CLI command (available on your PATH)
+
+**Verify installation:**
+```bash
+# Check CLI is available
+hypergen --version
+hypergen --help
+
+# Test in Python
+python -c "from hypergen import model, dataset; print('āœ“ HyperGen installed')"
+```
+
+### Install from Source
 ```bash
 git clone https://github.com/ntegrals/hypergen.git
 cd hypergen
@@ -257,6 +301,13 @@ pip install -e .
 
 **Requirements**: Python 3.10+
 
+### Troubleshooting
+
+If the `hypergen` command is not found after installation:
+1. Ensure your Python scripts directory is on your PATH
+2. Try reinstalling: `pip install --force-reinstall hypergen`
+3. For editable installs, run `pip install -e .` from the project root (not `pip install -e src/`)
+
 ## šŸ¤ Contributing
 
 Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
diff --git a/examples/README.md b/examples/README.md
new file mode 100644
index 0000000..2e58a94
--- /dev/null
+++ b/examples/README.md
@@ -0,0 +1,250 @@
+# HyperGen Examples
+
+Example scripts demonstrating HyperGen's features.
+
+## Quick Start Examples
+
+### [quickstart.py](quickstart.py)
+
+**Minimal 5-line example** for training a LoRA.
+
+```bash
+python examples/quickstart.py
+```
+
+**What it does:**
+- Loads an SDXL model
+- Loads a local dataset
+- Trains a LoRA for 1,000 steps
+- Generates a test image
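+
+For orientation, the script is built around the pattern below (a sketch assembled from API calls shown elsewhere in these docs; the dataset path and prompt are placeholders):
+
+```python
+from hypergen import model, dataset
+
+m = model.load("stabilityai/stable-diffusion-xl-base-1.0")
+m.to("cuda")
+ds = dataset.load("./my_dataset")      # placeholder path
+lora = m.train_lora(ds, steps=1000)
+image = m.generate("a test prompt")    # placeholder prompt
+```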
+
+---
+
+### [complete_example.py](complete_example.py)
+
+**Comprehensive example** showing all HyperGen features.
+
+```bash
+python examples/complete_example.py
+```
+
+**What it covers:**
+- Loading different models (SDXL, SD3, FLUX)
+- Dataset preparation with captions
+- Advanced training options
+- Image generation
+- Saving and loading checkpoints
+
+---
+
+## API Server Examples
+
+### Starting the Server
+
+Start a HyperGen API server:
+
+```bash
+# Basic server
+hypergen serve stabilityai/sdxl-turbo
+
+# With options
+hypergen serve stabilityai/stable-diffusion-xl-base-1.0 \
+  --port 8000 \
+  --api-key your-secret-key \
+  --lora ./my_lora
+```
+
+---
+
+### [serve_client.py](serve_client.py)
+
+**OpenAI-compatible client** for the HyperGen API.
+
+```bash
+# Start server first
+hypergen serve stabilityai/sdxl-turbo --port 8000
+
+# Then run client
+python examples/serve_client.py
+```
+
+**What it demonstrates:**
+- Using the OpenAI Python client with HyperGen
+- Generating images via the API
+- Handling responses
+- Error handling
+
+---
+
+### [test_endpoint.py](test_endpoint.py)
+
+**Comprehensive endpoint testing** script.
+
+```bash
+# Start server first
+hypergen serve stabilityai/sdxl-turbo --port 8000
+
+# Run tests
+python examples/test_endpoint.py
+```
+
+**What it tests:**
+- āœ“ Health check endpoint (`/health`)
+- āœ“ Model listing endpoint (`/v1/models`)
+- āœ“ Image generation with base64 response
+- āœ“ Image generation with URL response
+- āœ“ Error handling
+- āœ“ Timeout handling
+
+**Features:**
+- Automatic image saving
+- Pretty-printed test results
+- Summary report
+- Exit codes for CI/CD integration
+
+---
+
+## Dataset Examples
+
+### Local Dataset with Captions
+
+Create a dataset directory:
+
+```
+my_dataset/
+ā”œā”€ā”€ image1.jpg
+ā”œā”€ā”€ image1.txt   # "a beautiful sunset"
+ā”œā”€ā”€ image2.jpg
+ā”œā”€ā”€ image2.txt   # "a mountain landscape"
+└── ...
+```
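+
+Each caption is a plain-text file named after its image. If your captions start out somewhere else (a CSV, say), a few lines of standard-library Python can lay out this structure; the sketch below is generic Python, not HyperGen API:
+
+```python
+from pathlib import Path
+
+root = Path("my_dataset")
+captions = {
+    "image1.jpg": "a beautiful sunset",
+    "image2.jpg": "a mountain landscape",
+}
+for name, caption in captions.items():
+    # write a same-named .txt sidecar next to each image
+    (root / name).with_suffix(".txt").write_text(caption)
+```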
+
+Load and use:
+
+```python
+from hypergen import dataset
+
+ds = dataset.load("./my_dataset")
+print(f"Loaded {len(ds)} images with captions")
+```
+
+---
+
+### HuggingFace Dataset
+
+See the [Jupyter notebooks](../notebooks/) for examples using HuggingFace datasets:
+- [minimal_example.ipynb](../notebooks/minimal_example.ipynb)
+- [train_lora_quickstart.ipynb](../notebooks/train_lora_quickstart.ipynb)
+
+---
+
+## Advanced Usage
+
+### Custom Training Configuration
+
+```python
+from hypergen import model, dataset
+
+m = model.load("stabilityai/stable-diffusion-xl-base-1.0")
+m.to("cuda")
+ds = dataset.load("./my_images")
+
+lora = m.train_lora(
+    ds,
+    steps=2000,
+    learning_rate=5e-5,
+    rank=32,
+    alpha=64,
+    batch_size=2,
+    gradient_accumulation_steps=4,
+    save_steps=500,
+    output_dir="./checkpoints",
+    resolution=1024,  # Train at higher resolution
+)
+```
+
+### Loading a Trained LoRA
+
+```python
+from hypergen import model
+from peft import PeftModel
+
+# Load base model
+m = model.load("stabilityai/stable-diffusion-xl-base-1.0")
+m.to("cuda")
+
+# Load LoRA weights
+checkpoint_path = "./checkpoints/checkpoint-2000"
+m.pipeline.unet = PeftModel.from_pretrained(
+    m.pipeline.unet,
+    checkpoint_path,
+)
+
+# Generate with LoRA
+image = m.generate("your prompt here")
+```
+
+---
+
+## Testing Tips
+
+### Quick Server Test
+
+```bash
+# Terminal 1: Start server
+hypergen serve stabilityai/sdxl-turbo
+
+# Terminal 2: Test with curl
+curl http://localhost:8000/health
+```
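+
+You can also hit the generation endpoint directly from the shell; this request mirrors the JSON body that [test_endpoint.py](test_endpoint.py) sends:
+
+```bash
+curl -X POST http://localhost:8000/v1/images/generations \
+  -H "Content-Type: application/json" \
+  -d '{"prompt": "a cute cat", "n": 1, "size": "512x512", "num_inference_steps": 4, "guidance_scale": 0.0, "response_format": "b64_json"}'
+```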
+
+### Performance Testing
+
+```bash
+# Install Apache Bench
+brew install httpd                    # macOS
+sudo apt-get install apache2-utils    # Linux
+
+# Run load test
+ab -n 100 -c 10 http://localhost:8000/health
+```
+
+---
+
+## Common Issues
+
+### Out of Memory
+
+**Solution:** Reduce batch size or resolution
+```python
+lora = m.train_lora(
+    ds,
+    batch_size=1,    # Reduce from 2 to 1
+    resolution=512,  # Reduce from 1024 to 512
+)
+```
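+
+If lowering `batch_size` fixes the OOM, you can usually keep the effective batch size by raising `gradient_accumulation_steps` (both parameters appear under Custom Training Configuration above):
+
+```python
+lora = m.train_lora(
+    ds,
+    batch_size=1,                   # lower per-step memory use
+    gradient_accumulation_steps=8,  # keep the effective batch size at 8
+)
+```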
print(f"Status: {response.status_code}") + + if response.status_code == 200: + data = response.json() + print(f"Number of images: {len(data.get('data', []))}") + + if data.get('data'): + image_url = data['data'][0]['url'] + print(f"āœ“ Image URL: {image_url}") + + # Try to download the image + img_response = requests.get(image_url) + if img_response.status_code == 200: + output_path = Path(save_path) + output_path.write_bytes(img_response.content) + print(f"āœ“ Image downloaded to: {output_path}") + print(f"āœ“ Image size: {len(img_response.content):,} bytes\n") + return True + else: + print(f"āœ— Failed to download image from URL\n") + return False + else: + print("āœ— No image data in response\n") + return False + else: + print(f"āœ— Error: {response.text}\n") + return False + + except Exception as e: + print(f"āœ— Error: {e}\n") + return False + + +def main(): + """Run all tests.""" + print("\n") + print("ā•”" + "═" * 58 + "ā•—") + print("ā•‘" + " " * 15 + "HyperGen Endpoint Test" + " " * 21 + "ā•‘") + print("ā•š" + "═" * 58 + "ā•") + print() + print(f"Server: {BASE_URL}") + print(f"API Key: {'Set' if API_KEY else 'Not set'}") + print() + + results = { + "Health Check": test_health(), + "List Models": test_list_models(), + "Generate Image (base64)": test_generate_image(), + "Generate Image (url)": test_generate_with_url_format(), + } + + # Summary + print("=" * 60) + print("Test Summary") + print("=" * 60) + + passed = sum(results.values()) + total = len(results) + + for test_name, result in results.items(): + status = "āœ“ PASS" if result else "āœ— FAIL" + print(f"{test_name:.<40} {status}") + + print() + print(f"Results: {passed}/{total} tests passed") + + if passed == total: + print("\nāœ“ All tests passed!") + return 0 + else: + print(f"\nāœ— {total - passed} test(s) failed") + return 1 + + +if __name__ == "__main__": + try: + sys.exit(main()) + except KeyboardInterrupt: + print("\n\nTest interrupted by user") + sys.exit(1) + except Exception as e: + print(f"\n\nUnexpected error: {e}") + sys.exit(1) diff --git a/pyproject.toml b/pyproject.toml index 18bd6b4..06cd024 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -60,3 +60,6 @@ all = [ [build-system] requires = ["hatchling"] build-backend = "hatchling.build" + +[tool.hatch.build.targets.wheel] +packages = ["src/hypergen"]