diff --git a/docs_src/use-cases/order-accuracy/advanced.md b/docs_src/use-cases/order-accuracy/advanced.md
index a9516ca..2f1487d 100644
--- a/docs_src/use-cases/order-accuracy/advanced.md
+++ b/docs_src/use-cases/order-accuracy/advanced.md
@@ -1,144 +1,269 @@
-## Benchmark Quick Start command
-```bash
-make update-submodules
-```
-`update-submodules` ensures all submodules are initialized, updated to their latest remote versions, and ready for use.
+# Advanced Settings
+
+## Configuration Options
+
+### Local Image Building
+
+By default, the application uses pre-built Docker images for faster setup. If you need to build images locally (for customization or development):
```bash
-make benchmark-quickstart
+# Build and run locally instead of using pre-built images
+make build REGISTRY=false
+make up
+
+# Examples for both applications:
+# Dine-In
+cd dine-in && make build REGISTRY=false && make up
+
+# Take-Away
+cd take-away && make build REGISTRY=false && make up
```
-The above command would:
-- Run headless (no display needed: `RENDER_MODE=0`)
-- Pull pre-built images (`REGISTRY=true`)
-- Target GPU by default (`DEVICE_ENV=res/all-gpu.env`)
-- Generate benchmark metrics
-- Run `make consolidate-metrics` automatically
+**When to use local building:**
+- Modifying source code or configurations
+- Development and testing changes
+- Air-gapped environments without internet access
+- Custom hardware optimizations
-## Understanding Benchmarking Types
+**Note**: Local building takes significantly longer (15-30 minutes) compared to pre-built images (2-5 minutes).
-### Default benchmark command
+---
-```bash
-make update-submodules
-```
-`update-submodules` ensures all submodules are initialized, updated to their latest remote versions, and ready for use.
+## Dine-In Configuration
+
+### Environment Configuration (.env)
```bash
-make benchmark
+# =============================================================================
+# Logging
+# =============================================================================
+LOG_LEVEL=INFO
+
+# =============================================================================
+# Service Endpoints
+# =============================================================================
+OVMS_ENDPOINT=http://ovms-vlm:8000
+OVMS_MODEL_NAME=Qwen/Qwen2.5-VL-7B-Instruct
+SEMANTIC_SERVICE_ENDPOINT=http://semantic-service:8080
+API_TIMEOUT=60
```
-Runs with:
-- `RENDER_MODE=0`
-- `REGISTRY=true`
-- `DEVICE_ENV=res/all-cpu.env`
-- `PIPELINE_COUNT=1`
-You can override these values through the following Environment Variables.
+### Test Data Configuration
-| Variable | Description | Values |
-|:----|:----|:---|
-|`RENDER_MODE` | for displaying pipeline and overlay CV metadata | 1, 0 |
-|`REGISTRY` | to pull pre-built images from public registry | false, true |
-|`PIPELINE_COUNT` | number of Loss Prevention Docker container instances to launch | Ex: 1 |
-|`DEVICE_ENV` | path to device specific environment file that will be loaded into the pipeline container | res/all-cpu.env, res/all-gpu.env, res/all-npu.env, res/all-dgpu.env |
+1. **Add Images**: Place food tray/plate images in `images/` folder
+ - Supported formats: `.jpg`, `.jpeg`, `.png`
+ - Images should clearly show food items on the plate
+2. **Update Orders**: Edit `configs/orders.json` with test orders
+ - Each order needs `image_id` and list of `items_ordered`
+ - Image IDs should match filenames (without extension)
+3. **Update Inventory**: Edit `configs/inventory.json` with menu items
+ - Define all possible food items
+ - Include item names, categories, and metadata
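A hypothetical `configs/orders.json` entry illustrating the required shape (the `image_id` matches a file such as `images/tray_001.jpg`; the item names and any fields beyond `image_id` and `items_ordered` are illustrative assumptions):

```json
[
  {
    "image_id": "tray_001",
    "items_ordered": ["cheeseburger", "large fries", "cola"]
  }
]
```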
-### Benchmark command for GPU
+### Dine-In Docker Services
-```bash
-make DEVICE_ENV=res/all-gpu.env benchmark
-```
+| Container | Ports | Description |
+|-----------|-------|-------------|
+| `dinein_app` | 7861, 8083 | Main application (Gradio + FastAPI) |
+| `dinein_ovms_vlm` | 8002 | Vision-Language Model server |
+| `dinein_semantic_service` | 8081, 9091 | Semantic text matching |
+| `metrics-collector` | 8084 | System metrics aggregation |
-### Benchmark command for NPU
+---
-```bash
-make DEVICE_ENV=res/all-npu.env benchmark
-```
+## Take-Away Configuration
-### Benchmark command to build images locally
+### Environment Configuration (.env)
```bash
-make REGISTRY=false benchmark
+# =============================================================================
+# VLM Backend
+# =============================================================================
+VLM_BACKEND=ovms
+OVMS_ENDPOINT=http://ovms-vlm:8000
+OVMS_MODEL_NAME=Qwen/Qwen2.5-VL-7B-Instruct
+OPENVINO_DEVICE=GPU # 'GPU', 'CPU', or 'AUTO'
+
+# =============================================================================
+# Semantic Service
+# =============================================================================
+SEMANTIC_VLM_BACKEND=ovms
+DEFAULT_MATCHING_STRATEGY=hybrid # 'exact', 'semantic', or 'hybrid'
+SIMILARITY_THRESHOLD=0.85
+OVMS_TIMEOUT=60
+
+# =============================================================================
+# MinIO Storage
+# =============================================================================
+MINIO_ROOT_USER=minioadmin
+MINIO_ROOT_PASSWORD=minioadmin
+MINIO_ENDPOINT=minio:9000
```
-## See the benchmarking results.
+### Service Modes
-```sh
-make consolidate-metrics
-
-cat benchmark/metrics.csv
-```
+| Mode | Configuration | Use Case |
+|------|---------------|----------|
+| **Single** | `SERVICE_MODE=single` | Development, testing |
+| **Parallel** | `SERVICE_MODE=parallel WORKERS=4` | Production deployment |
+**Start in Different Modes:**
+```bash
+# Single mode (default)
+make up
+# Parallel mode with 4 workers
+make up-parallel WORKERS=4
+# Parallel mode with auto-scaling
+make up-parallel WORKERS=4 SCALING_MODE=auto
+```
+### Take-Away Docker Services
-## Benchmark Stream Density
+| Container | Ports | Description |
+|-----------|-------|-------------|
+| `takeaway_app` | 7860, 8080 | Main application (Gradio + FastAPI) |
+| `ovms-vlm` | 8001 | Vision-Language Model server |
+| `frame-selector` | 8085 | YOLO-based frame selection |
+| `semantic-service` | 8081, 9091 | Semantic text matching |
+| `minio` | 9000, 9001 | S3-compatible storage |
-To test the maximum amount of Order Accuracy containers/pipelines that can run on a given system you can use the TARGET_FPS environment variable. Default is to find the container threshold over 7.95 FPS with the run-pipeline.sh pipeline. You can override these values through Environment Variables.
+---
-List of EVs:
+## Benchmarking
- | Variable | Description | Values |
- |:----|:----|:---|
- |`TARGET_FPS` | threshold value for FPS to consider a valid stream | Ex. 7.95 |
- |`OOM_PROTECTION` | flag to enable/disable OOM checks before scaling the pipeline (enabled by default) | 1, 0 |
+### Dine-In Benchmarking
-> **Note:**
->
-> An OOM crash occurs when a system or application tries to use more memory (RAM) than is available, causing the operating system to forcibly terminate processes to free up memory.
-> If `OOM_PROTECTION` is set to 0, the system may crash or become unresponsive, requiring a hard reboot.
-
+**Initialize Performance Tools:**
```bash
-make benchmark-stream-density
+cd dine-in
+make update-submodules
```
-You can check the output results for performance metrics in the `results` folder at the root level. Also, the stream density script will output the results in the console:
+**Run Benchmark:**
+```bash
+make benchmark
+```
+**Stream Density Test:**
+```bash
+make benchmark-density
+```
+**View Results:**
+```bash
+make benchmark-density-results
+cat results/benchmark_results.json
+```
-### Change the Target FPS value:
+### Take-Away Benchmarking
+**Initialize Performance Tools:**
```bash
-make TARGET_FPS=6.5 benchmark-stream-density
+cd take-away
+make update-submodules
```
+**Single Video Benchmark:**
+```bash
+make benchmark
+```
-Alternatively you can directly call the benchmark.py. This enables you to take advantage of all performance tools parameters. More details about the performance tools can be found [HERE](../../performance-tools/benchmark.md#benchmark-stream-density-for-cv-pipelines)
+**Fixed Workers Benchmark:**
+```bash
+make benchmark-oa BENCHMARK_WORKERS=4 BENCHMARK_DURATION=300
+```
+**Stream Density Benchmark:**
```bash
-cd performance-tools/benchmark-scripts && python benchmark.py --compose_file ../../src/docker-compose.yml --target_fps 7
+make benchmark-stream-density \
+ BENCHMARK_TARGET_LATENCY_MS=25000 \
+ BENCHMARK_MIN_TRANSACTIONS=3 \
+ BENCHMARK_WORKER_INCREMENT=1
```
+### Benchmark Configuration Variables
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `TARGET_LATENCY_MS` | 25000 | Target latency threshold (ms) |
+| `LATENCY_METRIC` | avg | 'avg', 'p95', or 'max' |
+| `WORKER_INCREMENT` | 1 | Workers added per iteration |
+| `INIT_DURATION` | 10 | Warmup time (seconds) |
+| `MIN_TRANSACTIONS` | 3 | Min transactions before measuring |
+| `MAX_ITERATIONS` | 50 | Max scaling iterations |
+| `RESULTS_DIR` | ./results | Results output directory |
+---
+## System Requirements
+### Minimum Configuration
+| Component | Specification |
+|-----------|---------------|
+| CPU | Intel Xeon 8+ cores |
+| RAM | 16 GB |
+| GPU | Intel Arc A770 8GB / NVIDIA RTX 3060 |
+| Storage | 50 GB SSD |
+| Docker | 24.0+ with Compose V2 |
-## Other Useful Make Commands
+### Recommended Configuration
-- `make clean-images` - Remove dangling Docker images
-- `make clean-models` - Remove all the downloaded models from the system
-- `make clean-all` - Remove all unused Docker resources
+| Component | Specification |
+|-----------|---------------|
+| CPU | Intel Xeon 16+ cores |
+| RAM | 32 GB |
+| GPU | NVIDIA RTX 3080+ / Intel Data Center GPU |
+| Storage | 200 GB NVMe SSD |
+| Network | 10 Gbps (for Take-Away RTSP) |
-## Project Structure
+---
-- `configs/` - Configuration files (workload videos URLs)
-- `docker/` - Dockerfiles for downloader and pipeline containers
-- `download-scripts/` - Scripts for downloading models and videos
-- `src/` - Main source code and pipeline runner scripts
-- `Makefile` - Build automation and workflow commands
+## Useful Make Commands
----
+### Dine-In Commands
+```bash
+make build # Build Docker images
+make up # Start services
+make down # Stop services
+make logs # View logs
+make update-submodules # Initialize performance-tools
+make benchmark # Run benchmark
+make benchmark-density # Run stream density test
+```
-## Configure the system proxy
+### Take-Away Commands
-Please follow the below steps to configure the proxy
+```bash
+make build # Build Docker images
+make up # Start (single mode)
+make up-parallel WORKERS=4 # Start (parallel mode)
+make down # Stop services
+make logs # View logs
+make update-submodules # Initialize performance-tools
+make benchmark # Run benchmark
+make benchmark-stream-density # Stream density test
+```
-### 1. Configure Proxy for the Current Shell Session
+### Common Commands
+
+- `make clean-images` - Remove dangling Docker images
+- `make clean-all` - Remove all unused Docker resources
+- `make check-env` - Verify configuration
+- `make show-config` - Display current configuration
+
+---
+
+## Configure System Proxy
+
+Please follow these steps to configure proxy settings:
+
+### 1. Configure Proxy for Current Shell Session
```bash
export http_proxy=http://<proxy-host>:<proxy-port>
@@ -147,60 +272,63 @@ export HTTP_PROXY=http://:
export HTTPS_PROXY=http://<proxy-host>:<proxy-port>
export NO_PROXY=localhost,127.0.0.1,::1
export no_proxy=localhost,127.0.0.1,::1
-export socks_proxy=http://<proxy-host>:<proxy-port>
-export SOCKS_PROXY=http://<proxy-host>:<proxy-port>
```
-### 2. System-Wide Proxy Configuration
-
-System-wide environment (/etc/environment)
-(Run: sudo nano /etc/environment and add or update)
+### 2. Docker Daemon Proxy Configuration
+Create directory if missing:
```bash
-http_proxy=http://<proxy-host>:<proxy-port>
-https_proxy=http://<proxy-host>:<proxy-port>
-ftp_proxy=http://<proxy-host>:<proxy-port>
-socks_proxy=http://<proxy-host>:<proxy-port>
-no_proxy=localhost,127.0.0.1,::1
-
-HTTP_PROXY=http://<proxy-host>:<proxy-port>
-HTTPS_PROXY=http://<proxy-host>:<proxy-port>
-FTP_PROXY=http://<proxy-host>:<proxy-port>
-SOCKS_PROXY=http://<proxy-host>:<proxy-port>
-NO_PROXY=localhost,127.0.0.1,::1
-```
-### 3. Docker Daemon & Client Proxy Configuration
-
-Docker daemon drop-in (/etc/systemd/system/docker.service.d/http-proxy.conf)
-Create dir if missing:
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo nano /etc/systemd/system/docker.service.d/http-proxy.conf
+```
-```bash
+Add configuration:
+```ini
[Service]
Environment="http_proxy=http://<proxy-host>:<proxy-port>"
Environment="https_proxy=http://<proxy-host>:<proxy-port>"
Environment="no_proxy=localhost,127.0.0.1,::1"
-Environment="HTTP_PROXY=http://<proxy-host>:<proxy-port>"
-Environment="HTTPS_PROXY=http://<proxy-host>:<proxy-port>"
-Environment="NO_PROXY=localhost,127.0.0.1,::1"
-Environment="socks_proxy=http://<proxy-host>:<proxy-port>"
-Environment="SOCKS_PROXY=http://<proxy-host>:<proxy-port>"
+```
-# Reload & restart:
+Reload and restart:
+```bash
sudo systemctl daemon-reload
sudo systemctl restart docker
+```
+
+---
+
+## Troubleshooting
+
+### Common Issues
-# Docker client config (~/.docker/config.json)
-# mkdir -p ~/.docker
-# nano ~/.docker/config.json
-{
- "proxies": {
- "default": {
-      "httpProxy": "http://<proxy-host>:<proxy-port>",
-      "httpsProxy": "http://<proxy-host>:<proxy-port>",
- "noProxy": "localhost,127.0.0.1,::1"
- }
- }
-}
-```
\ No newline at end of file
+**OVMS Not Loading:**
+- Ensure GPU drivers are installed
+- Check model files exist in `ovms-service/models/`
+- Verify OVMS endpoint in `.env`
+
+**VLM Timeout Errors:**
+- Increase `API_TIMEOUT` in `.env`
+- Check GPU memory utilization
+- Consider using a smaller model precision (INT8)
+
+**Stream Processing Issues (Take-Away):**
+- Verify RTSP stream URLs are accessible
+- Check network bandwidth
+- Consider reducing number of parallel workers
+
+### Debug Commands
+
+```bash
+# Check container logs
+docker logs <container-name>
+
+# Check GPU utilization
+nvidia-smi -l 1
+
+# Check network connectivity
+curl http://localhost:8001/v2/models
+
+# Verify service health
+curl http://localhost:8083/health
+```
diff --git a/docs_src/use-cases/order-accuracy/architecture.md b/docs_src/use-cases/order-accuracy/architecture.md
new file mode 100644
index 0000000..9d725b1
--- /dev/null
+++ b/docs_src/use-cases/order-accuracy/architecture.md
@@ -0,0 +1,356 @@
+# Order Accuracy System Architecture
+
+## Table of Contents
+1. [System Overview](#system-overview)
+2. [Architecture Diagrams](#architecture-diagrams)
+3. [Component Details](#component-details)
+4. [Data Flow](#data-flow)
+5. [Production Features](#production-features)
+
+## System Overview
+
+The Order Accuracy platform is an enterprise AI vision system designed for real-time order validation in quick-service restaurant (QSR) environments. The system uses Vision Language Models (VLM) to analyze images or video feeds, automatically identifying items and validating them against order data.
+
+### Key Features
+- **VLM-Powered Detection**: Uses Qwen2.5-VL-7B for accurate item identification
+- **Intel Hardware Optimization**: Optimized for Intel CPUs and GPUs via OpenVINO
+- **Dual Application Support**: Dine-In (image-based) and Take-Away (video stream-based)
+- **Semantic Matching**: Fuzzy matching for item name variations
+- **Real-time Processing**: Sub-15-second validation for operational efficiency
+- **Containerized Deployment**: Docker-based deployment with microservices architecture
+
+## Architecture Diagrams
+
+### Platform Architecture
+
+```mermaid
+graph TB
+ subgraph "Order Accuracy Platform"
+ subgraph "Dine-In Application"
+ DUI[Gradio UI :7861]
+ DAPI[FastAPI API :8083]
+ DVLM[VLM Client]
+ DSEM[Semantic Client]
+ end
+
+ subgraph "Take-Away Application"
+ TUI[Gradio UI :7860]
+ TAPI[FastAPI API :8080]
+ TSW[Station Workers]
+ TVS[VLM Scheduler]
+ TFS[Frame Selector]
+ end
+
+ subgraph "Shared Services"
+        OVMS[OVMS VLM<br/>Qwen2.5-VL-7B]
+ SEM[Semantic Service]
+ MINIO[MinIO Storage]
+ end
+ end
+
+ DUI --> DAPI
+ DAPI --> DVLM
+ DAPI --> DSEM
+ DVLM --> OVMS
+ DSEM --> SEM
+
+ TUI --> TAPI
+ TAPI --> TSW
+ TSW --> TFS
+ TSW --> TVS
+ TVS --> OVMS
+ TFS --> MINIO
+```
+
+### Dine-In Architecture
+
+```
+                        DINE-IN ORDER ACCURACY
+
+  Gradio UI            FastAPI API            Validation
+  (Port 7861)  ----->  (Port 8083)  ----->    Service
+                            |
+           +----------------+-----------------+
+           |                |                 |
+           v                v                 v
+    VLM Client        Semantic Client     Metrics
+    (Circuit          (Circuit            Collector
+     Breaker)          Breaker)
+           |                |
+           v                v
+     OVMS VLM         Semantic Service
+    (Qwen2.5-VL)
+```
+
+### Take-Away Architecture
+
+```
+                   TAKE-AWAY ORDER ACCURACY SYSTEM
+
+  RTSP Video          Frame Selector         Order Accuracy
+  Streams      ---->  Service        ---->   Service
+  (GStreamer)         (YOLO)                      |
+                                                  |
+                      +---------------------------+
+                      |                           |
+                      v                           v
+               VLM Scheduler              Validation Agent
+               (Batcher)                          |
+                      |                           |
+                      v                           v
+                OVMS VLM                  Semantic Service
+               (Qwen2.5-VL)
+```
+
+## Component Details
+
+### Core Components
+
+#### 1. VLM Backend (OVMS)
+
+OpenVINO Model Server hosting Qwen2.5-VL-7B for vision-language inference.
+
+**Features:**
+- OpenAI-compatible API (`/v3/chat/completions`)
+- INT8 quantization for optimized performance
+- GPU acceleration via Intel/NVIDIA hardware
+- Shared model instance for both applications
+
+**API Usage:**
+```python
+import requests  # assumed import; OVMS_ENDPOINT, prompt, and img_b64 must be defined
+
+response = requests.post(
+ f"{OVMS_ENDPOINT}/v3/chat/completions",
+ json={
+ "model": "Qwen/Qwen2.5-VL-7B-Instruct",
+ "messages": [
+ {
+ "role": "user",
+ "content": [
+ {"type": "text", "text": prompt},
+ {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"}}
+ ]
+ }
+ ]
+ }
+)
+```
+
+#### 2. Semantic Comparison Service
+
+AI-powered semantic matching microservice for intelligent item comparison.
+
+**Matching Strategies:**
+- **Exact**: Direct string comparison
+- **Semantic**: Vector similarity using sentence-transformers
+- **Hybrid**: Exact first, then semantic fallback
+
+**Example Matches:**
+- "Big Mac" → "Maharaja Mac" (regional name variant)
+- "green apple" → "apple" (partial match)
+- "large fries" → "french fries large" (word reordering)
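The hybrid strategy can be sketched as an exact comparison with a similarity fallback. The sketch below substitutes `difflib` string similarity for the service's sentence-transformer embeddings, so `hybrid_match` and its default threshold are illustrative assumptions rather than the production code:

```python
from difflib import SequenceMatcher

def hybrid_match(detected: str, expected: str, threshold: float = 0.85) -> bool:
    """Exact comparison first, then a similarity fallback."""
    # 1) Exact: direct case-insensitive string comparison
    if detected.strip().lower() == expected.strip().lower():
        return True
    # 2) Fallback: the real service scores embedding similarity;
    #    difflib's character-level ratio stands in for that here
    score = SequenceMatcher(None, detected.lower(), expected.lower()).ratio()
    return score >= threshold

print(hybrid_match("Large Fries", "large fries"))                  # True (exact path)
print(hybrid_match("green apple", "green apples", threshold=0.8))  # True (fallback path)
```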
+
+#### 3. Frame Selector Service (Take-Away)
+
+YOLO-based intelligent frame selection for optimal VLM input.
+
+**Process:**
+1. Receive raw video frames from GStreamer pipeline
+2. Run YOLO object detection on each frame
+3. Score frames by item visibility and clarity
+4. Select top K frames per order
+5. Store selected frames in MinIO
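Steps 3 and 4 reduce to scoring each frame and keeping the top K. A minimal sketch, with literal scores standing in for YOLO detection confidence (`select_top_k` and the dict layout are hypothetical):

```python
def select_top_k(frames, k=3):
    """Keep the k frames with the highest visibility score."""
    return sorted(frames, key=lambda f: f["score"], reverse=True)[:k]

frames = [
    {"id": "f1", "score": 0.42},  # scores would come from YOLO detections
    {"id": "f2", "score": 0.91},
    {"id": "f3", "score": 0.67},
    {"id": "f4", "score": 0.88},
]
best = select_top_k(frames, k=2)
print([f["id"] for f in best])  # ['f2', 'f4']
```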
+
+#### 4. VLM Scheduler (Take-Away)
+
+Request batching scheduler optimizing OVMS throughput.
+
+**Batching Strategy:**
+- Time Window: 50-100ms collection period
+- Max Batch Size: Configurable (default: 16)
+- Fair Scheduling: Round-robin across workers
+- Response Routing: Match responses to original requesters
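The batching strategy above can be sketched as one scheduling cycle: block for the first request, then gather more until the time window closes or the batch fills. `collect_batch` and its defaults are illustrative, not the scheduler's actual API:

```python
import queue
import time

def collect_batch(requests_q, window_s=0.08, max_batch=16):
    """One scheduling cycle: wait for the first request, then collect
    more until the time window elapses or the batch is full."""
    batch = [requests_q.get()]  # block until at least one request arrives
    deadline = time.monotonic() + window_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests_q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

q = queue.Queue()
for i in range(20):
    q.put(f"req-{i}")
print(len(collect_batch(q)))  # 16 — capped at the batch limit
```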
+
+### Docker Services
+
+#### Dine-In Services
+
+| Container | Ports | Description |
+|-----------|-------|-------------|
+| `dinein_app` | 7861, 8083 | Main application (Gradio + FastAPI) |
+| `dinein_ovms_vlm` | 8002 | Vision-Language Model server |
+| `dinein_semantic_service` | 8081 | Semantic text matching |
+| `metrics-collector` | 8084 | System metrics aggregation |
+
+#### Take-Away Services
+
+| Container | Ports | Description |
+|-----------|-------|-------------|
+| `takeaway_app` | 7860, 8080 | Main application (Gradio + FastAPI) |
+| `ovms-vlm` | 8001 | Vision-Language Model server |
+| `frame-selector` | 8085 | YOLO-based frame selection |
+| `semantic-service` | 8081 | Semantic text matching |
+| `minio` | 9000, 9001 | S3-compatible storage |
+| `rtsp-streamer` | 8554 | RTSP stream simulator (testing) |
+
+## Data Flow
+
+### Dine-In Validation Pipeline
+
+```
+                        VALIDATION PIPELINE
+
+ 1. IMAGE PREPROCESSING
+    Raw Image -> Auto-Orient -> Resize (672px) -> Enhance ->
+    Sharpen -> JPEG Compress (82%) -> Base64 Encode
+        |
+        v
+ 2. VLM INFERENCE
+    Prompt: "Analyze this food plate image..."
+      + Inventory list for context
+    -> OVMS POST /v3/chat/completions
+    -> Parse JSON response for detected items
+        |
+        v
+ 3. SEMANTIC MATCHING
+    For each expected item:
+      Find best match in detected items (similarity > 0.7)
+      Track: matched, missing, extra, quantity mismatches
+        |
+        v
+ 4. RESULT AGGREGATION
+    {
+      "order_complete": true/false,
+      "accuracy_score": 0.0-1.0,
+      "missing_items": [...],
+      "extra_items": [...]
+    }
+```
+
+### Take-Away Processing Pipeline
+
+```
+                        DATA FLOW PIPELINE
+
+ 1. VIDEO CAPTURE
+    RTSP Camera --> GStreamer Pipeline --> Frame Buffer
+
+ 2. FRAME SELECTION
+    Frame Selector (YOLO):
+      - Object detection on raw frames
+      - Score frames by item visibility
+      - Select top K frames per order
+      - Store selected frames in MinIO
+
+ 3. VLM PROCESSING
+    VLM Scheduler -> OVMS (Qwen2.5-VL):
+      - Batch frames by time window
+      - Send to OVMS with detection prompt
+      - Parse structured item response
+
+ 4. ORDER VALIDATION
+    Validation Agent:
+      - Compare detected items with expected order
+      - Exact match -> Semantic match -> Flag mismatch
+      - Generate validation result
+
+ 5. RESULT OUTPUT
+    { "matched": [...], "missing": [...], "extra": [...] }
+```
+
+## Production Features
+
+### Circuit Breaker Pattern
+
+Prevents cascading failures when external services are unhealthy.
+
+```
+   CLOSED     --(5 consecutive failures)-->  OPEN
+   OPEN       --(30s timeout)-->             HALF_OPEN
+   HALF_OPEN  --(2 successes)-->             CLOSED
+   HALF_OPEN  --(1 failure)-->               OPEN
+```
+
+**Configuration:**
+- VLM Client: 5 failures → OPEN, 30s recovery → HALF_OPEN
+- Semantic Client: 15s recovery timeout (faster than VLM)
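A minimal sketch of the state machine described above, using the VLM client thresholds (5 failures, 30s recovery, 2 successes to close); the class and method names are assumptions, not the production implementation:

```python
import time

class CircuitBreaker:
    """Minimal CLOSED -> OPEN -> HALF_OPEN -> CLOSED cycle."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0,
                 success_threshold=2):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.recovery_timeout:
                self.state = "half_open"  # let a probe request through
                self.successes = 0
                return True
            return False
        return True

    def record_success(self):
        if self.state == "half_open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0  # any success resets the failure streak

    def record_failure(self):
        if self.state == "half_open":
            self._trip()  # one failure while probing reopens the breaker
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._trip()

    def _trip(self):
        self.state = "open"
        self.opened_at = time.monotonic()

cb = CircuitBreaker()
for _ in range(5):
    cb.record_failure()
print(cb.state)  # open
```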
+
+### Connection Pooling
+
+```python
+import httpx  # required import for the pool configuration below
+
+# VLM Client Pool Configuration
+limits = httpx.Limits(
+ max_keepalive_connections=20,
+ max_connections=50,
+ keepalive_expiry=30.0
+)
+timeout = httpx.Timeout(
+ connect=10.0,
+ read=300.0, # Extended for VLM inference
+ write=10.0,
+ pool=10.0
+)
+```
+
+### Bounded Cache (LRU)
+
+Thread-safe LRU cache with automatic eviction to prevent memory exhaustion:
+- Maximum 10,000 entries
+- Automatic eviction of oldest entries when full
+- Thread-safe operations with locking
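A sketch of such a cache built on `collections.OrderedDict` with a lock; the class name and API are assumptions, shown here only to make the eviction behavior concrete:

```python
import threading
from collections import OrderedDict

class BoundedLRUCache:
    """Thread-safe LRU cache that evicts the least recently used
    entry once the size limit is reached."""

    def __init__(self, max_entries=10_000):
        self.max_entries = max_entries
        self._data = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            if key not in self._data:
                return default
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key, value):
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self.max_entries:
                self._data.popitem(last=False)  # evict least recently used

cache = BoundedLRUCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # refresh "a"
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # None
```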
+
+### Station Worker Reliability (Take-Away)
+
+| Feature | Implementation |
+|---------|----------------|
+| GStreamer Pipeline | RTSP → H.264 decode → Frame capture |
+| Circuit Breaker | 5 failures in 5 min → 30s cooldown |
+| Exponential Backoff | 2s → 4s → 8s → ... → 60s max |
+| Stall Detection | No frames for 5 min triggers restart |
+| Health Monitoring | Frame rate, pipeline state tracking |
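The exponential backoff row can be expressed as a small helper (a sketch, assuming delays double from a 2s base and are capped at 60s):

```python
def backoff_delays(base=2.0, cap=60.0, attempts=7):
    """Doubling retry delays, capped at `cap` seconds."""
    return [min(base * (2 ** i), cap) for i in range(attempts)]

print(backoff_delays())  # [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```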
+
+### Performance Characteristics
+
+| Metric | Dine-In | Take-Away |
+|--------|---------|-----------|
+| **End-to-End Latency** | 8-15 seconds | Real-time stream |
+| **VLM Inference** | 5-10 seconds | 5-10 seconds (batched) |
+| **Semantic Matching** | 50-200ms | 50-200ms |
+| **Throughput** | ~4-6 req/min | Multiple concurrent streams |
+| **GPU Utilization** | 60-80% | 70-90% (parallel mode) |
diff --git a/docs_src/use-cases/order-accuracy/getting_started.md b/docs_src/use-cases/order-accuracy/getting_started.md
index 97fc173..d062f31 100644
--- a/docs_src/use-cases/order-accuracy/getting_started.md
+++ b/docs_src/use-cases/order-accuracy/getting_started.md
@@ -1,117 +1,193 @@
# Getting Started
-### **NOTE:**
+## Prerequisites
-By default the application runs by pulling the pre-built images. If you want to build the images locally and then run the application, set the flag:
+- Ubuntu 24.04 or newer, Desktop edition (or Server edition with a GUI installed)
+- [Docker](https://docs.docker.com/engine/install/) 24.0+
+- [Docker Compose](https://docs.docker.com/compose/install/) V2+
+- [Make](https://www.gnu.org/software/make/) (`sudo apt install make`)
+- Intel hardware (CPU, iGPU, dGPU)
+- Intel drivers:
+ - [Intel GPU drivers](https://dgpu-docs.intel.com/driver/client/overview.html)
+- Sufficient disk space for models, videos, and results (50GB minimum)
-```bash
-REGISTRY=false
+!!! note
+ First-time setup downloads AI models (~7GB) and Docker images - this may take 30-60 minutes depending on your internet connection.
-usage: make REGISTRY=false (applicable for all commands like benchmark, benchmark-stream-density..)
-Example: make run-demo REGISTRY=false
-```
+## Choose Your Application
-(If this is the first time, it will take some time to download videos, models, docker images and build images)
+### Dine-In Order Accuracy
+**Purpose**: Validate food plates at serving stations before delivery to tables
+**Use When**: You need image-based validation for restaurant table service
+**Input**: Static images of food trays/plates
+**Features**: Gradio web interface, REST API for POS integration
-## Step by step instructions:
+### Take-Away Order Accuracy
+**Purpose**: Real-time order validation for drive-through and counter service
+**Use When**: You need continuous video stream validation at multiple stations
+**Input**: RTSP video streams
+**Features**: Multi-station parallel processing, VLM request batching
-1. Download the models using download_models/downloadModels.sh
+## Quick Start Reference
- ```bash
- make download-models
- ```
+### Dine-In Quick Commands
-2. Update github submodules
+| Configuration | Command | Description |
+|---------------|---------|-------------|
+| **Start Services** | `make up` | Start all dine-in services |
+| **Build Locally** | `make build REGISTRY=false` | Build images from source |
+| **View Logs** | `make logs` | View service logs |
+| **Stop Services** | `make down` | Stop all containers |
- ```bash
- make update-submodules
- ```
+### Take-Away Quick Commands
-3. Download sample videos used by the performance tools
+| Configuration | Command | Description |
+|---------------|---------|-------------|
+| **Single Mode** | `make up` | Start in single worker mode (development) |
+| **Parallel Mode** | `make up-parallel WORKERS=4` | Start with 4 parallel workers (production) |
+| **Build Locally** | `make build REGISTRY=false` | Build images from source |
+| **View Logs** | `make logs` | View service logs |
- ```bash
- make download-sample-videos
- ```
+!!! tip
+ **Single Mode** is best for development and testing. **Parallel Mode** is recommended for production with multiple camera stations.
+
+## Step-by-Step Instructions
-4. Run the order accuracy application
+### Option 1: Dine-In Order Accuracy
+1. **Clone the Repository**
```bash
- make run-render-mode
+   git clone -b <tag> --single-branch https://github.com/intel-retail/order-accuracy
```
-
-- The above series of commands can be executed using only one command:
-
- ```bash
- make run-demo
- ```
-
-5. To build the images locally step by step:
-- Follow the following steps:
- ```bash
- make download-models REGISTRY=false
- make update-submodules REGISTRY=false
- make download-sample-videos
- make run-render-mode REGISTRY=false
- ```
-- The above series of commands can be executed using only one command:
+   > Replace `<tag>` with the version you want to clone (for example, **v2026.0**).
```bash
- make run-demo REGISTRY=false
+ git clone -b v2026.0 --single-branch https://github.com/intel-retail/order-accuracy
+ cd order-accuracy/dine-in
```
-6. Verify Docker containers
-
+2. **Setup OVMS Models (First Time Only)**
```bash
- docker ps --all
+ cd ../ovms-service
+ ./setup_models.sh
+ cd ../dine-in
```
- Result:
+ This downloads and converts the Qwen2.5-VL-7B model (~7GB). This only needs to be done once.
+
+3. **Prepare Test Data**
+ - Add your food tray/plate images to the `images/` folder
+ - Update `configs/orders.json` with test orders
+ - Update `configs/inventory.json` with your menu items
+
+4. **Build and Start Services**
```bash
- NAMES STATUS IMAGE
- src-ClientGst-1 Up 17 seconds (healthy) dlstreamer:dev
- model-downloader Exited(0) 17 seconds model-downloader:latest
+ # Using pre-built images (recommended for first run)
+ make build
+ make up
+
+ # OR build locally from source
+ make build REGISTRY=false
+ make up
```
-7. Verify Results
+5. **Access the Application**
+ - **Gradio UI**: http://localhost:7861
+ - **REST API Docs**: http://localhost:8083/docs
- After starting Order Accuracy you will begin to see result files being written into the results/ directory. Here are example outputs from the 3 log files.
+---
- gst-launch_