# Claude Code Handoff: UniFi Layer Fabric Deployment

## Context

You're continuing a design session where we built a **serverless AI layer fabric** for UniFi network management. The architecture treats every capability (routing, reasoning, execution, memory) as an independent Kubernetes layer; apart from two always-on services (the activator and Qdrant), each layer scales to zero via KEDA.

This replaces the traditional "one big LLM" approach with composable layers that wake on demand.

## What's Been Built

The `unifi-layer-fabric/` directory contains:

```
unifi-layer-fabric/
├── README.md                    # Architecture overview
├── docs/
│   └── QUICKSTART.md            # Deployment guide
├── argocd/
│   ├── applicationset.yaml      # GitOps deployment
│   └── secrets.yaml             # Credential templates
└── charts/
    ├── cortex-activator/        # Always-on query router
    ├── cortex-qdrant/           # Always-on vector memory
    ├── reasoning-classifier/    # Qwen2-0.5B (fast classification)
    ├── reasoning-slm/           # Phi-3-3.8B (tool calling)
    ├── execution-unifi-api/     # UniFi API operations
    ├── execution-unifi-ssh/     # SSH failover/diagnostics
    └── cortex-telemetry/        # Metrics, audit, learning
```

## What Needs To Be Done

### 1. Build Container Images

The Helm charts reference the following images, which still need source code and Dockerfiles:

| Image | Purpose | Language |
|-------|---------|----------|
| `ghcr.io/ry-ops/cortex-activator` | Query routing, layer orchestration | Python/FastAPI or Go |
| `ghcr.io/ry-ops/unifi-action-engine` | UniFi API wrapper | Python |
| `ghcr.io/ry-ops/unifi-ssh-gateway` | SSH command execution | Python |
| `ghcr.io/ry-ops/cortex-telemetry` | Metrics collection, Qdrant writes | Python |

### 2. Push to GitHub

Create the repo: `github.com/ry-ops/unifi-layer-fabric`

Structure:
```
unifi-layer-fabric/
├── src/
│   ├── activator/           # Cortex Activator source
│   ├── action-engine/       # UniFi Action Engine source
│   ├── ssh-gateway/         # SSH Gateway source
│   └── telemetry/           # Telemetry collector source
├── charts/                  # Helm charts (already built)
├── argocd/                  # ArgoCD manifests
├── .github/workflows/       # CI/CD for building images
└── environments/
    └── production/          # Production value overrides
```
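For `.github/workflows/`, a build-and-push sketch might look like the following. This is a starting point, not an existing file: the workflow name, action versions, and the directory-to-image mapping are assumptions based on the layout above.

```yaml
name: build-images

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # needed to push to ghcr.io with GITHUB_TOKEN
    strategy:
      matrix:
        include:
          - dir: activator
            image: cortex-activator
          - dir: action-engine
            image: unifi-action-engine
          - dir: ssh-gateway
            image: unifi-ssh-gateway
          - dir: telemetry
            image: cortex-telemetry
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: src/${{ matrix.dir }}
          push: true
          tags: ghcr.io/ry-ops/${{ matrix.image }}:latest
```

One matrix job per image keeps the four builds parallel and lets a single failing Dockerfile fail only its own leg.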

### 3. Configure ArgoCD

```bash
# Add the ApplicationSet to ArgoCD
kubectl apply -f argocd/applicationset.yaml
```

### 4. Create Secrets

Replace the UPPERCASE placeholders with real values before running:

```bash
kubectl create namespace cortex-unifi

kubectl create secret generic unifi-credentials \
  --namespace cortex-unifi \
  --from-literal=api-key="SITE_MANAGER_API_KEY" \
  --from-literal=controller-host="https://UDM_PRO_IP" \
  --from-literal=controller-username="admin" \
  --from-literal=controller-password="CONTROLLER_PASSWORD" \
  --from-literal=ssh-host="UDM_PRO_IP" \
  --from-literal=ssh-username="root" \
  --from-literal=ssh-password="SSH_PASSWORD"
```
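On the consuming side, the execution charts presumably project this secret into the pod environment. A minimal sketch of how the action engine could assemble a controller request from it — the env var names, header, and path here are assumptions, so adjust them to however the chart actually maps the secret keys:

```python
import os


def unifi_request_config(path: str) -> dict:
    """Build the pieces of a UniFi controller API request from env vars
    assumed to be mapped from the `unifi-credentials` secret.

    Nothing is sent here; the caller passes the result to its HTTP client.
    """
    host = os.environ["UNIFI_CONTROLLER_HOST"].rstrip("/")
    return {
        "url": f"{host}{path}",
        # The UniFi integration API authenticates with an X-API-KEY header.
        "headers": {"X-API-KEY": os.environ["UNIFI_API_KEY"]},
        # UDM Pro controllers commonly serve self-signed certs, so the
        # caller will likely need verify=False or a pinned CA bundle.
        "verify_tls": False,
    }
```

Keeping the request assembly separate from the HTTP call makes the credential wiring testable without touching the controller.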

### 5. Monitor Deployment

```bash
# Watch ArgoCD sync
argocd app list | grep unifi

# Watch pods
kubectl get pods -n cortex-unifi -w

# Check KEDA ScaledObjects
kubectl get scaledobjects -n cortex-unifi
```
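The `scaledobjects` listed above are what drive scale-to-zero. For orientation, a minimal sketch of what one should roughly look like for the SLM layer — the trigger type, Prometheus address, and metric name are assumptions; the real definitions live in the chart templates:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: reasoning-slm
  namespace: cortex-unifi
spec:
  scaleTargetRef:
    name: reasoning-slm       # Deployment the chart creates
  minReplicaCount: 0          # scale to zero when idle
  maxReplicaCount: 1
  cooldownPeriod: 300         # seconds idle before scaling back down
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(rate(cortex_slm_requests_total[1m]))
        threshold: "1"
```

If a layer never wakes, compare the trigger query here against what the activator actually emits.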

### 6. Test

```bash
# Port-forward to the activator
kubectl port-forward svc/cortex-activator -n cortex-unifi 8080:8080

# Test query
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query": "List all clients on the network"}'
```

## Architecture Summary

```
┌─────────────────────────────────────────────────────────────────────────┐
│                               USER QUERY                                │
└─────────────────────────────┬───────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                  CORTEX ACTIVATOR (Always On, ~128MB)                   │
│  1. Keyword match  → direct to execution layer (90% of queries)         │
│  2. Ambiguous      → wake classifier layer                              │
│  3. Complex        → wake SLM reasoning layer                           │
└─────────────────────────────┬───────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   REASONING   │     │    QDRANT     │     │   EXECUTION   │
│  (Scale 0→1)  │     │  (Always On)  │     │  (Scale 0→1)  │
├───────────────┤     ├───────────────┤     ├───────────────┤
│ • Classifier  │     │ • Operations  │     │ • UniFi API   │
│   (0.5B)      │     │ • Configs     │     │ • SSH Gateway │
│ • SLM (3.8B)  │     │ • Patterns    │     │               │
└───────────────┘     └───────────────┘     └───────────────┘
```
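The activator's three-tier decision above can be sketched as plain Python. The keyword sets and tier names are illustrative only — the real routing rules live in `charts/cortex-activator/values.yaml`:

```python
# Tiered router: cheapest check first, waking heavier layers only when needed.
EXECUTION_KEYWORDS = {"list", "show", "clients", "devices", "status", "firmware"}
COMPLEX_MARKERS = {"why", "diagnose", "compare", "optimize"}


def route(query: str) -> str:
    words = set(query.lower().split())
    # Complexity markers win even if execution keywords are also present,
    # so "why do clients drop" escalates instead of short-circuiting.
    if words & COMPLEX_MARKERS:
        return "reasoning-slm"          # wake the 3.8B tool-calling model
    if words & EXECUTION_KEYWORDS:
        return "execution"              # ~90% of traffic: straight to API/SSH
    return "reasoning-classifier"       # ambiguous: wake the 0.5B classifier
```

The ordering encodes the cost model: a set intersection is free, the classifier is cheap, and the SLM is the expensive last resort.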

## Memory Profile

| State | Memory | What's Running |
|-------|--------|----------------|
| **Idle** | ~640MB | Activator + Qdrant |
| **Simple Query** | ~1GB | + Execution layer |
| **Complex Query** | ~4GB | + SLM reasoning |
| **Full Active** | ~4.5GB | All layers warm |

## Key Files to Review

1. `charts/cortex-activator/values.yaml` - Routing rules, layer endpoints
2. `charts/reasoning-slm/values.yaml` - Model config, system prompt
3. `charts/execution-unifi-api/values.yaml` - Action definitions
4. `charts/execution-unifi-ssh/values.yaml` - Allowed SSH commands
5. `argocd/applicationset.yaml` - GitOps deployment config
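Behind the allowed-SSH-commands list (item 4), the gateway needs a gate before anything reaches the UDM Pro. A minimal sketch, assuming the allowlist is matched on the command's first token — the entries shown are placeholders for whatever `charts/execution-unifi-ssh/values.yaml` actually declares:

```python
import shlex

# Illustrative allowlist; the real one comes from the chart values.
ALLOWED_COMMANDS = {"uptime", "ubnt-device-info", "ip", "cat"}


def is_allowed(command: str) -> bool:
    """Reject any command whose first token isn't allowlisted, plus
    shell metacharacters that could chain or substitute a second command."""
    if any(ch in command for ch in ";|&$`"):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:          # unbalanced quotes etc.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS
```

First-token matching plus a metacharacter reject is deliberately blunt: for a root session on the gateway to a UDM Pro, false negatives are far cheaper than false positives.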

## Environment Details

- **Cluster**: k3s on Proxmox (7 nodes)
- **RAM**: 64GB total, ~8-12GB available for Cortex
- **CPU**: 20 cores
- **GPU**: None (CPU inference only)
- **Storage**: Longhorn CSI
- **GitOps**: ArgoCD
- **UniFi**: 1 site, UDM Pro

## After Deployment: Blog Post

Once it's running, create a blog post for ry-ops.dev covering:

1. **The Journey** - From monolithic LLM to composable layers
2. **Why Serverless AI** - Cost savings, resource efficiency
3. **Architecture Deep Dive** - How layers communicate
4. **Real Performance** - Cold start times, memory usage
5. **Learning Loop** - How it improves over time
6. **What's Next** - Extending to other MCP servers

## Questions for Ryan

Before deploying, confirm:

1. UDM Pro IP address for `controller-host` and `ssh-host`
2. Site Manager API key (from the ui.com account)
3. Controller admin credentials
4. SSH credentials for the UDM Pro
5. GitHub repo name (suggested: `ry-ops/unifi-layer-fabric`)
6. Container registry (suggested: `ghcr.io/ry-ops/`)