Status: Architecture Whitepaper · PoC in progress · Rust/Swift contributors welcome
An AI-native Operating System that treats agents as kernel processes, not chat prompts.
Full architecture — all GAPs resolved. Agent Registry, 3-Bus communication, Observability, Zero Trust Security, Multi-Tenancy, and the QLoRA self-improvement loop — all integrated.
In 2025, the state-of-the-art for working with AI agents looked like this:
```
.claude/agents/
├── engineering/
│   ├── frontend-developer.md
│   └── backend-architect.md
└── testing/
    └── api-tester.md
```
You manually create .md files for each "role", ask Claude to read a plan before coding, clean context between sessions, and fix prompts by hand when the agent makes mistakes.
These are good heuristics. They are also artisanal workarounds for the absence of a real system.
AI OS replaces every one of those workarounds with a protocol.
| Manual Approach (2025 Best Practice) | AI OS (Magellano Architecture) |
|---|---|
| Create a .md file per agent role | 3-Tier Taxonomy (Macro/Meso/Micro) — agents spawn dynamically from Registry |
| Ask Claude to "plan first" | Planner Agent generates a mathematical DAG before any execution |
| Maintain CLAUDE.md for project memory | 4-Layer State: Working Memory → Vector Store → Knowledge Graph (Neo4j) → Persistent Store |
| Write "Constraints" sections in prompts | Zero Trust Security + ED25519-signed messages + Sandbox execution (Docker/Wasm) |
| Open a new chat to "clean context" | Session State managed by Kernel — KV cache allocated/deallocated automatically |
| Separate test agent from coding agent | Critic Service integrated in execution loop — output blocked if Quality Score < 0.85 |
| Human does the git commit | Escalation Policy: system self-recovers, escalates to human only for Tier 1 decisions |
| Fix the .md file when agent fails | Nightly QLoRA fine-tuning loop — model retrains on its own errors while you sleep |
The last row is the gap that matters most. The manual approach requires a human to notice the error, open a file, write a correction, and hope it generalizes. The OS collects feedback during the session, builds a training batch, and fine-tunes the base model overnight. The next morning, the error no longer exists — in the weights.
┌─────────────────────────────────────────────────────────┐
│ AI OS KERNEL (Rust) │
│ │
│ ┌─────────────┐ ┌──────────────────────────────┐ │
│ │Task Scheduler│ │ Agent Swarm Orchestrator │ │
│ │Resource Mgr │◄──►│ Planner · Executor · Critic  │ │
│ │Context Mgr │ │ Memory · Interface Agent │ │
│ │Model Loader │ └──────────────────────────────┘ │
│ │Tool Registry│ │ │
│ └─────────────┘ ┌─────────▼──────────────────────┐ │
│ │ 3-Bus Communication │ │
│ │ gRPC(1-10ms) · NATS(10-100ms) │ │
│ │ SharedMem(<1µs · zero-copy) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
│ Metal API / CoreML
┌────────▼────────────────────────────────────────────────┐
│ Magellano Inference Engine (Swift) │
│ 3.3B params · Apple Silicon · NF4 quant · 50-100ms │
│ QLoRA nightly loop · A/B gate (≥5%) · Hot-swap deploy │
└─────────────────────────────────────────────────────────┘
The initial HLD showed 5 kernel services + 5 agents + a generic "Shared Message Bus". During TDD development, 8 architectural gaps were identified and resolved:
| Gap | What was missing | Resolution |
|---|---|---|
| GAP-01 | No observability platform | Prometheus+Thanos, OTel+Jaeger, Loki, Grafana — 6 SLO alerts |
| GAP-03 | No data pipeline | CQRS + Event Sourcing on NATS JetStream, 3 processing stages |
| GAP-04 | No deployment strategy | Phase α Docker Compose → β K3s → γ Enterprise, GitHub Actions CI/CD |
| GAP-05 | No multi-tenancy | Namespace isolation, Free/Pro/Enterprise plans, event-driven billing |
| GAP-10 | No accessibility | WCAG AA, Voice I/O, AccessibilityChecker in Critic Agent |
| GAP-11 | "Agents" was a flat list | 3-Tier Taxonomy: Macro (5) → Meso (6) → Micro (millions) + Agent Registry |
| GAP-12 | Magellano boundary undefined | ADR-003: clear Rust↔Swift contract via InferenceBackend HAL trait |
| GAP-13 | No inference abstraction | InferenceBackend Rust trait (7 methods), 4 routing policies, fallback chain |
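GAP-13's `InferenceBackend` trait is the seam between the Rust kernel and any inference engine. The TDD defines the actual seven-method contract; the sketch below is a simplified illustration with assumed method names, showing only how a trait object plus a fallback chain (one of the routing policies mentioned above) might fit together.

```rust
// Hypothetical sketch of an InferenceBackend HAL trait (GAP-13).
// Method names and signatures are illustrative assumptions, NOT the
// real seven-method contract from TDD v5.1 / ADR-003.
trait InferenceBackend {
    fn name(&self) -> &str;
    fn is_healthy(&self) -> bool;
    fn infer(&self, prompt: &str) -> Result<String, String>;
}

// A fallback chain: try each backend in priority order, skipping
// unhealthy ones, as a routing policy might.
fn infer_with_fallback(
    backends: &[Box<dyn InferenceBackend>],
    prompt: &str,
) -> Result<String, String> {
    for b in backends {
        if b.is_healthy() {
            if let Ok(out) = b.infer(prompt) {
                return Ok(out);
            }
        }
    }
    Err("all backends failed".into())
}

// Toy backend standing in for Magellano, used only to exercise the chain.
struct MockBackend {
    healthy: bool,
}

impl InferenceBackend for MockBackend {
    fn name(&self) -> &str { "mock" }
    fn is_healthy(&self) -> bool { self.healthy }
    fn infer(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

fn main() {
    let backends: Vec<Box<dyn InferenceBackend>> = vec![
        Box::new(MockBackend { healthy: false }), // skipped: unhealthy
        Box::new(MockBackend { healthy: true }),
    ];
    let out = infer_with_fallback(&backends, "ping").unwrap();
    println!("{out}"); // echo: ping
}
```

Because callers only see the trait, the kernel can swap Magellano for a remote or CPU backend without touching orchestration code.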
Rust Kernel — Memory safety without GC, predictable latency, native async with Tokio. The kernel orchestrates agents as processes, not chat turns.
Swift + Metal Inference (Magellano) — Native Apple Silicon execution. Inference latency ~50-100ms on M-series. 3.3B parameters, NF4 quantized (6.6GB → 1.7GB RAM).
FIPA-ACL over gRPC — Agents communicate via a formal Agent Communication Language. Every message is typed, versioned, and ED25519-signed. No prompt injection possible at the protocol layer.
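A typed, versioned envelope is what makes "no prompt injection at the protocol layer" enforceable: the kernel rejects anything that is not a well-formed, signed message. The sketch below is a minimal illustration with assumed field names; the real system signs with ED25519, which is stubbed here as an opaque byte field to keep the example dependency-free.

```rust
// Illustrative FIPA-ACL-style message envelope. Field names are
// assumptions; the signature is a stub for the real ED25519 signature.
#[derive(Debug, Clone, PartialEq)]
enum Performative {
    Request,
    Inform,
    Refuse,
    Failure,
}

#[derive(Debug)]
struct AclMessage {
    performative: Performative,
    sender: String,
    receiver: String,
    protocol_version: u16,
    content: String,
    signature: Vec<u8>, // ED25519 signature over the canonical payload
}

impl AclMessage {
    // Stand-in for real verification: the kernel would check the
    // ED25519 signature against the sender's registered public key
    // before routing. Here we only check structural validity.
    fn is_well_formed(&self) -> bool {
        !self.signature.is_empty() && self.protocol_version >= 1
    }
}

fn main() {
    let msg = AclMessage {
        performative: Performative::Request,
        sender: "planner".into(),
        receiver: "executor".into(),
        protocol_version: 1,
        content: "run task".into(),
        signature: vec![0u8; 64], // placeholder, real one comes from signing
    };
    assert!(msg.is_well_formed());
}
```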
3-Bus Architecture — Control traffic (gRPC, 1-10ms), data/results (NATS, 10-100ms), tensor/embeddings (Shared Memory, <1µs zero-copy DMA). Mixing these causes head-of-line blocking — they are strictly separated.
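The separation rule above can be expressed as a total function from traffic class to bus: each class is pinned to exactly one bus, so a burst of tensor transfers can never head-of-line-block a control message. A minimal sketch (the type names are illustrative; the mapping is the one stated in the text):

```rust
// Sketch of the strict traffic-to-bus separation. Types are
// illustrative; the latency figures come from the architecture text.
#[derive(Debug, PartialEq)]
enum Bus {
    Grpc,      // control plane, 1-10ms
    Nats,      // data/results, 10-100ms
    SharedMem, // tensors/embeddings, <1µs zero-copy
}

#[derive(Debug)]
enum Traffic {
    Control,
    DataResult,
    Tensor,
}

// Exhaustive match: the compiler guarantees no traffic class is left
// unrouted, and no class can end up on two buses.
fn route(t: &Traffic) -> Bus {
    match t {
        Traffic::Control => Bus::Grpc,
        Traffic::DataResult => Bus::Nats,
        Traffic::Tensor => Bus::SharedMem,
    }
}

fn main() {
    assert_eq!(route(&Traffic::Control), Bus::Grpc);
    assert_eq!(route(&Traffic::Tensor), Bus::SharedMem);
}
```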
4-Layer Memory — Context retrieval uses semantic RAG (dense + sparse FAISS, top-K=5, reranked in ~80ms total), not file reads. The Knowledge Graph (Neo4j) stores relationships that flat vector search cannot represent.
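The dense + sparse retrieval step can be sketched as score fusion over the union of candidates, then top-K truncation. This is a toy stand-in, assuming equal fusion weights and plain maps in place of FAISS indexes and the reranker:

```rust
use std::collections::HashMap;

// Toy sketch of hybrid (dense + sparse) retrieval with top-K selection.
// Equal 0.5/0.5 fusion weights are an assumption; FAISS and the
// reranking step are elided.
fn hybrid_top_k(
    dense: &HashMap<&str, f64>,
    sparse: &HashMap<&str, f64>,
    k: usize,
) -> Vec<String> {
    // Fuse over the union of candidates so a doc found by only one
    // retriever is still considered.
    let mut fused: HashMap<&str, f64> = HashMap::new();
    for (doc, s) in dense {
        *fused.entry(*doc).or_insert(0.0) += 0.5 * s;
    }
    for (doc, s) in sparse {
        *fused.entry(*doc).or_insert(0.0) += 0.5 * s;
    }
    let mut ranked: Vec<(&str, f64)> = fused.into_iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    ranked.truncate(k); // top-K=5 in the architecture; parameterized here
    ranked.into_iter().map(|(doc, _)| doc.to_string()).collect()
}

fn main() {
    let dense = HashMap::from([("doc_a", 0.9), ("doc_b", 0.4)]);
    let sparse = HashMap::from([("doc_b", 0.8), ("doc_c", 0.7)]);
    println!("{:?}", hybrid_top_k(&dense, &sparse, 2));
}
```

Relationship queries ("which agent produced the artifact this plan depends on?") go to the Knowledge Graph instead, since flat vector similarity cannot express them.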
Critic-in-the-Loop — No output leaves the system without a Quality Score ≥ 0.85 across 5 weighted dimensions. The Critic is not a prompt; it is a kernel service with veto power.
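The veto gate reduces to a weighted mean over the five dimensions checked against the threshold. In this minimal sketch only the 0.85 threshold comes from the text; the dimension weights are hypothetical:

```rust
// Minimal sketch of the Critic gate: weighted mean over five quality
// dimensions, veto below 0.85. The threshold is from the text; the
// weights in main() are hypothetical.
const QUALITY_THRESHOLD: f64 = 0.85;

fn quality_score(scores: &[f64; 5], weights: &[f64; 5]) -> f64 {
    let total: f64 = weights.iter().sum();
    scores.iter().zip(weights).map(|(s, w)| s * w).sum::<f64>() / total
}

// The Critic has veto power: a false return blocks the output entirely.
fn critic_approves(scores: &[f64; 5], weights: &[f64; 5]) -> bool {
    quality_score(scores, weights) >= QUALITY_THRESHOLD
}

fn main() {
    let weights = [0.3, 0.25, 0.2, 0.15, 0.1]; // hypothetical weights
    assert!(critic_approves(&[0.9, 0.9, 0.88, 0.86, 0.9], &weights));
    assert!(!critic_approves(&[0.7, 0.8, 0.8, 0.8, 0.8], &weights));
}
```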
Two-Loop Learning
- Fast loop: Feedback updates Vector Store and KG in-session (real-time)
- Slow loop: Nightly QLoRA training (rank 16, 7 target modules, ~0.5% trainable parameters) → ~35MB adapter → A/B gate (≥5% improvement required) → hot-swap, zero downtime
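The end of the slow loop is the A/B gate plus hot-swap. Only the ≥5% improvement bar comes from the text; the metric and adapter naming below are illustrative:

```rust
// Sketch of the slow-loop promotion step. The >=5% relative
// improvement bar is from the text; scores and adapter names are
// illustrative assumptions.
fn passes_ab_gate(baseline_score: f64, candidate_score: f64) -> bool {
    candidate_score >= baseline_score * 1.05
}

// Hot-swap: if the gate passes, the new ~35MB adapter replaces the
// active one in place, with no downtime. Returns whether it swapped.
fn maybe_hot_swap(
    active_adapter: &mut String,
    candidate: String,
    baseline_score: f64,
    candidate_score: f64,
) -> bool {
    if passes_ab_gate(baseline_score, candidate_score) {
        *active_adapter = candidate;
        true
    } else {
        false // candidate discarded; baseline keeps serving
    }
}

fn main() {
    let mut active = String::from("adapter_v1");
    // +7% relative improvement clears the gate
    assert!(maybe_hot_swap(&mut active, "adapter_v2".into(), 0.80, 0.856));
    assert_eq!(active, "adapter_v2");
    // +2% does not
    assert!(!maybe_hot_swap(&mut active, "adapter_v3".into(), 0.80, 0.816));
}
```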
| Phase | Component | Duration |
|---|---|---|
| Input parse | Interface Agent | 15ms |
| Registry lookup | Agent Registry | 5ms |
| RAG retrieval | Memory Agent | ~80ms |
| Planning (DAG) | Planner Agent | ~180ms |
| Inference | Magellano / Metal | 50–100ms |
| Validation | Critic Service | ~120ms |
| Execution + output | Executor Agent | ~4.3s |
| TOTAL E2E | Happy path | ~4.8s |
| Nightly QLoRA | Full cycle | ~45min |
```
/
├── README.md
├── CONTRIBUTING.md
├── docs/
│   ├── AI_OS_TDD_v5_1.md            # Full Technical Design Document
│   │                                #   (sequence diagrams: timing, error recovery, QLoRA loop)
│   └── ADR/                         # Architecture Decision Records (ADR-001 to ADR-004)
├── diagrams/
│   ├── ai_os_manifesto_v51.html     # ← Full v5.1 architecture (this README's header)
│   ├── ai_os_architecture.png       # Original HLD (pre-gap-resolution reference)
│   ├── ai_os_kernel_exploded.png
│   ├── ai_os_agent_swarm_orchestrator.png
│   ├── ai_os_security_layer.png
│   ├── ai_os_memory_agent.png
│   └── *.html                       # Interactive Mermaid viewers
└── poc/                             # (in progress) Rust kernel skeleton
```
| Component | Status |
|---|---|
| Architecture (TDD v5.1) | ✅ Complete (~4,600 lines) |
| 8 GAPs identified and resolved | ✅ All closed |
| Sequence Diagrams (5 scenarios) | ✅ Complete — Addendum C.1–C.4 |
| ADR-001–004 | ✅ Complete |
| QLoRA Learning Path | ✅ Complete |
| Multi-Tenancy model | ✅ Complete |
| Security Threat Model (10 threats) | ✅ Complete |
| Rust Kernel PoC | 🔄 In progress |
| Magellano Swift inference | 🔄 Design phase |
| Integration tests | ⏳ Pending PoC |
Looking for contributors with experience in:
- Rust (async, Tokio, tonic/gRPC) — Kernel and orchestration layer
- Swift + Metal / CoreML — Magellano inference engine
- FIPA-ACL / multi-agent systems — Protocol implementation
Read CONTRIBUTING.md before opening issues. The TDD v5.1 is the source of truth — start from the relevant section and open a discussion issue before implementing.
Architecture documentation: Creative Commons BY-SA 4.0
Code (when published): Apache 2.0
Technical Partners: Claude Opus-Sonnet (Anthropic) | Kimi Thinking (Moonshot AI) | Research Lead: Alessandro La Gamba
Alessandro La Gamba — Senior System Engineer | AI/ML Researcher | 25+ years experience | distributed systems and edge AI
Version: v1 | Status: DEV | February 2026

"The best prompt engineering is no prompt engineering."