Unified LLM API Gateway: aggregates multiple LLM providers behind a single API with caching, rate limiting, and observability.
- docker-compose up --build
- Gateway: http://localhost:8080/v1/llm/chat
- Admin: http://localhost:3000
- gateway (Go): edge gateway
- adapters: provider adapters
- admin (NestJS): API key & usage dashboard
- infra: docker-compose / k8s manifests
- Gateway unit tests: cd gateway && go test ./...
- Admin tests: cd admin && npm test
Unified LLM API Gateway: a scalable API gateway that aggregates and normalizes calls to multiple LLM backends (OpenAI, HF Inference, self-hosted models), with caching, rate limiting, logging, metrics, and deployment manifests for Docker/Kubernetes.
- API Gateway (edge): accepts client requests and handles auth, routing, request transforms, and aggregation/fan-out to LLM backends.
- LLM Adapters (microservices): small services that wrap each provider's API (OpenAI, Hugging Face, etc.) to expose a unified internal interface.
- Cache layer: Redis (result caching, cache keys based on prompt+params).
- Rate limiter: Redis-based leaky-bucket or token-bucket (shared across instances).
- Auth & Quotas: API keys / JWT + per-key quotas stored in Redis or DB.
- Observability: Structured logs (JSON), metrics exported in Prometheus format, traces.
- Deployment: Docker images, Helm charts or Kubernetes manifests, CI builds and image publishing.
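
The cache layer above keys results on prompt plus parameters. One way to derive such a key, sketched in Go, is to hash a deterministic JSON encoding of both. The `ChatParams` fields, the `CacheKey` function, and the `llmcache:` prefix are assumptions made for illustration; the actual parameter set and key format used by the gateway are not specified here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// ChatParams is a hypothetical set of generation parameters (assumption).
type ChatParams struct {
	Model       string  `json:"model"`
	Temperature float64 `json:"temperature"`
	MaxTokens   int     `json:"max_tokens"`
}

// CacheKey returns a Redis-friendly key derived from the prompt and params.
// Struct fields are marshaled in declaration order, so identical inputs
// always produce identical bytes and therefore the same key.
func CacheKey(prompt string, p ChatParams) (string, error) {
	payload, err := json.Marshal(struct {
		Prompt string     `json:"prompt"`
		Params ChatParams `json:"params"`
	}{Prompt: prompt, Params: p})
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(payload)
	// "llmcache:" is an illustrative namespace prefix, not a project convention.
	return "llmcache:" + hex.EncodeToString(sum[:]), nil
}
```

Hashing keeps keys at a fixed length regardless of prompt size. Any parameter that changes the model output (for example temperature) should be part of the hashed payload, otherwise distinct requests would share a cache entry.
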
llm-api-gateway/
├── README.md
├── LICENSE
├── .github/
│   └── workflows/ci.yml
├── infra/
│   ├── k8s/                  # k8s manifests or Helm charts
│   └── docker-compose.yml
├── gateway/                  # Go API gateway
│   ├── cmd/
│   │   └── server/
│   │       └── main.go
│   ├── internal/
│   │   ├── handlers/
│   │   ├── adapters/
│   │   ├── cache/
│   │   ├── ratelimit/
│   │   └── metrics/
│   ├── go.mod
│   └── Dockerfile
├── adapters/                 # per-provider adapters
│   ├── openai-adapter/
│   └── hf-adapter/
├── admin/                    # NestJS service for keys, dashboard, logs
│   ├── src/
│   ├── package.json
│   └── Dockerfile
└── tooling/
    └── tests/                # e2e test helpers