Declarative GitOps-based infrastructure stack for quantitative research and backtesting.
Provisions and manages a single-node Kubernetes cluster on Oracle Cloud Infrastructure (OCI) using MicroK8s and Argo CD. All workloads are managed declaratively via GitOps after a one-time bootstrap.
Designed for reproducible research infrastructure with explicit storage, secret management, and operational boundaries.
This repository defines a fully declarative infrastructure layer for a quantitative trading research platform.
It bootstraps:
- A single-node MicroK8s Kubernetes cluster
- GitOps management via Argo CD
- Integrated experiment tracking, monitoring, orchestration
- Explicit local storage model
- OCI-native secret integration
After bootstrap, all cluster state is managed through Git via Argo CD.
No imperative Kubernetes workflows are required.
Research infrastructure often suffers from:
- Manual Kubernetes management
- Secret sprawl
- Mixed imperative and declarative workflows
- Poor reproducibility
- Overengineered multi-node setups for small research workloads
This stack provides:
- GitOps-first cluster lifecycle
- Strict secret isolation via OCI Vault
- Explicit storage boundaries (boot vs scratch)
- Fully declarative application management
- Minimal single-node production-grade design
It enables structured research infrastructure without managed Kubernetes complexity.
The system is divided into two clear layers.
Bootstrap layer, installed once via the bootstrap script:
- MicroK8s
- Secrets Store CSI Driver
- OCI Secrets Store Provider (custom multi-arch image)
- Argo CD
- Scratch Block Volume formatting and mounting
GitOps layer, managed exclusively via Argo CD:
- PostgreSQL (MLflow metadata)
- MLflow
- Prometheus + Grafana
- Argo Workflows
- Scratch PersistentVolume
All workloads are defined declaratively in `apps/` and `argocd/`.
- MicroK8s (Kubernetes distribution)
- Argo CD (GitOps continuous delivery)
- PostgreSQL (metadata & experiment storage)
- MLflow (experiment tracking & model registry)
- Prometheus + Grafana (monitoring & observability)
- Argo Workflows (batch & pipeline execution)
- Oracle Cloud Infrastructure (compute, networking, secrets, storage)
```
apps/
  mlflow/
  postgres/
  monitoring/
  argo/
  scratch/
argocd/
  mlflow-app.yaml
  postgres-app.yaml
  monitoring-app.yaml
  argo-app.yaml
  scratch-app.yaml
infrastructure/
  oci-provider/
    provider.yaml
scripts/
  bootstrap-cluster.sh
```
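As a sketch of what one of the Argo CD entries might contain, `argocd/mlflow-app.yaml` could look roughly like the following (the repo URL and sync options are placeholders, not taken from this repository):

```yaml
# Illustrative Argo CD Application; repoURL and syncPolicy are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mlflow
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<org>/<repo>.git   # placeholder
    targetRevision: main
    path: apps/mlflow
  destination:
    server: https://kubernetes.default.svc
    namespace: mlflow
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift back to Git state
```

With `automated` sync enabled, Argo CD continuously reconciles the cluster against Git, which is what makes the "no imperative workflows" model work.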
- Ubuntu VM
- Attached Block Volume for scratch (`/dev/oracleoci/oraclevds`)
- OCI Instance Principal configured
- Repository cloned onto the VM
- OCI Vault with predefined secrets
Create the environment file:

```bash
cp .env.example .env
```

Configure:

```bash
VAULT_ID=ocid1.vault.oc1.eu-frankfurt-1.xxxxx
OCI_REGION=eu-frankfurt-1
```

Load the environment:

```bash
set -a
source .env
set +a
```

⚠ Bootstrap is one-shot only. Run it on a fresh VM or perform a full reset first; re-running on an existing cluster is not supported. All subsequent changes must occur via GitOps.
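For illustration, the effect of the `set -a` / `source` pattern (auto-exporting every variable a sourced file defines) can be seen with a throwaway file; the file path and contents below are placeholders:

```shell
# Write a throwaway env file (contents are placeholders)
cat > /tmp/demo.env <<'EOF'
OCI_REGION=eu-frankfurt-1
EOF

set -a                 # export every variable assigned from here on
source /tmp/demo.env
set +a                 # stop auto-exporting

# The variable is now visible to child processes, e.g. the bootstrap script:
sh -c 'echo "$OCI_REGION"'   # prints eu-frankfurt-1
```

Without `set -a`, sourcing the file would set the variables in the current shell only, and the bootstrap script would not see them.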
```bash
chmod +x scripts/*
./scripts/bootstrap-cluster.sh
```

The script installs:
- MicroK8s
- CSI Driver
- OCI Provider
- Argo CD
Deploys:
- PostgreSQL
- MLflow
- Monitoring
- Argo Workflows
- Scratch PV/PVC
Kubernetes services are exposed internally as NodePorts.
OCI network configuration:
- Only SSH (port 22) allowed externally
- All service ports blocked externally
- Access exclusively via SSH local port forwarding
Example SSH configuration:
```
Host vps
  HostName <IP>
  User ubuntu
  IdentityFile ~/.ssh/<key>
  LocalForward 30007 localhost:30007   # Grafana
  LocalForward 32120 localhost:32120   # Argo Workflows
  LocalForward 30090 localhost:30090   # Prometheus
  LocalForward 30500 localhost:30500   # MLflow
```

This enables development access while preventing public exposure.
Explicit separation between system storage and workload storage.
Boot volume:

- ~47 GB (minimum OCI size)
- Hosts:
  - Operating system
  - Kubernetes system data
  - PostgreSQL database

MLflow metadata persists in local PostgreSQL backed by the boot volume.

Scratch volume:

- 153 GB (≈142.5 GiB)
- Mounted at `/mnt/scratch`
Intended for:
- Backtesting data
- Research artifacts
- Large intermediate datasets
Exposed via PersistentVolume / PersistentVolumeClaim.
⚠ Single-node only (hostPath-based).
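A minimal sketch of how such a hostPath-backed PV/PVC pair is typically declared (the storage class, namespace, and size here are illustrative assumptions, not copied from `apps/scratch/`):

```yaml
# Illustrative hostPath PV/PVC; storageClassName, namespace and size are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: scratch-pv
spec:
  capacity:
    storage: 140Gi
  accessModes: [ReadWriteOnce]
  persistentVolumeReclaimPolicy: Retain   # keep data if the PVC is deleted
  storageClassName: scratch
  hostPath:
    path: /mnt/scratch
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-pvc
  namespace: scratch
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: scratch
  resources:
    requests:
      storage: 140Gi
```

`Retain` plus stable PV/PVC names is what allows the data on the Block Volume to survive a VM migration, as described below.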
Old VM:

```bash
sudo microk8s kubectl delete namespace scratch
sudo microk8s kubectl delete pv scratch-pv
sudo umount /mnt/scratch/
```

Then detach the volume.
New VM:
- Attach volume with same device name
- Run bootstrap
- Ensure PV/PVC names match
- Data reused automatically
Full reset:

```bash
sudo snap remove microk8s --purge
sudo rm -rf /var/snap/microk8s/
sudo rm -rf ~/.kube/
```

All sensitive configuration is stored in OCI Vault.
Requirements:
- OCI Vault must exist
- Secrets must be created in advance
- Secret names must match Helm/YAML references
Secrets retrieved using:
- Secrets Store CSI Driver
- OCI Provider (custom multi-arch image)
- Instance Principal authentication
Secrets defined via SecretProviderClass.
No secrets stored in Git.
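A SecretProviderClass for the OCI provider is shaped roughly like the following; the names, OCID, and exact `parameters` keys are illustrative assumptions (the schema depends on the provider version, so check its documentation):

```yaml
# Illustrative SecretProviderClass; parameter keys are assumptions.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: postgres-credentials    # illustrative name
  namespace: postgres
spec:
  provider: oci
  parameters:
    vaultId: ocid1.vault.oc1.eu-frankfurt-1.xxxxx   # placeholder OCID
    secrets: |
      - name: postgres-password   # must match the secret name in OCI Vault
```

A pod then mounts the secrets via a CSI volume referencing this class; with Instance Principal authentication, no OCI credentials ever appear in the manifest or in Git.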
Prometheus runs with ephemeral local storage by default.
- Metrics stored inside pod filesystem
- No PersistentVolume configured
- Data lost on pod restart or node reboot
Intended for lightweight research environments.
Because metrics are local, monitor disk usage:

```bash
df -h /
sudo du -h --max-depth=1 /var/snap/microk8s/common/var/
```

These commands track disk usage and prevent exhaustion.
OCI selected primarily for ARM free tier:
- 4 vCPU ARM VM
- 24 GB RAM
- ~200 GB free storage
Suitable for:
- Single-node Kubernetes research clusters
- Persistent external storage
- Zero-cost experimentation
Boot volume kept minimal; scratch volume handles data-heavy workloads.
- Modify YAML / Helm values in `apps/<component>/`
- Commit & push
- Argo CD auto-syncs

No manual `kubectl` required.
- No secrets in Git
- Public VM with OCI firewall (NSGs / Security Lists)
- Only SSH exposed
- All service ports closed externally
- Egress allowed
- Secrets in OCI Vault
- GitOps-first
- Single source of truth
- Declarative workflows only
- Multi-arch native
- Minimal but production-grade
Operational:
- MicroK8s bootstrap
- Argo CD GitOps layer
- OCI secrets integration
- PostgreSQL
- Scratch storage model
Experimental:
- Monitoring stack
- Argo Workflows pipelines
- Higher-level research workflows
Intended for:
- Quant research & backtesting
- GitOps Kubernetes experimentation
- Reproducible infrastructure setups
Not intended for multi-node production clusters or managed Kubernetes platforms.
MIT license.
Semantic versioning.
Initial public release: v0.1.0.