Commit d30517b
docs(fern): Add Kata Containers tutorial & examples
Introduce a new Kata Containers tutorial and add example artifacts to support running OpenShell sandboxes inside Kata VMs. Changes include:

- A Kata Containers card in the tutorials index.
- A full tutorial (docs/tutorials/kata-containers.mdx).
- Example sandbox Dockerfiles for Claude and Python agents, plus a README for the examples.
- A Python SDK script to create sandboxes with a runtimeClass.
- A RuntimeClass manifest.
- Several policy YAMLs (claude-code, minimal, L7 GitHub).
- A DaemonSet to install the supervisor binary on cluster nodes.

These additions provide step-by-step instructions, sample images, network policies, and deployment manifests for hardware-level VM isolation on Kubernetes.

Signed-off-by: smarunich <smarunich@nvidia.com>
1 parent 355d845 commit d30517b

File tree

11 files changed: +808 −0 lines

docs/tutorials/index.mdx

Lines changed: 5 additions & 0 deletions
@@ -31,4 +31,9 @@ Route inference through Ollama using cloud-hosted or local models, and verify it

   Route inference to a local LM Studio server via the OpenAI or Anthropic compatible APIs.
   </Card>
+
+  <Card title="Kata Containers" href="/tutorials/kata-containers">
+    Run sandboxes inside Kata Container VMs on Kubernetes for hardware-level isolation on top of OpenShell policy enforcement.
+  </Card>
 </Cards>

docs/tutorials/kata-containers.mdx

Lines changed: 314 additions & 0 deletions
@@ -0,0 +1,314 @@
---
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
title: "Run Sandboxes in Kata Container VMs"
sidebar-title: "Kata Containers"
slug: "tutorials/kata-containers"
description: "Deploy OpenShell sandboxes inside Kata Container VMs on Kubernetes for hardware-level isolation, with step-by-step instructions for Kata setup, gateway deployment, image building, and agent execution."
keywords: "Generative AI, Cybersecurity, Tutorial, Kata Containers, Sandbox, VM, Kubernetes, Isolation"
---

Run OpenShell sandboxes inside Kata Container VMs on Kubernetes. Each sandbox pod runs in its own lightweight VM with a dedicated guest kernel, adding hardware-level isolation on top of OpenShell's Landlock, seccomp, and network namespace enforcement.

After completing this tutorial you will have:

- Kata Containers installed and verified on your Kubernetes cluster.
- The OpenShell gateway deployed via Helm with the supervisor binary on every node.
- A custom sandbox image with your AI agent baked in.
- A running sandbox inside a Kata VM with network policy enforcement.

## Prerequisites

- A Kubernetes cluster (v1.26+) with admin access.
- Nodes with hardware virtualization support (Intel VT-x / AMD-V).
- `kubectl`, `helm` v3, and Docker installed locally.
- OpenShell CLI installed. See the [Quickstart](/get-started/quickstart) if you have not installed it yet.
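You can check the hardware virtualization prerequisite on a node before installing anything. This is a quick sketch using standard Linux tooling, not part of the tutorial's example files:

```shell
# Count CPU threads advertising hardware virtualization flags:
# 'vmx' = Intel VT-x, 'svm' = AMD-V. A count of 0 means Kata's
# VM-based runtime cannot start guests on this node.
count=$(grep -Ec '(vmx|svm)' /proc/cpuinfo || true)
echo "virtualization-capable CPU threads: ${count}"
```

Run it on each node (or in a privileged debug pod); any non-zero count means the node qualifies.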
## Architecture

Three layers of isolation protect your infrastructure:

1. **Kata VM** -- each sandbox pod runs inside a QEMU or Cloud Hypervisor microVM with its own guest kernel.
2. **OpenShell sandbox** -- inside the VM, the supervisor enforces Landlock filesystem rules, seccomp-BPF syscall filters, and a dedicated network namespace.
3. **Egress proxy and OPA** -- all outbound traffic passes through an HTTP CONNECT proxy that evaluates per-binary, per-endpoint network policy via an embedded OPA/Rego engine.

<Steps toc={true}>

## Install Kata Containers

kata-deploy is a DaemonSet that installs the Kata binaries and configures containerd on every node.

```shell
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-rbac/base/kata-rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml
kubectl -n kube-system wait --for=condition=Ready pod -l name=kata-deploy --timeout=600s
```

Verify the RuntimeClass exists:

```shell
kubectl get runtimeclass
```

If `kata-containers` is not listed, create it manually. An example manifest is at [examples/kata-containers/kata-runtimeclass.yaml](https://github.com/NVIDIA/OpenShell/blob/main/examples/kata-containers/kata-runtimeclass.yaml):

```shell
kubectl apply -f examples/kata-containers/kata-runtimeclass.yaml
```
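For reference, a minimal RuntimeClass manifest has the shape below. This is a sketch, not the shipped example file: the `handler` value must match the runtime name kata-deploy registered in containerd on your nodes, and the `overhead` figures are placeholders based on the memory range discussed later in this tutorial.

```yaml
# Sketch of a minimal RuntimeClass; check your containerd config for
# the actual handler name before applying anything like this.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-containers
handler: kata
# Optional: let the scheduler account for the VM's own footprint.
overhead:
  podFixed:
    memory: "160Mi"
    cpu: "250m"
```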
Confirm Kata works by running a test pod:

```shell
kubectl run kata-test --image=busybox --restart=Never \
  --overrides='{"spec":{"runtimeClassName":"kata-containers"}}' \
  -- uname -a
# The pod runs a single command and exits, so wait on its phase
# rather than the Ready condition (which it never reaches).
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/kata-test --timeout=60s
kubectl logs kata-test
kubectl delete pod kata-test
```

If the reported kernel version differs from your host's, Kata is working: the pod booted its own guest kernel.

## Deploy the Supervisor Binary

The Kubernetes driver side-loads the `openshell-sandbox` supervisor from `/opt/openshell/bin/openshell-sandbox` on the node via a hostPath volume. On the built-in k3s cluster this binary is already present. On your own cluster, deploy the installer DaemonSet:

```shell
kubectl create namespace openshell
kubectl apply -f examples/kata-containers/supervisor-daemonset.yaml
```
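If you want to understand what such an installer does before applying it, the general shape is a DaemonSet whose init container copies the binary onto each node's hostPath directory. This is an illustrative sketch only, not the shipped manifest; the image name and binary path inside the image are placeholders:

```yaml
# Illustrative sketch of a supervisor-installer DaemonSet.
# The real manifest is examples/kata-containers/supervisor-daemonset.yaml;
# the installer image below is a made-up placeholder.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: openshell-supervisor-installer
  namespace: openshell
spec:
  selector:
    matchLabels:
      app: openshell-supervisor-installer
  template:
    metadata:
      labels:
        app: openshell-supervisor-installer
    spec:
      initContainers:
        - name: install
          image: example.com/openshell-supervisor:latest  # placeholder
          command: ["sh", "-c",
            "cp /openshell-sandbox /host-bin/ && chmod 0755 /host-bin/openshell-sandbox"]
          volumeMounts:
            - name: host-bin
              mountPath: /host-bin
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
      volumes:
        - name: host-bin
          hostPath:
            path: /opt/openshell/bin
            type: DirectoryOrCreate
```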
Wait for it to roll out:

```shell
kubectl -n openshell rollout status daemonset/openshell-supervisor-installer
```

## Deploy the Agent-Sandbox CRD

Install the Sandbox Custom Resource Definition and its controller:

```shell
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/OpenShell/main/deploy/kube/manifests/agent-sandbox.yaml
kubectl get crd sandboxes.agents.x-k8s.io
kubectl -n agent-sandbox-system get pods
```

## Create Secrets

The gateway requires mTLS certificates and an SSH handshake secret. The simplest approach is to bootstrap a local gateway, extract the PKI material, then create Kubernetes secrets:

```shell
openshell gateway start
GATEWAY_DIR=$(ls -d ~/.config/openshell/gateways/*/mtls | head -1)

kubectl -n openshell create secret tls openshell-server-tls \
  --cert="$GATEWAY_DIR/../server.crt" \
  --key="$GATEWAY_DIR/../server.key"

kubectl -n openshell create secret generic openshell-server-client-ca \
  --from-file=ca.crt="$GATEWAY_DIR/ca.crt"

kubectl -n openshell create secret tls openshell-client-tls \
  --cert="$GATEWAY_DIR/client.crt" \
  --key="$GATEWAY_DIR/client.key"

kubectl -n openshell create secret generic openshell-ssh-handshake \
  --from-literal=secret=$(openssl rand -hex 32)

openshell gateway stop
```
## Install the Gateway via Helm

```shell
helm install openshell deploy/helm/openshell/ \
  --namespace openshell \
  --set server.sandboxNamespace=openshell \
  --set server.sandboxImage=ghcr.io/nvidia/openshell-community/sandboxes/base:latest \
  --set server.grpcEndpoint=https://openshell.openshell.svc.cluster.local:8080 \
  --set server.sshGatewayHost=<YOUR_EXTERNAL_HOST> \
  --set server.sshGatewayPort=30051
```

Replace `<YOUR_EXTERNAL_HOST>` with the externally reachable address of your cluster's NodePort or load balancer.

Wait for the gateway:

```shell
kubectl -n openshell rollout status statefulset/openshell --timeout=300s
```

Register it with the CLI:

```shell
openshell gateway add --name kata-cluster --endpoint https://<YOUR_EXTERNAL_HOST>:30051
```
## Build a Sandbox Image

Your image provides the agent and its dependencies. OpenShell replaces the entrypoint at runtime with its supervisor, so pass the agent start command after `--` on the CLI. Key requirements:

- A standard Linux base image (not distroless or `FROM scratch`).
- `iproute2` installed (required for network namespace isolation).
- `iptables` installed (recommended for bypass detection).
- A `sandbox` user with uid/gid 1000.

Example for Claude Code (see [Dockerfile.claude-code](https://github.com/NVIDIA/OpenShell/blob/main/examples/kata-containers/Dockerfile.claude-code)):

```dockerfile
FROM node:22-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    curl iproute2 iptables git openssh-client ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN npm install -g @anthropic-ai/claude-code

RUN groupadd -g 1000 sandbox && \
    useradd -m -u 1000 -g sandbox -s /bin/bash sandbox

WORKDIR /sandbox
```

Build and push to a registry your cluster can reach:

```shell
docker build -t myregistry.com/claude-sandbox:latest \
  -f examples/kata-containers/Dockerfile.claude-code .
docker push myregistry.com/claude-sandbox:latest
```
## Configure a Provider

Providers inject API keys into sandboxes. Create one from your local environment:

```shell
export ANTHROPIC_API_KEY=sk-ant-...
openshell provider create --name claude --type claude --from-existing
openshell provider list
```

## Create a Sandbox with Kata

The `runtime_class_name` field is fully supported in the gRPC API and Kubernetes driver but is not yet exposed as a CLI flag. Use the Python SDK script from the example:

```shell
uv run examples/kata-containers/create-kata-sandbox.py \
  --name my-claude \
  --image myregistry.com/claude-sandbox:latest \
  --runtime-class kata-containers \
  --provider claude
```

Or create via the CLI and patch afterward:

```shell
openshell sandbox create --name my-claude \
  --from myregistry.com/claude-sandbox:latest \
  --provider claude \
  --policy examples/kata-containers/policy-claude-code.yaml

kubectl -n openshell patch sandbox my-claude --type=merge -p '{
  "spec": {
    "podTemplate": {
      "spec": {
        "runtimeClassName": "kata-containers"
      }
    }
  }
}'
```
## Apply a Network Policy

Apply the Claude Code policy, which allows Anthropic, GitHub, npm, and PyPI:

```shell
openshell policy set my-claude \
  --policy examples/kata-containers/policy-claude-code.yaml \
  --wait
```

Verify:

```shell
openshell policy get my-claude --full
```
## Connect and Run Your Agent

```shell
openshell sandbox connect my-claude
```

Inside the sandbox, start Claude Code:

```shell
claude
```

## Verify Kata Isolation

Confirm the pod is running with the Kata runtime:

```shell
kubectl -n openshell get pods -l sandbox=my-claude \
  -o jsonpath='{.items[0].spec.runtimeClassName}'
```

The output should be `kata-containers`.

Check the guest kernel version (it should differ from the host's):

```shell
openshell sandbox exec my-claude -- uname -r
```

Verify OpenShell sandbox isolation inside the VM:

```shell
openshell sandbox exec my-claude -- ip netns list
openshell sandbox exec my-claude -- ss -tlnp | grep 3128
openshell sandbox exec my-claude -- touch /usr/test-file
```

The network namespace should be listed, the proxy should be listening on port 3128, and the write to `/usr` should fail with "Permission denied".

</Steps>
## Kata-Specific Considerations

### Guest Kernel Requirements

The supervisor requires Landlock (ABI v2, kernel 5.19+), seccomp-BPF, network namespaces, veth, and iptables inside the VM. Most Kata guest kernels at 5.15+ include these. If Landlock is unavailable, set `landlock.compatibility: best_effort` in the policy.
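As a sketch, that key sits in the policy file like this; only the `landlock` block is taken from the text above, and the rest of your policy YAML stays as it is:

```yaml
# Policy fragment: tolerate guest kernels without full Landlock ABI v2.
# best_effort applies whatever Landlock support the kernel offers
# rather than requiring the full feature set.
landlock:
  compatibility: best_effort
```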
### hostPath Volume Passthrough

The supervisor binary is injected via a hostPath volume. Kata passes hostPath volumes into the VM via virtiofs or 9p, which works with standard Kata configurations. If your setup restricts host filesystem access, allow `/opt/openshell/bin` in the Kata `configuration.toml`.

### Container Capabilities

The OpenShell K8s driver automatically requests `SYS_ADMIN`, `NET_ADMIN`, `SYS_PTRACE`, and `SYSLOG`. Inside a Kata VM these capabilities are scoped to the guest kernel, not the host.
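You do not configure these capabilities yourself; the driver adds them to the generated pod spec. For auditing purposes, the equivalent container-level securityContext fragment would look roughly like this (a sketch of standard Kubernetes syntax using the capability names listed above):

```yaml
# Sketch: what the driver-requested capabilities look like in a pod spec.
# Generated automatically by the OpenShell K8s driver; shown for auditing.
securityContext:
  capabilities:
    add:
      - SYS_ADMIN
      - NET_ADMIN
      - SYS_PTRACE
      - SYSLOG
```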
### Performance

Kata adds 2-5 seconds of VM boot time. Runtime overhead is minimal for IO-bound AI agent workloads. Account for the base VM memory overhead (128-256 MB) in pod resource requests.

## Cleanup

```shell
openshell sandbox delete my-claude
openshell provider delete claude
kubectl delete -f examples/kata-containers/supervisor-daemonset.yaml
helm uninstall openshell -n openshell
kubectl delete -f https://raw.githubusercontent.com/NVIDIA/OpenShell/main/deploy/kube/manifests/agent-sandbox.yaml
kubectl delete namespace openshell
```

## Next Steps

- Explore the [example policies](https://github.com/NVIDIA/OpenShell/tree/main/examples/kata-containers) for minimal, L7, and full agent configurations.
- Add more providers for multi-agent setups. See [Manage Providers](/sandboxes/manage-providers).
- Configure [Inference Routing](/inference/about) to route model requests through local or remote LLM backends.
- Review the [Policy Schema Reference](/reference/policy-schema) for the full YAML specification.
examples/kata-containers/Dockerfile.claude-code

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# Sandbox image for Claude Code.
# Build: docker build -t claude-sandbox:latest -f Dockerfile.claude-code .
# Usage: openshell sandbox create --from claude-sandbox:latest -- claude

FROM node:22-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    curl iproute2 iptables git openssh-client ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN npm install -g @anthropic-ai/claude-code

RUN groupadd -g 1000 sandbox && \
    useradd -m -u 1000 -g sandbox -s /bin/bash sandbox

WORKDIR /sandbox
examples/kata-containers/Dockerfile.python-agent

Lines changed: 20 additions & 0 deletions

@@ -0,0 +1,20 @@
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# Sandbox image for Python-based AI agents (Hermes, custom agents, etc.).
# Build: docker build -t python-agent-sandbox:latest -f Dockerfile.python-agent .
# Usage: openshell sandbox create --from python-agent-sandbox:latest -- python -m my_agent

FROM python:3.13-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    curl iproute2 iptables git openssh-client ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install your agent here. Replace with your agent's package name.
# RUN pip install --no-cache-dir hermes-ai

RUN groupadd -g 1000 sandbox && \
    useradd -m -u 1000 -g sandbox -s /bin/bash sandbox

WORKDIR /sandbox

0 commit comments

Comments
 (0)