Security Architecture

Charon implements a defense-in-depth security model with multiple layers of protection for infrastructure services.

Security Principles

  1. VPN-First Architecture - All services default to VPN-only access
  2. Least Privilege - Minimal RBAC permissions for each component
  3. Secrets Isolation - Kubernetes Secrets for sensitive data
  4. TLS Everywhere - TLS on every externally reachable endpoint (see TLS Termination for internal traffic)
  5. Input Validation - Sanitize all user inputs and external data

Secret Management

File-Based Secrets (Development)

Never commit secrets to git:

# These files MUST be in .gitignore
.env
terraform.tfvars
*.tfvars
# Exception: example templates stay tracked
!*.tfvars.example

Proper file permissions:

# Restrict access to owner only
chmod 600 .env
chmod 600 terraform/terraform.tfvars

# Verify permissions
ls -la .env terraform/terraform.tfvars
# Should show: -rw------- (600)
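The same permission check can be automated, for example in a startup or CI script. The following is a minimal sketch; the helper name `is_owner_only` is hypothetical and not part of Charon.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # S_IRWXG and S_IRWXO cover every group and other permission bit.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A caller could refuse to start when `is_owner_only(".env")` returns `False`, which catches a file that was accidentally left world-readable.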

Template pattern:

Always provide example files without real secrets:

# terraform.tfvars.example
cloudflare_api_token = "your-cloudflare-token-here"
domain_name          = "example.org"

Kubernetes Secrets (Production)

All sensitive data in Kubernetes uses native Secrets:

apiVersion: v1
kind: Secret
metadata:
  name: service-credentials
  namespace: core
type: Opaque
stringData:
  username: "admin"
  password: "secure-password-here"

Accessing secrets in pods:

env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: service-credentials
        key: password
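On the application side, the injected variable is read from the process environment. The sketch below assumes a Python service and a hypothetical helper `read_required_env`; it fails fast when the variable is missing, so a misconfigured `secretKeyRef` surfaces at startup rather than at first use.

```python
import os
from typing import Mapping


def read_required_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = environ.get(name, "")
    if not value:
        # Only the variable name appears in the message, never a secret value.
        raise RuntimeError(f"required environment variable {name} is not set")
    return value
```

For example, `read_required_env("DB_PASSWORD")` would return the password injected by the manifest above.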

Secret Rotation

When to rotate:

  • On compromise or suspected exposure
  • On team member departure
  • Periodically (90 days recommended)
  • After git history exposure (even if repo is private)

What to rotate:

  • Database passwords
  • API tokens (GitHub, Cloudflare, OpenAI, etc.)
  • Admin passwords (FreeIPA, Grafana, etc.)
  • TLS certificates (if not automated)
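The periodic rule (90 days recommended) can be checked mechanically if the last rotation time is recorded somewhere. This is a sketch under that assumption; the function name `needs_rotation` and the way the timestamp is stored are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_rotation(last_rotated: datetime, now: datetime, max_age_days: int = 90) -> bool:
    """Return True once a secret is at least max_age_days old."""
    return now - last_rotated >= timedelta(days=max_age_days)
```

Passing timezone-aware datetimes for both arguments (for example `datetime.now(timezone.utc)`) avoids mixing naive and aware values.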

Input Validation

Code Injection Prevention

Problem: Unsanitized inputs in f-strings or subprocess calls can lead to code injection.

Bad Example:

# VULNERABLE - username can contain malicious code
script = f"""
user = User.objects.create(username='{username}')
"""
subprocess.run(["/usr/bin/python", "-c", script])

Good Example:

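import re
import subprocess

# Validate username format (fullmatch also rejects a trailing newline)
if not re.fullmatch(r'[a-zA-Z0-9_-]+', username):
    raise ValueError(f"Invalid username: {username!r}")

# Pass the value as a separate argument instead of embedding it in code.
# With an argument list and no shell=True, no shell parses the value, so
# shlex.quote() is only needed when building a shell command string.
subprocess.run([
    "/usr/bin/python",
    "/path/to/script.py",
    "--username", username  # Passed as argument, not embedded
], check=True)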
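The receiving script should validate the argument again rather than trust its caller. A minimal sketch of what the script behind `/path/to/script.py` might look like, assuming it uses `argparse`; the names `username_arg` and `parse_args` are illustrative only.

```python
import argparse
import re

USERNAME_RE = re.compile(r"[a-zA-Z0-9_-]+")


def username_arg(value: str) -> str:
    """argparse type callback: accept only the allowed username characters."""
    if not USERNAME_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"invalid username: {value!r}")
    return value


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; argparse exits with status 2 on invalid input."""
    parser = argparse.ArgumentParser(description="Create a user (illustrative)")
    parser.add_argument("--username", required=True, type=username_arg)
    return parser.parse_args(argv)
```

Because the username arrives as data in `args.username`, it is never interpreted as Python source, whatever characters it contains.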

Validation patterns:

# Apply these with re.fullmatch(); with re.match(), '$' also accepts a trailing newline

# Username: alphanumeric, underscore, hyphen
USERNAME_PATTERN = r'^[a-zA-Z0-9_-]+$'

# Email
EMAIL_PATTERN = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

# Domain
DOMAIN_PATTERN = r'^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$'

# IPv4
IPV4_PATTERN = r'^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
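How the patterns are applied matters as much as the patterns themselves. The sketch below shows the username pattern used with `re.fullmatch`, and the standard-library `ipaddress` module as an alternative to a hand-written IPv4 regex. The function names are hypothetical.

```python
import ipaddress
import re

USERNAME_PATTERN = r'^[a-zA-Z0-9_-]+$'


def is_valid_username(value: str) -> bool:
    # fullmatch rejects "admin\n"; re.match with a trailing '$' would accept it.
    return re.fullmatch(USERNAME_PATTERN, value) is not None


def is_valid_ipv4(value: str) -> bool:
    # Let the standard library parse the address instead of a regex.
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True
```

The difference is easy to miss in review: `re.match(USERNAME_PATTERN, "admin\n")` succeeds, which can let a newline through into logs or configuration files.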

SQL Injection Prevention

Always use parameterized queries:

# BAD
cursor.execute(f"SELECT * FROM users WHERE username = '{username}'")

# GOOD
cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
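A self-contained way to see the effect is an in-memory SQLite database. This sketch uses the standard-library `sqlite3` module, whose placeholder is `?` rather than the `%s` style shown above (which drivers such as psycopg2 use). The table layout and the `find_user` helper are assumptions for illustration.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    # The driver binds the value separately from the SQL text,
    # so quotes inside username cannot change the query structure.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
```

With this setup, a classic payload such as `' OR '1'='1` is compared as a literal string and matches no rows.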

Network Security

VPN-First Architecture

Headscale VPN: All services default to VPN-only access via Headscale/Tailscale.

┌─────────────────────────────────────────┐
│  Internet (Public)                      │
└────────────┬────────────────────────────┘
             │
             │ Encrypted WireGuard Tunnel
             ↓
    ┌────────────────┐
    │   Headscale    │  VPN Control Server
    │  (core ns)     │
    └────────┬───────┘
             │
    ┌────────┴───────────────────┐
    │  Service Mesh (VPN Only)   │
    │                            │
    │  ┌──────┐  ┌───────┐       │
    │  │NetBox│  │Grafana│ ...   │
    │  └──────┘  └───────┘       │
    └────────────────────────────┘

Benefits:

  • Services invisible to public internet
  • Encrypted peer-to-peer connections
  • Centralized access control
  • MagicDNS for service discovery

TLS Termination

Architecture:

Internet → Ingress → NGINX-TLS Sidecar → Service Container
   HTTPS      │           HTTPS              HTTP
              │
              └→ Certificate from cert-manager

TLS handled by:

  • cert-manager - Automated Let's Encrypt certificates
  • NGINX sidecar - TLS termination in each pod
  • Service mesh - Internal pod-to-pod traffic can use plain HTTP (within the cluster network)

Why internal HTTP is acceptable:

# ArgoCD example - runs with --insecure internally
args: ["--insecure"]  # NGINX handles TLS

  • Traffic never leaves cluster network
  • Kubernetes network policies enforce isolation
  • NGINX sidecar encrypts external traffic
  • Reduces complexity and certificate management burden

When to use end-to-end TLS:

  • Multi-cluster deployments
  • Untrusted network environments
  • Compliance requirements (HIPAA, PCI-DSS)

Ingress Security

IP Allowlisting:

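Only services that should be public get an Ingress. Where an Ingress must stay restricted, limit the source ranges (here the private 10.0.0.0/8 range and the 100.64.0.0/10 range used by Tailscale/Headscale):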

metadata:
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: "10.0.0.0/8,100.64.0.0/10"
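To reason about which clients the annotation admits, the same comma-separated CIDR list can be evaluated with the standard-library `ipaddress` module. This is only an illustration of the matching logic, not how the ingress controller implements it; `is_allowed` is a hypothetical helper.

```python
import ipaddress

ALLOWED_RANGES = "10.0.0.0/8,100.64.0.0/10"  # same value as the annotation


def is_allowed(client_ip: str, ranges: str = ALLOWED_RANGES) -> bool:
    """Return True if client_ip falls inside any comma-separated CIDR range."""
    address = ipaddress.ip_address(client_ip)
    networks = (ipaddress.ip_network(cidr.strip()) for cidr in ranges.split(","))
    return any(address in network for network in networks)
```

Note that 100.64.0.0/10 ends at 100.127.255.255, so an address such as 100.128.0.1 is outside the allowlist.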

Default access:

  • Most services: VPN-only (no public Ingress)
  • Public services: Ingress with Cloudflare proxy (DDoS protection)
  • Admin interfaces: VPN + IP allowlist

RBAC & Least Privilege

Service Accounts

Each service gets minimal permissions:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: netbox
  namespace: core
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: netbox-headscale-exec
  namespace: core
rules:
  - apiGroups: [""]
    resources: ["pods/exec"]
    resourceNames: ["headscale-0"]  # Only specific pod
    verbs: ["create"]

Principles:

  • One ServiceAccount per service
  • Roles scoped to namespace
  • Specific resource names when possible
  • No cluster-wide permissions unless required
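These principles can be linted before a manifest is applied. The sketch below assumes the Role has already been parsed into a Python dict (for example by a YAML loader) and flags wildcard rules and rules without `resourceNames`. The function name and finding messages are hypothetical.

```python
from typing import List


def overbroad_rules(role: dict) -> List[str]:
    """Return human-readable findings for rules that are broader than necessary."""
    findings = []
    for index, rule in enumerate(role.get("rules", [])):
        for field in ("apiGroups", "resources", "verbs"):
            if "*" in rule.get(field, []):
                findings.append(f"rule {index}: wildcard in {field}")
        if not rule.get("resourceNames"):
            findings.append(f"rule {index}: no resourceNames restriction")
    return findings
```

The `netbox-headscale-exec` Role above produces no findings, since it names one resource, one pod, and one verb. Some resources cannot be restricted by name, so the last finding is a prompt for review rather than a hard failure.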

Privileged Containers

When required:

  • FreeIPA: Needs CAP_SYS_ADMIN for systemd
  • Buildah image builders: Need CAP_SETUID for rootless builds
  • Headscale: Network configuration

Security measures:

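securityContext:
  privileged: false  # privileged mode grants every capability; keep it off when a capability list suffices
  capabilities:
    drop:
      - ALL        # Drop everything first
    add:
      - SYS_ADMIN  # Then add back only the specific capability required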

Avoid when possible:

  • Use rootless containers
  • Drop all capabilities by default
  • Add only specific capabilities needed
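The "avoid when possible" guidance can be checked mechanically as well. A minimal sketch, assuming a container `securityContext` parsed into a dict; `security_context_findings` is a hypothetical helper, not an existing tool.

```python
from typing import List


def security_context_findings(ctx: dict) -> List[str]:
    """Flag settings that conflict with the drop-all, add-specific pattern."""
    findings = []
    if ctx.get("privileged", False):
        findings.append("privileged mode grants every capability")
    capabilities = ctx.get("capabilities", {})
    if "ALL" not in capabilities.get("drop", []):
        findings.append("capabilities.drop does not include ALL")
    return findings
```

An empty result means the context drops all capabilities and does not run privileged; any capabilities listed under `add` still deserve a justification in review.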

Encryption

At Rest

Kubernetes Secrets:

  • Stored encrypted in etcd only if etcd encryption at rest is enabled
  • When mounted as volumes, backed by tmpfs in pods (memory only, not disk)

Persistent Volumes:

  • Block storage: Provider encryption (Linode, AWS, etc.)
  • Application-level: Database encryption (e.g., PostgreSQL TDE, where the PostgreSQL distribution or an extension provides it)

In Transit

External:

  • TLS 1.2+ for all ingress traffic
  • Let's Encrypt certificates (automated renewal)
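Clients that call these endpoints can enforce the same TLS floor on their side. A short sketch using Python's standard `ssl` module; the function name `make_client_context` is an assumption.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_client_context())`.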

Internal:

  • Pod-to-pod: Kubernetes internal network (encrypted only if the CNI plugin supports encryption and it is enabled)
  • VPN mesh: WireGuard encryption for all VPN traffic

Security Scanning

Pre-commit Hooks

Automated security checks before commit:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/bridgecrewio/checkov
    rev: <checkov-release-tag>  # required by pre-commit; pin to a released tag
    hooks:
      - id: checkov
        args: ['--quiet', '--compact']

Checks:

  • Terraform security issues
  • Kubernetes misconfigurations
  • Secret exposure detection
  • Container security

Container Image Scanning

Best practices:

  • Use official base images
  • Pin image versions (avoid :latest in production)
  • Regular updates for security patches
  • Scan with Trivy or similar tools
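Version pinning is simple to verify in CI. The sketch below treats an image reference as pinned when it carries a digest or a tag other than `latest`. The helper name `is_pinned` is hypothetical, and the parsing is deliberately simplified.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference has a digest or a tag other than 'latest'."""
    if "@sha256:" in image:
        return True  # digest-pinned reference
    last_component = image.rsplit("/", 1)[-1]  # ignore a registry host:port prefix
    if ":" not in last_component:
        return False  # no tag means the runtime defaults to :latest
    tag = last_component.rsplit(":", 1)[1]
    return tag != "latest"
```

Splitting on the last `/` first keeps a registry port (as in `localhost:5000/nginx`) from being mistaken for a tag.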

Threat Model

Assumed Threats

  1. External Attackers

    • Cannot access services without VPN
    • Must compromise Headscale first
    • Public ingress is minimal attack surface
  2. Compromised Node

    • Namespace isolation limits blast radius
    • RBAC limits lateral movement through the Kubernetes API
    • Secrets isolated per service
  3. Insider Threats

    • Audit logging (planned)
    • RBAC limits access
    • Secret rotation on team changes

Out of Scope

  • Physical security of nodes
  • Kubernetes control plane compromise (managed provider responsibility)
  • Side-channel attacks

Best Practices Checklist

Secrets:

  • .env and terraform.tfvars in .gitignore
  • File permissions set to 600
  • No secrets committed to git history
  • Secrets rotated on exposure

Network:

  • Services default to VPN-only
  • TLS certificates automated
  • IP allowlisting on public ingress
  • Network policies defined

RBAC:

  • ServiceAccount per service
  • Minimal role permissions
  • Namespace isolation enforced
  • No cluster-admin unless required

Code:

  • Input validation on all external data
  • Parameterized queries only
  • No f-string injection risks
  • Pre-commit hooks enabled

Images:

  • Official base images only
  • Versions pinned
  • Regular security updates
  • Vulnerability scanning

Security Incident Response

On credential exposure:

  1. Rotate compromised credentials immediately
  2. Audit access logs for unauthorized use
  3. Review git history for committed secrets
  4. Purge git history if secrets were committed
  5. Force password reset for affected users

On container compromise:

  1. Delete affected pod immediately
  2. Review audit logs
  3. Check for lateral movement
  4. Rotate service credentials
  5. Update to patched image version
