@@ -12,29 +12,45 @@ This is a Kubernetes GitOps repository for a personal homelab cluster managed wi
1212- **GitOps**: FluxCD with Flux Operator for declarative cluster management
1313- **Container Runtime**: containerd
1414- **Networking**: Cilium CNI with Istio service mesh
15- - **Storage**: OpenEBS for container-attached storage
16- - **Monitoring**: Prometheus, Grafana, Loki, Jaeger for observability
17- - **Security**: Kyverno for policy management, Falco for runtime security
15+ - **Storage**: Rook-Ceph, OpenEBS, democratic-csi for container-attached storage
16+ - **Monitoring**: Prometheus, Grafana, Loki, Jaeger, Thanos for observability
17+ - **Security**: Kyverno, OPA Gatekeeper for policy management, Falco & Tetragon for runtime security
1818- **Load Balancing**: MetalLB for bare metal load balancing
19+ - **Chaos Engineering**: Litmus for chaos testing
1920
2021## Directory Structure
2122
2223```
23- ├── kubernetes/  # Kubernetes manifests and configurations
24- │   ├── apps/  # Application deployments (base + overlays)
25- │   ├── bootstrap/  # Initial cluster bootstrap configuration
26- │   ├── clusters/  # Per-cluster configurations
27- │   ├── components/  # Shared components and alerts
28- │   └── tenants/  # Multi-tenant configurations
29- ├── talos/  # Talos Linux configuration files
30- │   ├── generated/  # Generated Talos configs (encrypted)
31- │   ├── integrations/  # Integration configurations
32- │   └── patches/  # Talos configuration patches
33- ├── terraform/  # Infrastructure as Code
34- │   ├── cloudflare/  # Cloudflare DNS/CDN configuration
35- │   └── gcp/  # Google Cloud Platform resources
36- ├── .taskfiles/  # Task automation definitions
37- └── docs/  # Documentation
24+ ├── kubernetes/  # Kubernetes manifests and configurations
25+ │   ├── apps/
26+ │   │   ├── base/  # Base application configurations (DRY principle)
27+ │   │   │   └── [system-name]/  # e.g., observability, kube-system, home-system
28+ │   │   │       ├── [app-name]/
29+ │   │   │       │   ├── app/  # HelmRelease, OCIRepository, secrets, values
30+ │   │   │       │   └── ks.yaml  # Flux Kustomization with dependencies
31+ │   │   │       ├── namespace.yaml
32+ │   │   │       └── kustomization.yaml
33+ │   │   └── overlays/
34+ │   │       └── cluster-00/  # Cluster-specific overrides
35+ │   ├── bootstrap/
36+ │   │   └── helmfile.yaml  # Bootstrap Flux Operator and dependencies
37+ │   ├── clusters/
38+ │   │   └── cluster-00/
39+ │   │       ├── flux-system/  # Flux Operator and FluxInstance configs
40+ │   │       ├── secrets/  # Cluster secrets (SOPS encrypted)
41+ │   │       └── ks.yaml  # Root Kustomization
42+ │   ├── components/
43+ │   │   └── common/alerts/  # Shared monitoring alerts
44+ │   └── tenants/  # Multi-tenant configurations
45+ ├── talos/  # Talos Linux configuration files
46+ │   ├── generated/  # Generated Talos configs (encrypted)
47+ │   ├── integrations/  # Cilium, cert-approver integrations
48+ │   └── patches/  # iSCSI, metrics patches
49+ ├── terraform/  # Infrastructure as Code
50+ │   ├── cloudflare/  # Cloudflare DNS/CDN configuration
51+ │   └── gcp/  # GCP KMS, Thanos storage, Velero backups
52+ ├── .taskfiles/  # Task automation definitions
53+ └── docs/  # Documentation
3854```
3955
4056## Common Commands
@@ -43,17 +59,29 @@ This is a Kubernetes GitOps repository for a personal homelab cluster managed wi
4359The repository uses [Task](https://taskfile.dev) for automation. All commands should be run via `task`:
4460
4561``` bash
46- # Core FluxCD operations
47- task flux:bootstrap # Bootstrap FluxCD in the cluster
48- task flux:secrets # Install cluster secrets and configs
62+ # FluxCD Operations
63+ task flux:bootstrap # Bootstrap Flux Operator via Helmfile
64+ task flux:secrets # Install cluster secrets (SOPS decrypt + apply)
65+ task fluxcd:bootstrap # Alternative bootstrap path
66+ task fluxcd:diff # Preview FluxCD operator changes
4967
50- # Talos operations
51- task talos:config # Decrypt and load Talos config
68+ # Talos Operations
69+ task talos:config # Decrypt and load talosconfig to ~/.talos/config
70+
71+ # Core Operations
72+ task core:gpg # Import SOPS PGP keys
73+ task core:lint # Run yamllint
5274
5375# View available tasks
5476task --list
5577```
5678
79+ **Important Variables:**
80+ - `CLUSTER`: cluster-00 (default cluster ID)
81+ - `GITHUB_USER`: xunholy
82+ - `GITHUB_REPO`: k8s-gitops
83+ - `GITHUB_BRANCH`: main
84+
5785### Pre-commit Hooks
5886The repository uses pre-commit for code quality:
5987``` bash
@@ -67,72 +95,200 @@ Active hooks include:
6795- Trailing whitespace and EOF fixes
6896
6997### Secret Management
70- Secrets are encrypted using [SOPS](https://github.com/mozilla/sops):
98+ Secrets are encrypted using [SOPS](https://github.com/mozilla/sops) with dual encryption (PGP + GCP KMS):
7199``` bash
72- # Decrypt secrets (requires proper age key setup)
73- sops -d path/to/encrypted.yaml
100+ # Edit encrypted files (automatically decrypts/encrypts)
101+ sops path/to/file.enc.yaml
74102
75- # Edit encrypted files
76- sops path/to/encrypted.yaml
103+ # Decrypt for viewing only
104+ sops -d path/to/file.enc.yaml
77105```
78106
107+ **SOPS Configuration:**
108+ - **PGP Key**: `0635B8D34037A9453003FB7B93CAA682FF4C9014`
109+ - **Age Key**: `age19gj66fq5v2veu940ftyj4pkw0w5tgxgddlyqnd00pnjzyndevurqx70g4t`
110+ - **GCP KMS**: Used for stored PGP keys
111+ - Encrypted files use the `.enc.yaml` or `.enc.age.yaml` suffix
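
The suffix convention above maps naturally onto SOPS creation rules. The following is a minimal `.sops.yaml` sketch, assuming files are routed to a key by suffix and that only the `data`/`stringData` fields of Secrets are encrypted. Both are assumptions: the repository's actual rules may differ, and the GCP KMS entry is left out because its resource ID is not given here.

```yaml
# Hypothetical .sops.yaml; the path patterns and encrypted_regex are assumptions.
creation_rules:
  # Files ending in .enc.age.yaml are encrypted for the Age recipient.
  - path_regex: '.*\.enc\.age\.yaml$'
    encrypted_regex: '^(data|stringData)$'
    age: age19gj66fq5v2veu940ftyj4pkw0w5tgxgddlyqnd00pnjzyndevurqx70g4t
  # Files ending in .enc.yaml are encrypted for the PGP key.
  - path_regex: '.*\.enc\.yaml$'
    encrypted_regex: '^(data|stringData)$'
    pgp: 0635B8D34037A9453003FB7B93CAA682FF4C9014
```

SOPS uses the first rule whose `path_regex` matches, so the two patterns are kept mutually exclusive.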
112+
79113## Key Technologies & Patterns
80114
81115### GitOps with FluxCD
82- - **Flux Operator**: Manages FluxCD installation via FluxInstance CRDs
83- - **Kustomizations**: Define how to apply Kubernetes manifests
84- - **HelmReleases**: Manage Helm chart deployments
85- - **GitRepository/OCIRepository**: Source definitions for manifests
116+ This repository uses **Flux Operator** instead of traditional `flux bootstrap`:
117+ - **FluxInstance CRDs**: Declaratively manage FluxCD components
118+ - **OCIRepository**: Used for Helm charts instead of HelmRepository (e.g., `oci://ghcr.io/prometheus-community/charts`)
119+ - **Kustomizations**: Define manifest application with SOPS decryption, post-build substitution, and dependency chains
120+ - **HelmReleases**: Reference charts via `chartRef` pointing to OCIRepository
121+ - **Root Kustomization**: Located at `kubernetes/clusters/cluster-00/ks.yaml`
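
To make the `chartRef` pattern concrete, here is a hedged sketch of an OCIRepository paired with a HelmRelease. The chart (`kube-prometheus-stack`), resource names, namespace, intervals, and tag are illustrative assumptions, and the API versions depend on the installed Flux release.

```yaml
# Sketch only: names, namespace, tag, and API versions are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1   # v1beta2 on older Flux releases
kind: OCIRepository
metadata:
  name: kube-prometheus-stack
  namespace: observability
spec:
  interval: 1h
  url: oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack
  layerSelector:
    # Select the Helm chart layer of the OCI artifact and copy it as-is.
    mediaType: application/vnd.cncf.helm.chart.content.v1.tar+gzip
    operation: copy
  ref:
    tag: 0.0.0   # placeholder; pin a real chart version
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: observability
spec:
  interval: 1h
  # chartRef points at the OCIRepository above instead of a HelmRepository chart template.
  chartRef:
    kind: OCIRepository
    name: kube-prometheus-stack
```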
122+
123+ ### Application Deployment Pattern
124+ Each application follows this structure:
125+ 1. **Base configuration** in `kubernetes/apps/base/[system-name]/[app-name]/`:
126+    - `app/helmrelease.yaml`: Helm release definition
127+    - `app/ocirepository.yaml`: Chart source
128+    - `app/secret.enc.yaml`: Encrypted secrets
129+    - `app/values.yaml`: Helm values
130+    - `ks.yaml`: Flux Kustomization with `dependsOn`, SOPS settings, substitutions
131+
132+ 2. **Cluster overlays** in `kubernetes/apps/overlays/cluster-00/`: Cluster-specific customizations using Kustomize patches
86133
87- ### Cluster Configuration
88- - ** Bootstrap** : Initial cluster setup in ` kubernetes/bootstrap/ `
89- - ** Apps** : Application deployments with base configurations and cluster-specific overlays
90- - ** Components** : Shared components like monitoring alerts
91- - ** Tenants** : Multi-tenant namespace configurations
134+ 3. **System categories**: Apps organized into logical systems:
135+    - `kube-system`: Core Kubernetes (Cilium, metrics-server, reflector)
136+    - `network-system`: Networking (cert-manager, external-dns, oauth2-proxy, dex)
137+    - `observability`: Monitoring (Prometheus, Grafana, Loki, Jaeger, Thanos)
138+    - `security-system`: Security (Kyverno, Falco, Gatekeeper, CrowdSec)
139+    - `istio-system` & `istio-ingress`: Service mesh
140+    - `home-system`: Home automation & media
141+    - `rook-ceph`: Storage
142+
143+ ### HelmRelease Global Defaults
144+ All HelmReleases are patched with these defaults via Kustomization:
145+ ```yaml
146+ install:
147+   crds: CreateReplace
148+   createNamespace: true
149+   replace: true
150+   strategy: RetryOnFailure
151+   timeout: 10m
152+ rollback:
153+   recreate: true
154+   force: true
155+   cleanupOnFail: true
156+ upgrade:
157+   cleanupOnFail: true
158+   crds: CreateReplace
159+   remediation:
160+     remediateLastFailure: true
161+     retries: 3
162+     strategy: rollback
163+ ```
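
One way such defaults can be injected is through `spec.patches` on a Flux Kustomization, targeting every HelmRelease it renders. The sketch below is an assumption about the mechanism, not the repository's actual manifest: the Kustomization name, path, and source name are hypothetical, and only a subset of the defaults is shown.

```yaml
# Sketch: a Flux Kustomization that patches all HelmReleases it applies.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps              # hypothetical name
  namespace: flux-system
spec:
  interval: 10m
  path: ./kubernetes/apps/overlays/cluster-00
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system     # assumed source name
  patches:
    - target:
        group: helm.toolkit.fluxcd.io
        kind: HelmRelease   # no name given, so every HelmRelease matches
      patch: |-
        apiVersion: helm.toolkit.fluxcd.io/v2
        kind: HelmRelease
        metadata:
          name: not-used    # ignored; the target selector decides what is patched
        spec:
          install:
            crds: CreateReplace
            createNamespace: true
          upgrade:
            crds: CreateReplace
            remediation:
              retries: 3
```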
92164
93165### Security Practices
94- - All secrets encrypted with SOPS using age encryption
95- - Kyverno policies enforce security standards
96- - Falco provides runtime security monitoring
97- - Talos Linux provides immutable, minimal attack surface
166+ - **Dual encryption**: SOPS with PGP (primary) + GCP KMS backup
167+ - **Never commit unencrypted secrets**: All secrets use the `.enc.yaml` or `.enc.age.yaml` suffix
168+ - **Policy enforcement**: Kyverno & OPA Gatekeeper
169+ - **Runtime security**: Falco & Tetragon
170+ - **Pod security labels**: Applied to all namespaces
171+ - **Immutable OS**: Talos Linux provides an immutable OS with a minimal attack surface
98172
173## Development Workflow
100174
101- 1 . ** Making Changes** :
102- - Edit YAML manifests in appropriate directories
103- - Ensure proper directory structure (base + overlays pattern)
104- - Follow existing naming conventions
175+ ### Bootstrap New Cluster
176+ ```bash
177+ # 1. Set environment variables (CLUSTER_ID defaults to cluster-00)
178+ # 2. Bootstrap Flux Operator
179+ task fluxcd:bootstrap # Installs flux-operator, flux-instance, cert-manager, kustomize-mutating-webhook
180+
181+ # 3. Install cluster secrets
182+ task flux:secrets # Decrypts and applies sops-gpg, sops-age, cluster-secrets, github-auth, cluster-config
183+
184+ # 4. Configure Talos
185+ task talos:config # Decrypts talosconfig to ~/.talos/config
186+ ```
187+
188+ ### Making Changes to Applications
189+ 1. **Edit base configuration** in `kubernetes/apps/base/[system-name]/[app-name]/`
190+ 2. **Use overlays** for cluster-specific customization in `kubernetes/apps/overlays/cluster-00/`
191+ 3. **Follow naming conventions**:
192+    - `ks.yaml`: Flux Kustomization resources
193+    - `kustomization.yaml`: Kustomize configuration
194+    - `*.enc.yaml`: SOPS encrypted files
195+    - `helmrelease.yaml`: Helm release definitions
196+    - `ocirepository.yaml`: OCI repository sources
197+ 4. **Ensure secrets are encrypted** before committing (use the `sops` command)
198+ 5. **Run pre-commit hooks**: `pre-commit run --all-files`
199+ 6. **FluxCD auto-reconciles** from the `main` branch after push
200+
201+ ### Adding New Applications
202+ 1. Create directory structure: `kubernetes/apps/base/[system-name]/[app-name]/`
203+ 2. Add `app/` directory with:
204+    - `helmrelease.yaml` (with `chartRef` to OCIRepository)
205+    - `ocirepository.yaml` (chart source)
206+    - `values.yaml` (Helm values)
207+    - `secret.enc.yaml` (if needed, encrypted with SOPS)
208+    - `kustomization.yaml`
209+ 3. Create `ks.yaml` with:
210+    - `dependsOn` for dependency chain
211+    - `decryption` for SOPS secrets
212+    - `postBuild.substituteFrom` for ConfigMap/Secret references
213+ 4. Add to parent `kustomization.yaml`
214+ 5. Create an overlay if cluster-specific customization is needed
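
The `ks.yaml` described in step 3 could look roughly like the sketch below. The app and system names, path, interval, and source name are hypothetical; `sops-gpg`, `cluster-config`, and `cluster-secrets` are the resource names applied by `task flux:secrets`, but their kinds (Secret vs. ConfigMap) are assumed here.

```yaml
# Sketch of a ks.yaml for a hypothetical app; adjust names and paths to the real layout.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: example-app              # hypothetical
  namespace: flux-system
spec:
  interval: 10m
  path: ./kubernetes/apps/base/example-system/example-app/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system            # assumed source name
  # Wait for cert-manager to be ready before applying this app.
  dependsOn:
    - name: cert-manager
      namespace: flux-system
  # Decrypt *.enc.yaml files with the key stored in the sops-gpg Secret.
  decryption:
    provider: sops
    secretRef:
      name: sops-gpg
  # Substitute ${VAR} placeholders from cluster-wide settings.
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-config     # kind assumed
      - kind: Secret
        name: cluster-secrets    # kind assumed
```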
105215
106- 2 . ** Testing** :
107- - Use ` task ` commands to validate configurations
108- - Run pre-commit hooks before committing
109- - FluxCD will automatically reconcile changes after push
216+ ## Important Patterns & Conventions
110217
111- 3 . ** Secrets Management** :
112- - Never commit unencrypted secrets
113- - Use SOPS for any sensitive data
114- - Reference encrypted secrets in ` .sops.yaml `
218+ ### File Naming
219+ - `ks.yaml`: Flux Kustomization resources (defines how to apply manifests)
220+ - `kustomization.yaml`: Kustomize configuration (defines what resources to include)
221+ - `*.enc.yaml`: SOPS-encrypted with PGP
222+ - `*.enc.age.yaml`: SOPS-encrypted with Age
223+ - `helmfile.yaml`: Helmfile configurations (used in bootstrap)
224+ - `helmrelease.yaml`: Helm release definitions
225+ - `ocirepository.yaml`: OCI repository sources for Helm charts
226+ - `namespace.yaml`: Namespace definitions with pod security labels
115227
116- ## File Patterns to Understand
228+ ### Kustomization Labels
229+ - `substitution.flux/enabled=true`: Enables SOPS decryption and variable substitution
230+ - Patches applied globally to all Kustomizations for HelmRelease defaults
117231
118- - ` kustomization.yaml ` : Kustomize configuration files
119- - ` *.enc.yaml ` : SOPS-encrypted files
120- - ` helmfile.yaml ` : Helmfile configurations for chart management
121- - ` app/ ` : Directory containing application-specific configurations
122- - ` resources/ ` : Directory for Kubernetes resource definitions
232+ ### Namespace Conventions
233+ Labels applied to namespaces:
234+ - `pod-security.kubernetes.io/enforce: privileged` (or `restricted`/`baseline`)
235+ - `goldilocks.fairwinds.com/enabled: "true"` (monitoring)
236+ - `kustomize.toolkit.fluxcd.io/prune: disabled` (on flux-system)
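
Put together, a `namespace.yaml` following these conventions might look like this sketch. The namespace name and the chosen enforcement level are illustrative assumptions.

```yaml
# Sketch of a namespace.yaml; name and enforce level are hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: example-system
  labels:
    # Pod Security Admission level: privileged, baseline, or restricted.
    pod-security.kubernetes.io/enforce: baseline
    # Label values must be strings, hence the quotes.
    goldilocks.fairwinds.com/enabled: "true"
```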
237+
238+ ### Dependency Management
239+ Flux Kustomizations use `dependsOn` to establish deployment order:
240+ ```yaml
241+ dependsOn:
242+   - name: cert-manager
243+     namespace: flux-system
244+ ```
123245
246## Important Notes
125247
126- - The cluster uses cluster ID "cluster-00" as the default
127- - Talos config is stored encrypted in ` talos/generated/ `
128- - FluxCD manages all application deployments automatically
129- - Changes to ` main ` branch trigger automatic reconciliation
130- - The repository follows enterprise GitOps patterns suitable for production use
248+ - **Cluster ID**: "cluster-00" is the default cluster identifier
249+ - **Branch**: `main` is the primary branch (auto-reconciled by FluxCD)
250+ - **Talos configs**: Stored encrypted in `talos/generated/`
251+ - **Bootstrap method**: Uses Flux Operator (not traditional `flux bootstrap`)
252+ - **Chart sources**: Uses OCIRepository instead of HelmRepository
253+ - **Yamllint config**: Line length warning at 240 characters, 2-space indentation
254+ - **Renovate automation**: Auto-merge enabled for digests, ignores encrypted files
255+ - **Multi-cluster ready**: Designed with overlay pattern for multiple clusters
256+ - **Enterprise patterns**: Production-grade GitOps implementation showcasing CNCF ecosystem
131257
258## External Dependencies
133259
134- - ** Cloudflare** : DNS and CDN services
135- - ** Google Cloud Platform** : OAuth, backup storage
136- - ** GitHub** : Source control and authentication
137- - ** SOPS/age** : Secret encryption (requires age key setup)
260+ - **Cloudflare**: DNS management and CDN services
261+ - **Google Cloud Platform**:
262+   - GCP KMS for SOPS encryption
263+   - Google Cloud Storage for Thanos long-term metrics storage
264+   - Google Cloud Storage for Velero backups
265+   - OAuth for authentication
266+ - **GitHub**: Source control, authentication, and OCI registry for Helm charts
267+ - **SOPS/age**: Secret encryption (requires PGP and/or age key setup)
138268- **Task**: Task runner (must be installed locally)
269+ - **Helmfile**: Used for bootstrap process
270+ - **Let's Encrypt**: Certificate generation for secure communication
271+ - **NextDNS**: Malware protection and ad-blocking
272+ - **UptimeRobot**: Service monitoring
273+
274+ ## Troubleshooting with Flux MCP
275+
276+ This repository includes Cursor rules for troubleshooting Flux resources using the `flux-operator-mcp` tools. Key troubleshooting workflows:
277+
278+ ### Analyzing HelmReleases
279+ 1. Check helm-controller status with `get_flux_instance`
280+ 2. Get HelmRelease resource and analyze spec, status, inventory, events
281+ 3. Check `valuesFrom` ConfigMaps and Secrets
282+ 4. Verify source (OCIRepository) status
283+ 5. Analyze managed resources from inventory
284+ 6. Check logs if resources are failing
285+
286+ ### Analyzing Kustomizations
287+ 1. Check kustomize-controller status with `get_flux_instance`
288+ 2. Get Kustomization resource and analyze spec, status, inventory, events
289+ 3. Check `substituteFrom` ConfigMaps and Secrets
290+ 4. Verify source (GitRepository/OCIRepository) status
291+ 5. Analyze managed resources from inventory
292+
293+ ### Comparing Resources Across Clusters
294+ Use `get_kubernetes_contexts` and `set_kubernetes_context` to switch between clusters, then compare resource specs and status.