Fix deliberately broken cloud network infrastructure. Learn by troubleshooting real incidents.
| Provider | Status | Guide |
|---|---|---|
| Azure | ✅ Available | azure/README.md |
| AWS | ✅ Available | aws/README.md |
| GCP | ✅ Available | gcp/README.md |
- Routing & Gateways — NAT gateways, route tables, internet egress
- DNS Resolution — Private DNS zones, service discovery
- Network Security — Security groups, firewall rules, subnet isolation
- Troubleshooting — Real-world diagnostic techniques
- Deploy — Run the setup script. Infrastructure deploys with intentional misconfigurations.
- Read the incidents — Ticket descriptions tell you the symptoms. Your job is to find the root cause.
- Diagnose — SSH through the bastion host into VMs to test connectivity, check DNS, inspect services, etc.
- Fix — From your local terminal, use the cloud provider CLI (az, aws, gcloud) to fix the misconfigured cloud resources (routes, firewall rules, DNS records, etc.). You are not editing Terraform or fixing things from inside the VMs.
- Validate — Run the validation script from your local machine. It SSHes into the VMs and runs real connectivity checks.
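
The Diagnose step comes down to two questions: does the name resolve, and can a TCP connection be opened? The sketch below is one way to ask both from inside a VM. It is not part of the lab's own scripts, and it assumes Python 3 is available on the VM. The function names `resolve` and `can_connect` are made up for illustration.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses a hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Resolution failed: a hint that DNS (not routing) may be the problem.
        return []
    # Each entry's sockaddr starts with the IP address; deduplicate and sort.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, if `resolve("db.internal.example")` (a hypothetical name) returns an empty list, start with the DNS records. If the name resolves but `can_connect` returns `False`, routes or firewall rules are the more likely culprits. Either way, the fix itself is still applied from your local terminal with the provider CLI.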
Please use GitHub Issues for bugs, broken instructions, or unclear steps:
- Open an issue: GitHub Issues
- Include: the cloud provider, which incident and step you're on, what you expected versus what happened, and the output of the validation script (redact secrets and tokens).
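
Redacting by hand is easy to get wrong, so a small filter can help before pasting output into an issue. The following is a minimal sketch, not a tool shipped with this repo. The patterns are assumptions about what sensitive output might look like, and they will not catch every secret, so review the result before posting.

```python
import re

# Assumed patterns; extend them for whatever your output may contain.
_PATTERNS = [
    # key=value or key: value pairs whose key looks sensitive.
    (
        re.compile(r"(?i)(password|secret|token|api[_-]?key)(\s*[=:]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    # AWS access key IDs: the AKIA prefix followed by 16 characters.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY_ID]"),
]


def redact(text):
    """Return text with anything matching the assumed secret patterns masked."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

One way to use it is to save the validation output to a file, pass its contents through `redact`, and paste the result into the issue.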
~$0.50–1.00 per session. Always destroy resources when done.
The infrastructure is intentionally misconfigured — that is the point of the lab. Students fix issues using the cloud provider CLI (az, aws, gcloud), not by editing Terraform. When contributing, do not "fix" broken resources in the Terraform code. If you discover a teardown issue, the right place to address it is in the provider's destroy.sh script or in a README troubleshooting note, not by modifying the Terraform modules.