
@CasLubbers CasLubbers commented Oct 30, 2025

📌 Summary

Updated kubectl apply to use --server-side, preventing errors and warnings when applying essential manifests.


Adds a new collect traces command that generates a comprehensive report of unhealthy Kubernetes resources during APL installation failures. The report is automatically stored in a ConfigMap (apl-operator/apl-traces-report) for post-mortem analysis.

  • Monitors 8 resource types: Pods, Deployments, StatefulSets, Nodes, Services, PVCs, PVs, and ArgoCD Applications
  • Automatically runs when install or apply commands fail
  • Can be invoked manually: binzx/otomi collect traces
  • Stores a structured JSON report with a timestamp and summary statistics
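As a rough illustration of what "structured JSON report with timestamp and summary statistics" could look like, here is a minimal sketch. The type names, field names, and the `buildTracesReport` helper are hypothetical; the actual report schema is not shown in this description.

```typescript
// Hypothetical shapes: the real schema used by the collect traces command may differ.
interface UnhealthyResource {
  kind: string
  name: string
  namespace?: string // cluster-scoped kinds such as Node or PV have no namespace
  reason: string
}

interface TracesReport {
  timestamp: string
  summary: { total: number; byKind: Record<string, number> }
  resources: UnhealthyResource[]
}

// Builds the report from already-collected findings; `now` is injectable for testing.
function buildTracesReport(resources: UnhealthyResource[], now: Date = new Date()): TracesReport {
  const byKind: Record<string, number> = {}
  for (const resource of resources) {
    byKind[resource.kind] = (byKind[resource.kind] ?? 0) + 1
  }
  return {
    timestamp: now.toISOString(),
    summary: { total: resources.length, byKind },
    resources,
  }
}

const exampleReport = buildTracesReport([
  { kind: 'Pod', name: 'example-pod', namespace: 'example-ns', reason: 'CrashLoopBackOff' },
  { kind: 'Node', name: 'example-node', reason: 'NotReady' },
])

// ConfigMap data values must be strings, so the report is serialized before storing.
const exampleConfigMapData: Record<string, string> = { report: JSON.stringify(exampleReport, null, 2) }

// Minimal console output: only the count, matching the behaviour described in the reviewer notes.
console.log(`Found ${exampleReport.summary.total} unhealthy resources`)
```

Keeping the builder pure (no cluster access) makes the summary logic easy to unit test separately from the Kubernetes calls.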

🔍 Reviewer Notes

  • Error handling: ArgoCD checks are skipped gracefully if the CRD doesn't exist or can't be accessed (404/403 errors)
  • Non-blocking: Troubleshooting errors are caught and logged but don't prevent retries
  • ConfigMap storage: Reuses the create-or-update pattern already used by the operator
  • Minimal output: Console shows only count; full report stored in ConfigMap
  • Integration points: Hooked into retry catch blocks in install.ts:149-154 and apply.ts:90-95
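The three behaviours in these notes (skipping on 404/403, create-or-update, and non-blocking failure handling) can be sketched as follows. This is not the code from the PR: `ConfigMapClient`, `statusOf`, `listOrSkip`, `createOrUpdate`, and `collectTracesSafely` are hypothetical names, errors are assumed to carry a numeric `statusCode`, and "create, then replace on a 409 conflict" is only one common way to implement create-or-update.

```typescript
// Assumption: API errors expose a numeric `statusCode`; the real client may differ.
function statusOf(err: unknown): number | undefined {
  if (typeof err === 'object' && err !== null && 'statusCode' in err) {
    const code = (err as { statusCode?: unknown }).statusCode
    return typeof code === 'number' ? code : undefined
  }
  return undefined
}

// Returns an empty list when the resource type is missing (404) or forbidden (403),
// so an absent ArgoCD CRD does not fail the whole collection run.
async function listOrSkip<T>(list: () => Promise<T[]>): Promise<T[]> {
  try {
    return await list()
  } catch (err) {
    const status = statusOf(err)
    if (status === 404 || status === 403) return []
    throw err
  }
}

// Hypothetical minimal client interface standing in for the Kubernetes API.
interface ConfigMapClient {
  create(name: string, data: Record<string, string>): Promise<void>
  replace(name: string, data: Record<string, string>): Promise<void>
}

// Create-or-update: try to create, and replace the existing object on a 409 conflict.
async function createOrUpdate(client: ConfigMapClient, name: string, data: Record<string, string>): Promise<void> {
  try {
    await client.create(name, data)
  } catch (err) {
    if (statusOf(err) !== 409) throw err
    await client.replace(name, data)
  }
}

// Non-blocking wrapper: failures are logged and reported as `false`, never rethrown,
// so the surrounding retry loop in install/apply keeps going.
async function collectTracesSafely(
  run: () => Promise<void>,
  log: (message: string) => void = console.warn,
): Promise<boolean> {
  try {
    await run()
    return true
  } catch (err) {
    log(`collect traces failed: ${err instanceof Error ? err.message : String(err)}`)
    return false
  }
}
```

In a retry catch block, a call such as `await collectTracesSafely(() => collectTraces())` (with `collectTraces` standing in for the real entry point) would run before the error is rethrown to the retry helper.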

🧹 Checklist

  • Code is readable, maintainable, and robust.
  • Unit tests added/updated

@Ani1357 Ani1357 self-requested a review October 31, 2025 08:03
@CasLubbers CasLubbers changed the title from "feat: add troubleshoot command" to "feat: add collect traces command" Nov 4, 2025
@j-zimnowoda j-zimnowoda merged commit f28e76f into main Nov 5, 2025
12 checks passed
@j-zimnowoda j-zimnowoda deleted the APL-570 branch November 5, 2025 10:29

6 participants