Richer core dumps: structured errors, sync_steps, LLM trace, ANSI strip #712
vishalramvelu wants to merge 9 commits into promptdriven:main
Pull request overview
This PR improves PDD “core dump” debug snapshots to make sync failures diagnosable from the JSON alone (structured non-exception errors, per-operation sync steps derived from meta logs, failure-only LLM/test traces, and ANSI/OSC stripping for captured terminal output).
Changes:
- Add structured core-dump errors via `record_core_dump_error` and record logical/non-exception failure paths in sync.
- Bump core dump schema to v2 and enrich dumps with auto-included meta artifacts plus derived `sync_steps`.
- Capture failure-only debugging context (ANSI/OSC-clean terminal output, truncated test output excerpts, and last LLM prompt/response pair).
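To make the ANSI/OSC cleanup concrete, here is a minimal sketch of stripping both CSI sequences (colors, cursor movement) and OSC sequences (window titles, hyperlinks) from captured terminal text. The function name `strip_ansi` and the exact pattern are illustrative assumptions, not the code in `pdd/core/cli.py`.

```python
import re

# Hypothetical pattern, not taken from the PR.
# CSI: ESC [ + parameter bytes (0x30-0x3F) + intermediate bytes (0x20-0x2F)
#      + one final byte (0x40-0x7E).
# OSC: ESC ] + payload, terminated by BEL (0x07) or ST (ESC \).
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
)


def strip_ansi(text: str) -> str:
    """Return text with CSI and OSC escape sequences removed."""
    return _ANSI_RE.sub("", text)
```

A CSI-only pattern would leave OSC payloads such as terminal titles and hyperlink targets in the dump, which is presumably the kind of garbage this PR is cleaning up.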
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `pdd/core/errors.py` | Adds structured core-dump error recording API. |
| `pdd/core/dump.py` | Bumps schema to v2; auto-includes meta sync/run files; derives `sync_steps`; ensures `steps[*].model` defaults to `"unknown"`. |
| `pdd/core/cli.py` | Expands ANSI stripping to cover CSI + OSC sequences. |
| `pdd/core/llm_trace.py` | Introduces lightweight, redacted/truncated LLM prompt/response trace capture by operation. |
| `pdd/llm_invoke.py` | Records best-effort LLM traces for cloud + LiteLLM paths. |
| `pdd/sync_orchestration.py` | Records logical failures as structured core-dump errors; captures truncated test output excerpts; attaches failure-only LLM traces to operation log entries. |
| `pdd/sync_main.py` | Records budget exhaustion as a structured core-dump error. |
| `tests/test_core_dump.py` | Updates expectations for schema v2; adds tests for meta sync/run auto-inclusion, derived `sync_steps`, and model defaulting. |
| `tests/core/test_cli.py` | Adds tests covering OSC/cursor-sequence stripping. |
| `tests/test_core_errors.py` | Adds unit test for structured error recording. |
| `tests/test_sync_orchestration.py` | Adds tests validating logical-failure error recording, test output excerpt truncation, and LLM trace attachment. |
| `pdd/simple_math.py` | Adds a new module (appears unrelated to the PR's stated scope). |
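As an illustration of the "truncated test output excerpts" mentioned for `pdd/sync_orchestration.py`, the sketch below keeps the head and tail of a long output and drops the middle. The helper name `excerpt`, the 2000-character default, and the head/tail strategy are assumptions made for this example; the PR may truncate differently.

```python
def excerpt(text: str, max_chars: int = 2000,
            marker: str = "\n...[truncated]...\n") -> str:
    """Hypothetical helper: cap text at max_chars, keeping both ends."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(marker)
    if keep <= 0:
        # Budget too small for the marker; fall back to a plain prefix cut.
        return text[:max_chars]
    head = keep // 2
    tail = keep - head  # always >= 1 here, so text[-tail:] is safe
    return text[:head] + marker + text[-tail:]
```

Keeping the tail matters for test runners, which usually print the failure summary last.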
target 3/24
Unit tests are failing because TestConvergencePromptRequirements expects literal substrings in agentic_e2e_fix_orchestrator_python.prompt, but updating that prompt caused a merge conflict with main on the same file. Leaving the prompt as-is for now. We can align the prompt text with the Issue #903 tests in a follow-up once the branch is rebased cleanly or the file is de-conflicted upstream.
Hey @vishalramvelu — CI is failing because the PR's version of The two failing tests:
These requirements exist in upstream |
…op unused json import
4bdbb6a to 413cd47
target 3/31
Made-with: Cursor
# Conflicts:
# pdd/prompts/agentic_bug_step11_e2e_test_LLM.prompt
# pdd/prompts/agentic_e2e_fix_orchestrator_python.prompt
# tests/test_issue_633_reproduction.py
…hers Made-with: Cursor
gltanaka left a comment
Hey @vishalramvelu — the core changes here look solid and well-tested. Just 3 files that need to be removed before we can merge:
- `pdd/core_dump_smoke.py`: PDD-generated test file that runs `from solution import add` and `from z3 import ...`. Neither exists in the package, so this will break imports. Same issue Copilot flagged for `simple_math.py` (which you removed); this one slipped through.
- `context/simple_math_example.py`: Another PDD-generated artifact, unrelated to this PR's scope.
- `uv.lock`: The project doesn't use uv (no references in Makefile, CI, or pyproject.toml). This looks like it was committed from your local setup. 3,637 lines we don't need.
Once those are removed, this is good to merge.
Summary
Improves PDD debug snapshots (core dumps written to .pdd/core_dumps/pdd-core-*.json) so failures are easier to diagnose without log spelunking.
Test Results
Test_Core_Dump - Passed
Sync_Orchestration - Passed
Test_Core_Errors - Passed
Test_Cli - Passed
Manual testing
- `schema_version` is 2.
- `errors` includes the expected structured entries when no Python traceback exists.
- `steps` length and order match the run; missing models show "unknown".
- `terminal_output` has no obvious ANSI/OSC garbage.
- `sync_steps` is present when meta sync logs exist; entries look like the failing operations.
- On LLM-using failure paths, `details.llm_trace` appears as designed.
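Part of the manual checks above could be scripted. The sketch below is a hypothetical checker, not part of the PR: the field names (`schema_version`, `steps`, `model`, `terminal_output`) come from the description, while the exact dump layout, such as `terminal_output` being a single string, is assumed.

```python
import json
from pathlib import Path


def check_core_dump(dump: dict) -> list[str]:
    """Return a list of problems found in a parsed core dump (empty if none)."""
    problems = []
    if dump.get("schema_version") != 2:
        problems.append(f"schema_version is {dump.get('schema_version')!r}, expected 2")
    for i, step in enumerate(dump.get("steps", [])):
        # Missing models are expected to be filled in as "unknown".
        if not step.get("model"):
            problems.append(f"steps[{i}] has no model")
    terminal = dump.get("terminal_output")
    # Assumption: terminal_output is a string; other shapes are skipped.
    if isinstance(terminal, str) and "\x1b" in terminal:
        problems.append("terminal_output still contains ESC characters")
    return problems


def check_latest(dump_dir: str = ".pdd/core_dumps") -> list[str]:
    """Check the most recently modified pdd-core-*.json file in dump_dir."""
    files = sorted(Path(dump_dir).glob("pdd-core-*.json"),
                   key=lambda p: p.stat().st_mtime)
    if not files:
        return [f"no core dumps found in {dump_dir}"]
    return check_core_dump(json.loads(files[-1].read_text(encoding="utf-8")))
```

Checks such as "entries look like the failing operations" still need a human read, so this only covers the mechanical items.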
Checklist
Fixes #710