
Heal Prompt Drift#728

Open
vishalramvelu wants to merge 8 commits into promptdriven:main from vishalramvelu:feature/701_sync

Conversation

@vishalramvelu (Contributor)

Summary

  • Improves pdd update repo-wide mode to be a reliable drift-healing workflow with explicit UX and safer controls.
  • Adds repo-wide flags --all, --dry-run, and --budget, plus validation so repo-only options are enforced in repo mode.
  • Implements dependency-ordered processing of changed modules so high-impact updates are prioritized when budget-limited.
  • Adds post-update maintenance for successful repo updates:
    regenerate included markdown docs referenced by tags, regenerate module examples, upload refreshed dev units to PDD Cloud grounding.
  • Refines behavior discovered during validation:
    dry-run is non-mutating during scan/pair resolution; the example/upload pipeline uses the currently processed pair's prompt/code paths as the source of truth, avoiding path/context mismatches.

Tests

  • Added/updated CLI tests in tests/test_update_command.py for:
    --all behavior and validation, --dry-run pass-through and repo-only enforcement, --budget pass-through, positivity validation, and repo-only enforcement.
  • Added/updated repo-mode tests in tests/test_update_main.py for:
    flat dry-run cost estimate, included-doc aggregation in the drift report, budget-cap stopping behavior, dependency ordering under budget, included markdown doc regeneration after successful updates, example regeneration + _auto_submit_example submission, dry-run skipping side-effecting work, dry-run scan not creating missing prompts, and the example submit path using the current pair's prompt/code paths.

Manual Tests

  • Ran uv run pdd update --all --dry-run
    Verified repository drift report output (counts, cost range, drifted mappings, included docs).
    Verified no LLM/prompt write/arch/PRD update path execution.
    Confirmed no “Created missing prompt file” spam during dry-run scan.

  • Ran uv run pdd --force update --all --simple --budget 0.031
    Verified real prompt update flow executed.
    Verified example regeneration executed successfully.
    Verified dev unit grounding submission executed successfully.
    Verified budget cap message printed and run stopped early.
    Observed expected slight overshoot behavior (cap checked between items, not mid-item).
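The slight overshoot observed above follows directly from where the cap is checked. A minimal sketch of that loop shape (hypothetical names, not the actual update_main code):

```python
def run_with_budget(items, process, budget):
    """Process items in order, stopping once cumulative cost reaches the cap.

    The cap is checked *between* items, so the item in flight may push
    spend slightly past the cap -- matching the manual test observation.
    """
    total = 0.0
    processed = []
    for item in items:
        if budget is not None and total >= budget:
            print(f"Budget cap ${budget:.2f} reached; stopping early.")
            break
        total += process(item)  # may overshoot the cap mid-item
        processed.append(item)
    return processed, total

# With a $0.031 cap and $0.02 per item: "a" and "b" run, "c" is skipped,
# and total spend (0.04) overshoots the cap.
done, spent = run_with_budget(["a", "b", "c"], lambda _: 0.02, budget=0.031)
```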

Checklist

  • Add explicit --all for repo-wide update mode
  • Add --dry-run repo drift report mode
  • Use flat dry-run estimate $0.50–$1.00 per drifted pair
  • Add repo-wide --budget cap and stop-on-cap behavior
  • Process changed items in dependency-aware order
  • Regenerate included markdown docs after successful updates
  • Regenerate examples + upload dev units to PDD Cloud grounding
  • Keep post-update doc/example/cloud steps best-effort (non-fatal)
  • Align prompt spec (modify_python.prompt) with implemented behavior
  • Add automated regression coverage for new behavior and follow-up fixes
  • [x] Submit PR for changed prompts in PDD_CAP
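The flat dry-run estimate in the checklist reduces to simple per-pair arithmetic; a sketch (the helper name is illustrative):

```python
def estimate_dry_run_cost(drifted_pairs: int) -> tuple[float, float]:
    """Flat estimate: $0.50 low / $1.00 high per drifted prompt/code pair."""
    return drifted_pairs * 0.50, drifted_pairs * 1.00

low, high = estimate_dry_run_cost(7)  # 7 drifted pairs -> (3.5, 7.0)
```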

Fixes #727


Copilot AI left a comment


Pull request overview

This PR enhances pdd update repository-wide mode into an explicit, safer “drift healing” workflow by adding repo-wide flags, non-mutating dry-run reporting, budget-capped execution, dependency-aware ordering, and best-effort post-update maintenance (doc/example regeneration + grounding submission).

Changes:

  • Add --all, --dry-run, and --budget to pdd update, with repo-only enforcement and argument validation.
  • Implement repo-mode dry-run drift reporting, dependency-ordered processing of changed pairs, and a budget cap that stops between items.
  • After successful repo updates, regenerate included markdown docs, regenerate examples, and auto-submit refreshed dev units for grounding (best-effort).

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| pdd/commands/modify.py | Adds CLI flags (--all, --dry-run, --budget) and enforces repo-only options/modes. |
| pdd/update_main.py | Implements dry-run drift report, dependency ordering, budget cap, and post-update doc/example/grounding maintenance. |
| tests/test_update_command.py | Adds CLI tests for --all, --dry-run, and --budget pass-through + validation. |
| tests/test_update_main.py | Adds repo-mode tests for dry-run non-mutation, cost estimation, dependency ordering, budget stopping, doc regen, example regen, and auto-submit. |
| tests/test_sync_code_main.py | Updates an assertion to match the new find_and_resolve_all_pairs(..., create_missing_prompts=...) call signature. |
| pdd/prompts/commands/pdd/commands/modify_python.prompt | Updates the prompt-spec contract to match new CLI behavior and constraints. |
| pdd/prompts/agentic_e2e_fix_orchestrator_python.prompt | Adds convergence-guard requirements to the orchestrator prompt. |
| project_dependencies.csv | Adds a CSV snapshot of dependency metadata (appears to be auto-deps output). |
| pdd/core_dump_smoke.py | Adds a standalone "add()" test plan/test module (appears unrelated to drift-healing). |
| pdd/prompts/core_dump_smoke_python.prompt | Adds a generic "write add(a, b)" smoke prompt (appears unrelated). |
| pdd/prompts/core_dump_requirements_LLM.prompt | Adds a large core-dump requirements spec prompt (appears unrelated). |
| context/simple_math_example.py | Adds a standalone pytest-running example snippet (appears unrelated). |



@gltanaka gltanaka left a comment


Great work on this PR! The core drift-healing feature is really well thought out — the --dry-run, --budget, --all, dependency ordering, and post-update doc/example regeneration are all solid additions that fill real gaps in the pdd update workflow. The test coverage for the new logic is thorough too. Nice job.

A few things to address before we merge:


Prompts location

modify_python.prompt and core_dump_smoke_python.prompt should be submitted to pdd_cap per .sync-config.yml and docs/ONBOARDING.md. (Also heads up — the path pdd/prompts/commands/pdd/commands/modify_python.prompt has a double-nesting that looks unintentional.)

project_dependencies.csv — needs a full scan

Love that this is being committed as a cache — but right now it only has 11 root-level config files (.env.example through Makefile). It looks like the pdd auto-deps run was interrupted after scanning just the top of the alphabetical git ls-files output. No Python source modules made it in. Could you re-run it scoped to at least the pdd/ source tree (~161 Python files) and commit the complete result?

context/simple_math_example.py

This looks like it was accidentally included — mind removing it?

Question: pdd/core_dump_smoke.py

Is this intended to be part of this PR? It's a z3 formal verification suite for an add() function that imports from a solution module — seems like it might belong with a separate feature. Totally fine if there's a reason it's here, just want to make sure.

Small cleanup

A couple things the linter would catch:

  • _read_text_byte_len() in update_main.py is defined but never called — remove or wire it in?
  • Counter is imported but unused
  • Minor: .mdx files are detected in the drift report but excluded from doc regeneration (.md only) — is that intentional?

Validation

The public CI unit tests pass, which is great. Two things that would increase confidence before merge:

  1. Could we trigger the private repo full test suite (Test Public PR workflow)? It hasn't been run for this PR yet, and it would catch any integration issues with the real prompts/symlink setup.
  2. The post-update side effects (doc regen, example regen, Cloud grounding upload) are well-tested via mocks, but haven't been verified end-to-end. The manual tests in the description cover --dry-run and --budget nicely — could you also confirm the doc/example/upload pipeline works against a real repo?

Also — pdd/prompts/core_dump_requirements_LLM.prompt looks useful, just want to confirm it's intentionally scoped to this PR.


Really appreciate the effort here — the feature design, the manual testing writeup, and the comprehensive checklist in the description all make this easy to review. Looking forward to getting this merged! 🙏

@vishalramvelu (Contributor, Author)


Thanks for the review.

  • Removed modify_python.prompt and core_dump_smoke_python.prompt.
  • Regenerated project_dependencies.csv with pdd auto-deps --force-scan scoped to pdd/ (~161 Python files) and committed the complete result.
  • Removed context/simple_math_example.py, core_dump_requirements_LLM.prompt, and the core_dump_smoke* artifacts from this PR.
  • update_main.py cleanup: dropped the unused _read_text_byte_len and the unused Counter import.
  • .mdx is intentionally omitted from the drift "will regen" list, since post-update regeneration is .md only.
  • I'll trigger the private Test Public PR workflow, and I've manually verified example regen, included-doc regen, and cloud upload against a real repo.


@gltanaka gltanaka left a comment


Thanks for the follow-up commit — most of the review items are addressed and the core feature is solid. Two things I'd like fixed before we merge:

Budget cap doesn't account for post-update LLM costs

The --budget check runs at the top of the loop, but example regeneration and doc regeneration call context_generator_main (which makes LLM calls) after the budget check. A user passing --budget 5.00 could end up spending well past that with no warning. Could you either:

  • Check the budget again after the post-update steps, or
  • Skip example regen / doc regen when the budget is already close to exhausted

Silent except Exception: pass blocks

The post-update pipeline has 5 bare except Exception: pass blocks. If example regen, cloud upload, or doc regen fails, there's no trace — no log line, no warning. This makes debugging really hard, especially combined with the budget issue above (money spent with no record of what happened). Could you swap the bare pass for logging.warning(...) or console.print(...) so failures are at least visible?
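One way to make those failures visible while keeping the steps best-effort would be a small wrapper like the sketch below (hypothetical helper and logger name; the real fix may simply inline the logging in each except block):

```python
import logging

logger = logging.getLogger("pdd.update")  # assumed logger name

def run_best_effort_step(name, step, quiet=True):
    """Run one post-update step; log failures instead of swallowing them."""
    try:
        step()
        return True
    except Exception:
        logger.warning("Post-update step %r failed", name, exc_info=True)
        if not quiet:
            print(f"Warning: post-update step {name!r} failed; see logs.")
        return False

# A failing step no longer vanishes silently -- it logs and returns False:
ok = run_best_effort_step("example regen", lambda: 1 / 0)
```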

Everything else looks good — the dependency ordering, dry-run report, and test coverage are all well done. Looking forward to getting this in!

@vishalramvelu (Contributor, Author)


Thanks for the follow-up. I've addressed both items:

  • --budget now applies to post-update work: before example regeneration, cloud submit, and each included-doc regeneration, cumulative spend is checked against the cap and those steps are skipped once the budget is reached (with a short user-visible message when not in quiet mode). We also warn if cumulative cost ends up above the cap after post-update LLM steps.
  • The bare except: pass blocks in that pipeline are replaced with logging.warning(..., exc_info=True) plus console warnings when not quiet, so example/regen/submit failures are visible in logs and the console.

Tests in tests/test_update_main.py were updated for the new budget behavior. Let me know if any other changes are needed.

@gltanaka (Contributor)

gltanaka commented Mar 30, 2026

target 3/31

Made-with: Cursor

# Conflicts:
#	pdd/prompts/agentic_e2e_fix_orchestrator_python.prompt
#	pdd/update_main.py
#	tests/test_update_main.py

@gltanaka gltanaka left a comment


Re-review (Round 3)

CI is green, R1/R2 feedback on budget enforcement and error logging in the post-update pipeline (example regen, cloud submit, doc regen) is addressed — those except Exception: pass blocks now properly log warnings, and budget is checked before each post-update LLM call. Nice work.

Two blocking issues and a few smaller items before we can merge:


Blocking

1. Key collision in _order_changed_items_by_dependency silently drops items

update_main.py around line 897:

key_to_item[key] = (idx, item)

If two changed items produce the same _dependency_order_key (same basename::language), the dict overwrite silently discards the first item from the output. This is a data-loss bug — a changed file disappears from the update run with no warning.

Fix: Either detect the collision and bail out (return the original list, like the key is None case), or accumulate into a list-valued dict.
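The second suggested fix (a list-valued dict) might look like the sketch below, with `sorted()` standing in for the real dependency-graph ordering; names are illustrative, not the actual update_main code:

```python
from collections import defaultdict

def order_items_by_key(items, key_fn):
    """Order items by dependency key without dropping key collisions."""
    key_to_items = defaultdict(list)
    for item in items:
        key = key_fn(item)
        if key is None:
            return list(items)  # unknown key: bail out to original order
        key_to_items[key].append(item)
    ordered = []
    for key in sorted(key_to_items):  # stand-in for the dependency order
        ordered.extend(key_to_items[key])
    return ordered

# Two files sharing a basename::language key are both kept:
result = order_items_by_key(
    ["x/a.py", "y/a.py", "b.py"],
    key_fn=lambda p: p.rsplit("/", 1)[-1] + "::python",
)
```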

2. PRD sync LLM call not gated by --budget

The PRD sync block (around line 1465) calls run_agentic_task() after the main update loop with no budget check. A user passing --budget 5.00 could hit the cap during file updates, then still incur an unbounded LLM call for PRD sync. Since this PR is adding --budget as a contract with the user, all LLM calls in the repo-wide path should respect it.

Fix: Add a _budget_allows_post_update_llm(budget, total_repo_cost) guard before the run_agentic_task call in the PRD sync block.


Non-blocking (address if easy, otherwise fine for follow-up)

3. _estimate_dry_run_cost_range has 3 unused parameters

ctx, repo_obj, and simple are declared but never used. Remove or prefix with _.

4. _dependency_order_key swallows all exceptions silently

except Exception:
    return None

At minimum logger.debug(...) so failures can be diagnosed. Returning None causes the entire ordering to bail out to the original list, which is safe but invisible.

5. core_dump_requirements_LLM.prompt — scope question (carried over from R1)

This 110-line prompt spec for debug snapshots is unrelated to drift-healing. Still in the PR. Could you either remove it (separate PR) or confirm it's intentionally scoped here?

6. uv.lock — 3,637 new lines

The project's CLAUDE.md references pip install and requirements.txt. Is the project migrating to uv, or was this accidentally committed? If uv is intended, that's fine — just want to make sure we're not introducing a parallel lock file that drifts.


Test coverage notes (non-blocking, good areas for follow-up)

  • No test for dependency ordering with cycles (the production code has a fallback at line ~954 but it's untested)
  • No test for post-update error logging — the logger.warning() calls added per R2 feedback are correct but have no test asserting they fire on exception
  • No test for _budget_allows_post_update_llm in isolation (cheap to add, catches off-by-one)
  • No test for updated_doc_paths deduplication (two prompts referencing the same .md doc)
  • Budget cost assertions (e.g. result[1] == pytest.approx(1.3)) are fragile — they encode internal cost-accumulation ordering. Consider asserting behavioral invariants instead (e.g. "total cost > budget", "fewer than all items processed")

What looks good

  • Budget enforcement in the per-file post-update pipeline is thorough — re-checked before example regen, cloud submit, and each included-doc regen
  • Kahn's algorithm for dependency ordering is correct (modulo the collision issue above)
  • Dry-run is properly non-mutating: create_missing_prompts=not dry_run, no LLM calls, no prompt writes
  • _included_docs_for_drift_report filtering logic is correct
  • The merge from main (4/3) resolved cleanly — no artifacts
  • project_dependencies.csv now has 258 entries — looks like a complete scan
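For reference, Kahn's algorithm mentioned above, including the cycle fallback the review calls out, can be sketched as a standalone function (hypothetical version, not the update_main implementation):

```python
from collections import deque

def kahn_order(nodes, edges):
    """Topological order of nodes given (dependency, dependent) edges.

    Returns None when a cycle prevents a complete ordering, so the
    caller can fall back to the original list.
    """
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for dep, dependent in edges:
        children[dep].append(dependent)
        indegree[dependent] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for child in children[n]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order if len(order) == len(nodes) else None
```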

Once #1 and #2 are addressed, I'm happy to approve. Everything else can be follow-ups.

Development

Successfully merging this pull request may close these issues.

Heal Prompt Drift (code + tests)
