fix: Reconcile dual dependency graphs and fix silent dependency loss in partial syncs #733

@gltanaka


Parent Issue

This is a Phase 5 sub-ticket of the master global sync initiative: Bidirectional artifact sync across the PDD hierarchy (tracked internally). The master ticket defines 5 phases:

  1. Phase 1 — Heal Prompt Drift (pdd update --all) — addressed by PR Heal Prompt Drift #728
  2. Phase 2 — CI drift detection on PRs
  3. Phase 3 — Auto-heal before PDD operations (pdd change detects stale prompts)
  4. Phase 4 — Simple batch downward sync (pdd sync no-args)
  5. Phase 5 — Foundation fixes (this ticket)

Problem

PDD has two parallel dependency graph sources that can disagree, plus a silent data loss bug in partial syncs.

Source 1: <include> tags in prompts → sync_order.py

build_dependency_graph() (line 110) scans <include> tags in prompt files and maps them to module names via extract_module_from_include(). This captures what the LLM actually sees during generation.

Source 2: dependencies field in architecture.json → agentic_sync_runner.py

build_dep_graph_from_architecture() (line 90) reads the explicit dependencies array from each architecture.json entry. When architecture.json exists, this is used instead of Source 1.
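To make the two sources concrete, here is a rough sketch of how each could derive dependencies. It is not the code in sync_order.py or agentic_sync_runner.py: the helper names, the `<include>path</include>` tag form, the `<module>_example.<ext>` naming convention, the `filename` key, and the sample paths are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, List, Set

# Assumed tag form: <include>relative/path</include>
INCLUDE_RE = re.compile(r"<include>(.*?)</include>", re.DOTALL)


def modules_from_includes(prompt_text: str) -> Set[str]:
    """Source 1 (sketch): collect module names referenced by <include> tags."""
    modules: Set[str] = set()
    for path in INCLUDE_RE.findall(prompt_text):
        stem = PurePosixPath(path.strip()).stem
        # Assumed convention: example files are named <module>_example.<ext>.
        # Includes that do not match (preambles, docs) are treated as context only.
        if stem.endswith("_example"):
            modules.add(stem[: -len("_example")])
    return modules


def deps_from_architecture(entries: List[dict]) -> Dict[str, Set[str]]:
    """Source 2 (sketch): read the explicit dependencies array of each entry."""
    return {e["filename"]: set(e.get("dependencies", [])) for e in entries}


# Hypothetical inputs.
prompt = (
    "<include>context/preamble.prompt</include>\n"
    "<include>context/llm_invoke_example.py</include>\n"
)
entries = [
    {"filename": "cli_python.prompt", "dependencies": ["llm_invoke_python.prompt"]},
]

include_modules = modules_from_includes(prompt)
arch_deps = deps_from_architecture(entries)
print(include_modules)
print(arch_deps)
```

Note that the two sources do not even share a vocabulary here: one yields module names, the other prompt filenames. Any comparison needs a normalization step first.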

The problems

1. The two graphs disagree. Research found that most modules have dependencies in architecture.json but lack corresponding <pdd-dependency> tags in their prompts. Meanwhile, <include> tags reference modules not listed in architecture.json dependencies. There are also 11 orphaned dependency references in architecture.json that point to prompt files that don't exist (including critical ones like llm_invoke_python.prompt referenced by 10+ modules).

2. Silent dependency loss in partial syncs. build_dep_graph_from_architecture() at line 141 silently drops any dependency not in the target module set:

if dep_basename and dep_basename in target_set and dep_basename != basename:
    deps.append(stripped_to_target[dep_basename])

If you sync modules A and B but A depends on C (not in the sync set), the dependency is silently lost. A syncs without waiting for C, potentially generating against a stale interface.

3. auto-deps creates divergence. Running pdd auto-deps adds <include> tags to prompts but does NOT update <pdd-dependency> tags. If architecture_sync.py then runs, it reads prompts (which lack <pdd-dependency> tags) and clears the architecture.json dependencies. The <include>-based dependencies are invisible to the architecture layer.

4. Adding architecture.json changes sync order. A project working fine without architecture.json (using build_dependency_graph() from prompts) can have its sync order silently change when architecture.json is added, because build_dep_graph_from_architecture() produces a different graph.
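Problems 2 and 4 can be reproduced in a few lines. This is a sketch only: `filter_deps_to_targets` is a hypothetical reduction of the quoted condition, not the real function, and the module names and graphs are contrived. The standard-library `graphlib` stands in for whatever ordering logic the sync runner actually uses.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def filter_deps_to_targets(
    arch_deps: Dict[str, List[str]], targets: List[str]
) -> Dict[str, List[str]]:
    """Hypothetical reduction of the quoted condition: keep only in-set dependencies."""
    target_set = set(targets)
    graph: Dict[str, List[str]] = {}
    for basename in targets:
        graph[basename] = [
            dep
            for dep in arch_deps.get(basename, [])
            # Same test as the quoted snippet; anything failing it vanishes silently.
            if dep and dep in target_set and dep != basename
        ]
    return graph


# Problem 2: A depends on B and C, but only A and B are synced.
partial_graph = filter_deps_to_targets({"A": ["B", "C"], "B": [], "C": []}, ["A", "B"])
print(partial_graph)  # {'A': ['B'], 'B': []}  (the edge A -> C is gone)

# Problem 4: two graphs for the same modules can yield different sync orders.
# Keys are modules, values are the modules they depend on (contrived example).
include_graph = {"cli": {"core"}, "core": {"utils"}, "utils": set()}
arch_graph = {"cli": {"utils"}, "utils": {"core"}, "core": set()}
include_order = list(TopologicalSorter(include_graph).static_order())
arch_order = list(TopologicalSorter(arch_graph).static_order())
print(include_order)  # ['utils', 'core', 'cli']
print(arch_order)     # ['core', 'utils', 'cli']
```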

Root Cause

The prompting guide (line 401) says <pdd-dependency> declares architectural dependencies and <include> injects content for LLM context — different purposes. But in practice:

  • <include> tags define the actual code-level dependencies (what interfaces the LLM sees)
  • <pdd-dependency> tags are sparse and inconsistently maintained
  • architecture.json dependencies were populated during initial generation and have drifted

Proposed Solution

Make architecture.json authoritative, but validate against includes

Why architecture.json: It's the explicit, declarative source. <include> tags mix actual dependencies with context docs, preambles, and other non-dependency includes. architecture.json is cleaner.

But validate: Add a validation step that warns when architecture.json dependencies don't match the modules referenced via <include> tags. This catches drift without making <include> tags authoritative.
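One possible shape for that validation step, as a sketch. The function name `cross_validate` and the input format (module name mapped to a set of dependency names, already normalized to the same naming scheme on both sides) are assumptions, not an existing PDD API.

```python
from typing import Dict, List, Set


def cross_validate(
    arch_deps: Dict[str, Set[str]], include_deps: Dict[str, Set[str]]
) -> List[str]:
    """Return human-readable warnings for every mismatch between the two graphs."""
    warnings: List[str] = []
    for module in sorted(set(arch_deps) | set(include_deps)):
        declared = arch_deps.get(module, set())
        included = include_deps.get(module, set())
        # Declared in architecture.json, but the LLM never sees the interface.
        for dep in sorted(declared - included):
            warnings.append(
                f"{module}: '{dep}' is in architecture.json but has no matching <include>"
            )
        # Visible to the LLM, but invisible to the architecture layer.
        for dep in sorted(included - declared):
            warnings.append(
                f"{module}: '{dep}' is included in the prompt but missing from architecture.json"
            )
    return warnings


# Hypothetical data: A declares B and C, but its prompt includes B and D.
mismatches = cross_validate({"A": {"B", "C"}}, {"A": {"B", "D"}})
for line in mismatches:
    print(line)
```

The function only reports. architecture.json stays authoritative, and the warnings are a prompt for a human (or auto-deps) to reconcile the two.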

Specific fixes

1. Fix silent dependency loss in build_dep_graph_from_architecture():

# Instead of silently dropping:
if dep_basename and dep_basename not in target_set:
    warnings.append(f"{basename} depends on {dep_basename} (not in sync set)")

2. Add orphan detection: At architecture.json load time, validate that all dependencies entries reference modules that exist in the architecture. Warn on orphans.

3. Add cross-validation: Add a new function that compares architecture.json dependencies against <include> tags in prompts and reports any mismatches.

4. Make auto-deps update architecture.json: When auto-deps adds an <include> for a module's example file, also add the corresponding entry to architecture.json dependencies (if architecture.json exists).

5. Deprecate <pdd-dependency> as operational: Keep for documentation, but don't use for dependency graph construction. architecture_sync.py should stop clearing architecture.json dependencies based on missing <pdd-dependency> tags.
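Fixes 1 and 2 could look roughly like the following. This is a sketch under assumptions: both function names are hypothetical, and the architecture is reduced to a plain mapping from module name to dependency names rather than the real architecture.json entry format.

```python
from typing import Dict, List, Tuple


def find_orphans(arch_deps: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Fix 2 (sketch): (module, dependency) pairs whose dependency has no architecture entry."""
    known = set(arch_deps)
    return [
        (module, dep)
        for module, deps in sorted(arch_deps.items())
        for dep in deps
        if dep not in known
    ]


def build_graph_with_warnings(
    arch_deps: Dict[str, List[str]], targets: List[str]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Fix 1 (sketch): keep in-set dependencies, warn about the rest instead of dropping them."""
    target_set = set(targets)
    graph: Dict[str, List[str]] = {}
    warnings: List[str] = []
    for basename in targets:
        deps: List[str] = []
        for dep_basename in arch_deps.get(basename, []):
            if not dep_basename or dep_basename == basename:
                continue  # empty names and self-references are ignored, as before
            if dep_basename in target_set:
                deps.append(dep_basename)
            else:
                warnings.append(f"{basename} depends on {dep_basename} (not in sync set)")
        graph[basename] = deps
    return graph, warnings


# Hypothetical architecture: B references a module that has no entry at all.
arch = {"A": ["B", "C"], "B": ["llm_invoke"], "C": []}
orphans = find_orphans(arch)
graph, warnings = build_graph_with_warnings(arch, ["A", "B"])
print(orphans)
print(graph)
print(warnings)
```

Whether an out-of-set dependency should only warn, or should also pull the missing module into the sync set, is a design question this sketch leaves open.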

Relationship to Other Issues

| Issue | Relationship |
| --- | --- |
| Global sync (pdd sync architecture.json) | Uses build_dep_graph_from_architecture(); affected by the silent loss bug |
| #727 (heal prompt drift) | Needs correct dependency ordering for update propagation |
| Agentic Auto Deps | Currently creates divergence; should be fixed to update architecture.json |
| Master global sync (see Parent Issue section above) | Parent tracking issue |

Acceptance Criteria

  • build_dep_graph_from_architecture() warns instead of silently dropping out-of-set dependencies
  • Orphaned dependencies in architecture.json detected and reported at load time
  • Cross-validation between architecture.json deps and prompt <include> tags
  • auto-deps updates architecture.json dependencies when adding module includes
  • architecture_sync.py does not clear architecture.json dependencies when <pdd-dependency> tags are absent
  • Adding/removing architecture.json doesn't silently change sync order for existing modules
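For the auto-deps criterion, the update could be as small as the sketch below. `record_dependency` is a hypothetical helper, the `filename` and `dependencies` keys are assumed entry fields, and passing `None` models a project with no architecture.json.

```python
from typing import List, Optional


def record_dependency(
    entries: Optional[List[dict]], module_file: str, dep_file: str
) -> bool:
    """Add dep_file to module_file's dependencies; return True if anything changed."""
    if entries is None:  # no architecture.json in this project
        return False
    for entry in entries:
        if entry.get("filename") == module_file:
            deps = entry.setdefault("dependencies", [])
            if dep_file not in deps:
                deps.append(dep_file)
                return True
            return False  # already recorded, nothing to do
    return False  # module has no entry; do not create one implicitly


# Hypothetical entry being updated after auto-deps adds an include.
entries = [{"filename": "cli_python.prompt", "dependencies": []}]
first = record_dependency(entries, "cli_python.prompt", "llm_invoke_python.prompt")
second = record_dependency(entries, "cli_python.prompt", "llm_invoke_python.prompt")
print(first, second, entries)
```

The call is idempotent, so repeated auto-deps runs do not produce duplicate entries.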
