
feat(sync): pre-compile dbt docs for repo sources to improve centralized docs context #213

Open
Flamki wants to merge 5 commits into getnao:main from Flamki:feat/settings-version-number

Conversation

Flamki (Contributor) commented Feb 16, 2026

Summary

Adds optional dbt docs pre-compilation during nao sync for repository sources, so the agent can use compiled dbt artifacts (catalog.json, manifest.json) when projects use centralized docs references in YAML.

Closes #190.

Changes

• Added repo config options:
• compile_dbt_docs: bool (default false)
• dbt_profiles_dir: Optional[str]

• Updated repository sync flow:
• after clone/pull, when enabled, run:
dbt docs generate --project-dir [--profiles-dir ...]

• Added best-effort behavior:
• if the repo is not a dbt project, dbt is not installed, or generation fails, sync continues and logs a warning

• Added unit tests for:
• skip when disabled
• skip when non-dbt repo
• skip when dbt binary missing
• execute command path with profiles dir
• integration into repo sync success flow

• Updated docs:
• README.md (usage + config example + behavior notes)
• README.md (“Recent Updates” note)
• nao_config.yaml enabling this for example dbt repo
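
A minimal sketch of how the options and skip conditions listed above could fit together. `RepoSettings`, `skip_reason`, and `build_dbt_docs_command` are hypothetical names used for illustration and are not the code in this PR; passing the clone path to `--project-dir` is also an assumption. Only the two config fields, the `dbt_project.yml` check, and the `dbt docs generate` flags are taken from this conversation.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class RepoSettings:
    """Hypothetical stand-in for the repo config; only the two dbt fields come from the PR."""

    name: str
    compile_dbt_docs: bool = False  # opt-in, so existing configs behave as before
    dbt_profiles_dir: Optional[str] = None


def skip_reason(repo: RepoSettings, repo_path: Path) -> Optional[str]:
    """Return why pre-compilation should be skipped, or None if it can run."""
    if not repo.compile_dbt_docs:
        return "disabled"
    if not (repo_path / "dbt_project.yml").exists():
        return "not a dbt project"
    if shutil.which("dbt") is None:
        return "dbt binary not found"
    return None


def build_dbt_docs_command(repo: RepoSettings, repo_path: Path) -> List[str]:
    """Build the `dbt docs generate` invocation for a cloned repo."""
    # Assumption: the cloned repo path is what gets passed to --project-dir.
    cmd = ["dbt", "docs", "generate", "--project-dir", str(repo_path)]
    if repo.dbt_profiles_dir:
        cmd.extend(["--profiles-dir", repo.dbt_profiles_dir])
    return cmd
```

Keeping the skip decision separate from the command construction makes each of the listed test cases (disabled, non-dbt repo, missing binary, profiles dir) checkable without running dbt.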

Why

Raw dbt YAML with centralized doc() references can be hard for the agent to interpret directly. Compiled docs artifacts provide richer, resolved metadata.

Backward compatibility

• No behavior change unless compile_dbt_docs: true is explicitly set.
• Existing configs continue to work unchanged.

Validation

• test_repository_provider.py
• Result: 17 passed

Bl3f (Contributor) commented Feb 16, 2026

Mentioned this on #209 because both are related and we should find the best combination of both to make it understandable for end users.

Flamki force-pushed the feat/settings-version-number branch from 6dc041f to 8994347 on February 17, 2026 10:24
Flamki (Contributor, Author) commented Feb 17, 2026

Thanks for the feedback and for linking #209.

I cleaned up #213 and force-pushed a focused branch:

  • removed unrelated history
  • kept only the dbt compiled-docs sync change

Scope of #213 remains intentionally minimal:

  • optional compile_dbt_docs repo setting
  • runs dbt docs generate during repo sync
  • exposes compiled artifacts (manifest.json, catalog.json) for centralized docs use cases
  • best-effort behavior (non-blocking if dbt is missing/fails)

I see #213 as complementary to #209, not competing.

Happy to align on whether to keep both paths, pick one as default, or gate both behind config flags.

cubic-dev-ai bot (Contributor) left a comment

1 issue found across 13 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="cli/nao_core/commands/sync/providers/repositories/provider.py">

<violation number="1" location="cli/nao_core/commands/sync/providers/repositories/provider.py:114">
P1: Missing exception handling contradicts the "best-effort" contract. Unlike `clone_or_pull_repo` which wraps its work in `try/except`, this function lets exceptions from `subprocess.run` (e.g., `OSError`, `PermissionError`) propagate up, which would crash the sync loop and skip all remaining repositories. Wrap the body in a try/except to match the sibling function's pattern.</violation>
</file>
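
One way the suggested wrapping could look, as a sketch only: `run_best_effort` and its `warn` callback are hypothetical names, and the actual provider function in this PR may be structured differently.

```python
import subprocess
from typing import Callable, List


def run_best_effort(cmd: List[str], warn: Callable[[str], None] = print) -> bool:
    """Run cmd; return True on success, False (after a warning) on any failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except OSError as exc:
        # OSError covers FileNotFoundError and PermissionError, so a missing or
        # non-executable binary becomes a warning instead of aborting the sync loop.
        warn(f"could not start {cmd[0]}: {exc}")
        return False
    if result.returncode != 0:
        warn(f"{cmd[0]} exited with code {result.returncode}")
        return False
    return True
```

Returning a boolean instead of raising lets the caller move on to the next repository regardless of the outcome.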

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Bl3f (Contributor) commented Feb 17, 2026

Be careful, you added the Trino code to this PR.

cubic-dev-ai bot (Contributor) left a comment

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="cli/nao_core/commands/sync/providers/repositories/provider.py">

<violation number="1" location="cli/nao_core/commands/sync/providers/repositories/provider.py:115">
P1: `subprocess.run` for `dbt docs generate` has no `timeout`, which can hang the sync process indefinitely if the database is unreachable or slow. Consider adding a reasonable timeout (e.g., 300 seconds) to keep this best-effort and non-blocking.</violation>
</file>
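
A sketch of the timeout handling described here, assuming the 300-second figure from the comment as a default. `run_with_timeout` is a hypothetical helper name, not a function from this PR.

```python
import subprocess
from typing import List


def run_with_timeout(cmd: List[str], timeout_seconds: float = 300) -> bool:
    """Return True if cmd finishes with exit code 0 within the timeout."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=False, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising TimeoutExpired,
        # so the caller only needs to report the failure and continue.
        return False
    return result.returncode == 0
```

Catching `subprocess.TimeoutExpired` matters here: passing `timeout=` without handling the exception would swap an indefinite hang for an unhandled error.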


Bl3f (Contributor) commented Feb 18, 2026

Hey @Flamki, you added the Trino code from your other PR to this one. Could you remove it?

Bl3f (Contributor) left a comment

Regarding the dbt part, it LGTM, but please remove the Trino code if you can.

from .postgres import PostgresConfig
from .redshift import RedshiftConfig
from .snowflake import SnowflakeConfig
from .trino import TrinoConfig

To be removed from this PR.

dbt_project_file = repo_path / "dbt_project.yml"

if not dbt_project_file.exists():
    console.print(

Use the ui.py methods to display stuff in the console.

cmd.extend(["--profiles-dir", repo.dbt_profiles_dir])

console.print(f" [dim]Pre-compiling dbt docs for[/dim] {repo.name}")
result = subprocess.run(cmd, capture_output=True, text=True, check=False, timeout=300)

What worries me most is that users will have to forward all the necessary dbt environment variables for the docs to compile correctly (especially if we want this in a "production" context). It adds some complexity, but the feature is interesting, so we can add it. It just might be hard for users to get working.
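
To make the environment-variable concern concrete, here is a sketch of a pre-flight check that could run before invoking dbt. `prepare_dbt_env` and the variable names in the usage note are hypothetical; which variables a project needs depends on its own `profiles.yml`.

```python
import os
from typing import Dict, Iterable, List, Mapping, Tuple


def prepare_dbt_env(
    required: Iterable[str], base_env: Mapping[str, str] = os.environ
) -> Tuple[Dict[str, str], List[str]]:
    """Copy the environment for the dbt subprocess and list required variables that are unset."""
    env = dict(base_env)
    # Treat empty strings as missing, since an empty credential is rarely usable.
    missing = [name for name in required if not env.get(name)]
    return env, missing
```

A caller could warn and skip compilation when `missing` is non-empty (for example after `prepare_dbt_env(["DBT_USER", "DBT_PASSWORD"])`), and otherwise pass `env=env` to `subprocess.run`.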

## Recent Updates

- `nao sync` now supports optional dbt docs pre-compilation for repository sources (`compile_dbt_docs: true`), which
generates compiled artifacts such as `target/catalog.json` and `target/manifest.json` for richer agent context.

One last concern is the size of manifest.json for large projects, which might make it unusable and pollute the context window. Still, since it also generates the compiled version of each query in the folder structure, it is interesting.
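
One possible mitigation for the manifest size, sketched under assumptions: rather than handing the whole file to the agent, keep only model names and descriptions. `summarize_manifest` and `load_manifest_summary` are hypothetical helpers, and the layout assumed below (`nodes` keyed by unique ID, each node with `resource_type`, `name`, `description`, and `columns`) should be checked against the manifest schema of the dbt version in use.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def summarize_manifest(manifest: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only model names, descriptions, and column descriptions."""
    summary = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        columns = {
            name: col.get("description", "")
            for name, col in node.get("columns", {}).items()
        }
        summary.append(
            {
                "name": node.get("name"),
                "description": node.get("description", ""),
                "columns": columns,
            }
        )
    return summary


def load_manifest_summary(path: Path) -> List[Dict[str, Any]]:
    """Read a manifest.json file from disk and return its compact summary."""
    return summarize_manifest(json.loads(path.read_text(encoding="utf-8")))
```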

Flamki force-pushed the feat/settings-version-number branch from 611c65b to 2e894a6 on February 18, 2026 11:27
Flamki (Contributor, Author) commented Feb 18, 2026

Thanks for the review — addressed.

Updates made in this PR:

  • Removed the unrelated Trino changes from this branch (scope is now dbt pre-compile only).
  • Switched repository provider output to use ui.py helpers in cli/nao_core/commands/sync/providers/repositories/provider.py.
  • Kept dbt docs generation best-effort and non-blocking.
  • Added timeout for dbt docs generate to avoid hangs.
  • Ran formatting fix required by Ruff.

Current status:

  • Diff now contains only 6 files related to this feature.
  • No conflicts with base branch.
  • Checks are passing:
    • CLI Lint / Ruff Checks
    • cubic · AI code reviewer

Validation:

  • PYTHONPATH=cli python -m pytest cli/tests/nao_core/commands/sync/test_repository_provider.py -q → 17 passed

Happy to adjust any wording in docs around env var complexity / manifest size if you want before merge.

Development

Successfully merging this pull request may close these issues.

Pre-compile dbt docs when centralized docs

2 participants