Merged
45 changes: 36 additions & 9 deletions .flow/specs/fn-38.md
@@ -6,27 +6,52 @@

## Overview

When an investigation identifies a code change as root cause, include direct links to the PR or commit in GitHub/GitLab/Bitbucket. Add PR metadata enrichment during git sync and URL template rendering per provider.
When an investigation identifies a code change as root cause, include direct links to the PR or commit in GitHub/GitLab/Bitbucket. **Leverage bond-agent's existing git tools** (`githunter` and `github` toolsets) for PR metadata lookup rather than building custom API clients.

## Architecture

```
bond-agent (external PyPI package)
├── bond.tools.githunter
│ ├── find_pr_discussion(repo_path, commit_hash) → PRDiscussion
│ ├── blame_line(repo_path, file_path, line_no) → BlameResult
│ └── get_file_experts(repo_path, file_path) → List[FileExpert]
└── bond.tools.github
├── github_get_pr(owner, repo, number) → PullRequest
├── github_get_commits(owner, repo, path) → List[Commit]
└── ... (other tools)

dataing (this repo)
├── migrations/ (schema for PR metadata)
├── adapters/git/url_templates.py (URL rendering)
└── output formatting (CLI, notebook, markdown, API)
```
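The integration layer between the two packages could be sketched roughly as below. This is a minimal sketch, not the contents of `pr_enrichment.py`: `PRMetadata`, `PRLookup`, and `enrich_commits` are hypothetical names, and the lookup is injected as a plain callable because the fields of bond-agent's `PRDiscussion` are not shown in this spec. In dataing the callable would presumably wrap `find_pr_discussion(repo_path, commit_hash)`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional


@dataclass
class PRMetadata:
    """Hypothetical container mirroring the PR columns named in the spec."""

    pr_number: int
    pr_title: str
    pr_author: str
    pr_merged_at: Optional[datetime]


# Hypothetical lookup signature: (repo_path, commit_hash) -> PRMetadata or None.
# The real wrapper around bond-agent and its field mapping are assumptions.
PRLookup = Callable[[str, str], Optional[PRMetadata]]


def enrich_commits(
    repo_path: str, commit_hashes: List[str], lookup: PRLookup
) -> Dict[str, Optional[PRMetadata]]:
    """Resolve each commit to PR metadata, tolerating per-commit failures."""
    results: Dict[str, Optional[PRMetadata]] = {}
    for commit_hash in commit_hashes:
        try:
            results[commit_hash] = lookup(repo_path, commit_hash)
        except Exception:
            # One failed lookup should not abort the sync; the PR columns
            # are nullable, so the row simply stays without PR metadata.
            results[commit_hash] = None
    return results
```

Injecting the lookup keeps the sync code testable without network access or a real repository, which fits the "no custom API client" constraint.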

## Scope

- Extend `code_changes` table with `pr_number`, `pr_url`, `pr_title`
- PR metadata lookup via GitHub/GitLab API during sync
- URL templates per provider for commit and PR links
- Links in CLI output, notebook output, markdown export, API response
**Use from bond-agent (DO NOT rebuild):**
- PR lookup from commit hash (`find_pr_discussion`)
- PR metadata fetching (`github_get_pr`)
- Commit history (`github_get_commits`)

**Build in dataing:**
- Database schema changes (`code_changes` table + PR columns)
- URL template rendering for GitHub/GitLab/Bitbucket
- Integration layer to call bond-agent tools during sync
- Output formatting for CLI/notebook/markdown/API
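As an illustration of the URL template rendering item, a minimal sketch follows. The function and dictionary names are hypothetical and do not come from `url_templates.py`. The templates assume the public hosted instances (github.com, gitlab.com, bitbucket.org); self-hosted instances would need a configurable base URL.

```python
from typing import Optional

# URL path shapes used by each hosted provider for commits.
COMMIT_URL_TEMPLATES = {
    "github": "https://github.com/{owner}/{repo}/commit/{sha}",
    "gitlab": "https://gitlab.com/{owner}/{repo}/-/commit/{sha}",
    "bitbucket": "https://bitbucket.org/{owner}/{repo}/commits/{sha}",
}

# URL path shapes for pull requests (GitLab calls them merge requests).
PR_URL_TEMPLATES = {
    "github": "https://github.com/{owner}/{repo}/pull/{number}",
    "gitlab": "https://gitlab.com/{owner}/{repo}/-/merge_requests/{number}",
    "bitbucket": "https://bitbucket.org/{owner}/{repo}/pull-requests/{number}",
}


def render_commit_url(provider: str, owner: str, repo: str, sha: str) -> Optional[str]:
    """Return a commit URL, or None when the provider is not recognised."""
    template = COMMIT_URL_TEMPLATES.get(provider.lower())
    if template is None:
        return None
    return template.format(owner=owner, repo=repo, sha=sha)


def render_pr_url(provider: str, owner: str, repo: str, number: int) -> Optional[str]:
    """Return a PR/MR URL, or None when the provider is not recognised."""
    template = PR_URL_TEMPLATES.get(provider.lower())
    if template is None:
        return None
    return template.format(owner=owner, repo=repo, number=number)
```

Returning `None` for an unknown provider lets output formatters fall back to showing the bare commit hash instead of a broken link.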

## Key Files

- `python-packages/dataing/migrations/` (alter code_changes table)
- `python-packages/dataing/src/dataing/models/` (update model)
- `python-packages/dataing/src/dataing/adapters/git/url_templates.py` (URL rendering)
- `python-packages/dataing/src/dataing/adapters/git/pr_enrichment.py` (bond-agent integration)
- `python-packages/dataing-cli/src/dataing_cli/display.py` (CLI output)
- `python-packages/dataing-cli/src/dataing_cli/export.py` (export output)
- `python-packages/dataing-notebook/src/dataing_notebook/rendering.py`
- `python-packages/dataing-notebook/src/dataing_notebook/rendering.py` (notebook output)

## Quick Commands

```bash
uv run pytest python-packages/dataing/tests/unit/ -v -k "git or export"
uv run pytest python-packages/dataing/tests/unit/ -v -k "git or export or pr"
```

## Acceptance
@@ -36,7 +61,9 @@ uv run pytest python-packages/dataing/tests/unit/ -v -k "git or export"
- Links formatted for web and CLI
- Support for GitHub, GitLab, Bitbucket URL formats
- Metadata includes PR number, title, author, merge date
- Uses bond-agent tools for GitHub API calls (no custom API client)
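To make the "formatted for web and CLI" criterion concrete, here is a minimal sketch of the two link styles. The function names and the exact output shapes are assumptions for illustration, not the formatting implemented in `display.py` or `export.py`.

```python
from typing import Optional


def _pr_label(pr_number: int, pr_title: Optional[str]) -> str:
    """Build a short label such as '#42 Fix null join'."""
    return f"#{pr_number} {pr_title}" if pr_title else f"#{pr_number}"


def format_pr_link_markdown(
    pr_number: int, pr_title: Optional[str], pr_url: Optional[str]
) -> str:
    """Markdown link for web, notebook, and markdown export output."""
    # Escape square brackets so a title like "[WIP] fix" cannot break the link.
    label = _pr_label(pr_number, pr_title).replace("[", "\\[").replace("]", "\\]")
    if not pr_url:
        return label
    return f"[{label}]({pr_url})"


def format_pr_link_cli(
    pr_number: int, pr_title: Optional[str], pr_url: Optional[str]
) -> str:
    """Plain-text form for terminals, with the URL printed in full."""
    label = _pr_label(pr_number, pr_title)
    if not pr_url:
        return f"PR {label}"
    return f"PR {label} <{pr_url}>"
```

Both helpers degrade to a plain label when `pr_url` is missing, which matches the nullable PR columns.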

## Dependencies

- Blocked by fn-36 (Git Integration) and fn-37 (Agent Code Context)
- Requires `bond-agent>=0.1.2` (already in pyproject.toml)
21 changes: 17 additions & 4 deletions .flow/tasks/fn-38.1.json
@@ -1,14 +1,27 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T16:05:39.377826Z",
"created_at": "2026-01-28T03:52:30.310569Z",
"depends_on": [],
"epic": "fn-38",
"evidence": {
"files_created": [
"python-packages/dataing/migrations/034_code_changes_pr_metadata.sql"
],
"files_modified": [
"python-packages/dataing/src/dataing/models/code_change.py",
"python-packages/dataing/src/dataing/core/domain_types.py"
],
"verification": {
"model_import": "OK",
"mypy": "passed"
}
},
"id": "fn-38.1",
"priority": 1,
"spec_path": ".flow/tasks/fn-38.1.md",
"status": "todo",
"status": "done",
"title": "Add migration to extend code_changes with PR metadata columns",
"updated_at": "2026-01-28T03:52:30.310795Z"
"updated_at": "2026-01-31T16:07:05.572932Z"
}
48 changes: 29 additions & 19 deletions .flow/tasks/fn-38.1.md
@@ -2,25 +2,25 @@

## Description

Extend the `code_changes` table (created by fn-36) with columns to store pull request metadata. When investigations link an anomaly to a code change, PR information provides richer context (title, author, review status) and enables deep links to the PR in GitHub/GitLab/Bitbucket. This migration adds the columns; fn-38.2 populates them during sync.
Extend the `code_changes` table (created by fn-36) with columns to store pull request metadata. When investigations link an anomaly to a code change, PR information provides richer context (title, author, review status) and enables deep links to the PR in GitHub/GitLab/Bitbucket. This migration adds the columns; fn-38.2 populates them using bond-agent tools.

### Implementation

1. Create a new migration file `python-packages/dataing/migrations/032_code_changes_pr_metadata.sql`:
1. Create a new migration file `python-packages/dataing/migrations/034_code_changes_pr_metadata.sql`:
```sql
-- PR metadata for code changes
-- Epic fn-38: PR/Commit Link in Investigation Output

ALTER TABLE code_changes
ADD COLUMN pr_number INTEGER,
ADD COLUMN pr_url TEXT,
ADD COLUMN pr_title TEXT,
ADD COLUMN pr_author TEXT,
ADD COLUMN pr_merged_at TIMESTAMPTZ,
ADD COLUMN provider TEXT; -- github, gitlab, bitbucket (denormalized from git_repositories)
ADD COLUMN IF NOT EXISTS pr_number INTEGER,
ADD COLUMN IF NOT EXISTS pr_url TEXT,
ADD COLUMN IF NOT EXISTS pr_title TEXT,
ADD COLUMN IF NOT EXISTS pr_author TEXT,
ADD COLUMN IF NOT EXISTS pr_merged_at TIMESTAMPTZ,
ADD COLUMN IF NOT EXISTS provider TEXT; -- github, gitlab, bitbucket

-- Index for looking up code changes by PR number within a repository
CREATE INDEX idx_code_changes_pr_number
CREATE INDEX IF NOT EXISTS idx_code_changes_pr_number
ON code_changes(repo_id, pr_number)
WHERE pr_number IS NOT NULL;

@@ -32,7 +32,7 @@ Extend the `code_changes` table (created by fn-36) with columns to store pull re
COMMENT ON COLUMN code_changes.provider IS 'Git provider: github, gitlab, bitbucket';
```

2. Update the SQLAlchemy model for `CodeChange` (created by fn-36) to include the new columns. The model file location depends on fn-36 but is expected at `python-packages/dataing/src/dataing/models/code_change.py`:
2. Update the SQLAlchemy model for `CodeChange` to include the new columns:
```python
pr_number: Mapped[int | None] = mapped_column(Integer, nullable=True)
pr_url: Mapped[str | None] = mapped_column(Text, nullable=True)
@@ -42,7 +42,7 @@ Extend the `code_changes` table (created by fn-36) with columns to store pull re
provider: Mapped[str | None] = mapped_column(String(20), nullable=True)
```

3. Update the `CodeChange` domain type in `python-packages/dataing/src/dataing/core/domain_types.py` (added by fn-37.1) to include PR fields:
3. Update the `CodeChange` domain type in `domain_types.py` to include PR fields:
```python
class CodeChange(BaseModel):
...
@@ -55,31 +55,41 @@ Extend the `code_changes` table (created by fn-36) with columns to store pull re
```

### Key Files
- **CREATE**: `python-packages/dataing/migrations/032_code_changes_pr_metadata.sql`
- **CREATE**: `python-packages/dataing/migrations/034_code_changes_pr_metadata.sql`
- **MODIFY**: `python-packages/dataing/src/dataing/models/code_change.py` (add PR columns)
- **MODIFY**: `python-packages/dataing/src/dataing/core/domain_types.py` (extend CodeChange)

### Verification
```bash
# Verify migration SQL is valid (syntax check)
psql -f python-packages/dataing/migrations/032_code_changes_pr_metadata.sql --dry-run

# Verify model compiles
uv run python -c "from dataing.models.code_change import CodeChange; print('OK')"
uv run mypy python-packages/dataing/src/dataing/models/code_change.py
```

## Acceptance
- [ ] Migration adds pr_number, pr_url, pr_title, pr_author, pr_merged_at, provider columns to code_changes
- [ ] All new columns are nullable (backward compatible with existing rows)
- [ ] Migration adds pr_number, pr_url, pr_title, pr_author, pr_merged_at, provider columns
- [ ] All new columns are nullable (backward compatible)
- [ ] Index created on (repo_id, pr_number) for efficient PR lookups
- [ ] SQLAlchemy model updated with matching mapped columns
- [ ] CodeChange domain type includes PR fields with None defaults
- [ ] Migration file follows existing naming convention (032_*.sql)
- [ ] Migration file follows existing naming convention

## Done summary
TBD
# fn-38.1 Done Summary

Added PR metadata columns to code_changes table for storing pull request information
enriched during git sync.

## Changes
- Created migration `034_code_changes_pr_metadata.sql` adding columns:
- pr_number, pr_url, pr_title, pr_author, pr_merged_at, provider
- Index on (repo_id, pr_number) for efficient PR lookups
- Updated SQLAlchemy CodeChange model with matching mapped columns
- Updated RelevantCodeChange domain type with PR fields

## Verification
- Model compiles successfully
- mypy passes with no errors
## Evidence
- Commits:
- Tests:
25 changes: 21 additions & 4 deletions .flow/tasks/fn-38.2.json
@@ -1,16 +1,33 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T16:07:16.261572Z",
"created_at": "2026-01-28T03:52:34.548035Z",
"depends_on": [
"fn-38.1"
],
"epic": "fn-38",
"evidence": {
"files_created": [
"python-packages/dataing/src/dataing/adapters/git/pr_enrichment.py",
"python-packages/dataing/tests/unit/adapters/git/test_pr_enrichment.py"
],
"files_modified": [
"python-packages/dataing/src/dataing/adapters/git/__init__.py",
"python-packages/dataing/src/dataing/adapters/db/code_changes.py"
],
"tests": {
"failed": 0,
"passed": 9
},
"verification": {
"mypy": "passed"
}
},
"id": "fn-38.2",
"priority": 2,
"spec_path": ".flow/tasks/fn-38.2.md",
"status": "todo",
"status": "done",
"title": "Implement PR metadata lookup during git sync (GitHub API)",
"updated_at": "2026-01-28T03:52:34.548226Z"
"updated_at": "2026-01-31T16:10:35.710667Z"
}