17 changes: 13 additions & 4 deletions .flow/tasks/fn-51.1.json
Original file line number Diff line number Diff line change
@@ -1,14 +1,23 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T09:41:22.997682Z",
"created_at": "2026-01-28T22:13:48.993612Z",
"depends_on": [],
"epic": "fn-51",
"evidence": {
"commits": [
"bb87ad80cc03067f909ddb0f42935e32c78789af"
],
"prs": [],
"tests": [
"python-packages/dataing/tests/unit/core/test_codify.py"
]
},
"id": "fn-51.1",
"priority": null,
"spec_path": ".flow/tasks/fn-51.1.md",
"status": "todo",
"status": "done",
"title": "Design Test Template Schema (DataQualityTest model)",
"updated_at": "2026-01-28T22:23:22.798782Z"
"updated_at": "2026-01-31T09:46:04.031622Z"
}
14 changes: 10 additions & 4 deletions .flow/tasks/fn-51.1.md
@@ -41,9 +41,15 @@ class AssertionType(str, Enum):
- [ ] Includes `failure_description` from original incident
- [ ] Unit tests pass
## Done summary
TBD

- Created DataQualityTest Pydantic model in codify.py as abstract test representation
- Added AssertionType enum with 8 types: not_null, unique, accepted_values, in_range, row_count_change, freshness, referential_integrity, custom_sql
- Added ThresholdType enum and AssertionThreshold model for configuring thresholds (exact, percentage, range, count)
- Model captures: test_id, name, description, assertion_type, table, column, parameters, threshold, sql_expression, severity, source_investigation_id, failure_description, tags, metadata
- All models are frozen (immutable) for thread safety
- Added factory methods for each assertion type for convenient creation
- Unit tests: 29 tests passing
- Mypy check passes with no issues
## Evidence
- Commits:
- Tests:
- Commits: bb87ad80cc03067f909ddb0f42935e32c78789af
- Tests: python-packages/dataing/tests/unit/core/test_codify.py
- PRs:
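
The fn-51.1 summary describes a frozen model, an eight-member `AssertionType` enum, and per-type factory methods. A minimal sketch of that shape follows. It assumes standard-library frozen dataclasses in place of the Pydantic model the task actually uses, includes only a subset of the listed fields, and the `test_id` and `name` formats are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class AssertionType(str, Enum):
    """The eight assertion types named in the done summary."""

    NOT_NULL = "not_null"
    UNIQUE = "unique"
    ACCEPTED_VALUES = "accepted_values"
    IN_RANGE = "in_range"
    ROW_COUNT_CHANGE = "row_count_change"
    FRESHNESS = "freshness"
    REFERENTIAL_INTEGRITY = "referential_integrity"
    CUSTOM_SQL = "custom_sql"


@dataclass(frozen=True)  # stand-in for a frozen Pydantic model (assumption)
class DataQualityTest:
    """Abstract, renderer-independent description of one test."""

    test_id: str
    name: str
    assertion_type: AssertionType
    table: str
    column: str | None = None
    failure_description: str = ""
    source_investigation_id: str | None = None
    tags: tuple[str, ...] = ()

    @classmethod
    def not_null(
        cls,
        table: str,
        column: str,
        *,
        investigation_id: str | None = None,
        failure_description: str = "",
    ) -> DataQualityTest:
        """Factory for a NOT NULL assertion; the id format is hypothetical."""
        return cls(
            test_id=f"{table}.{column}.not_null",
            name=f"{column} is never null",
            assertion_type=AssertionType.NOT_NULL,
            table=table,
            column=column,
            failure_description=failure_description,
            source_investigation_id=investigation_id,
        )
```

Freezing the instances means a test object can be shared between the extractor and several renderers without any of them mutating it.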
17 changes: 13 additions & 4 deletions .flow/tasks/fn-51.2.json
Original file line number Diff line number Diff line change
@@ -1,16 +1,25 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T09:46:47.010770Z",
"created_at": "2026-01-28T22:13:49.185474Z",
"depends_on": [
"fn-51.1"
],
"epic": "fn-51",
"evidence": {
"commits": [
"1a48a9df8b0f08052e291cdf152b136087080dfb"
],
"prs": [],
"tests": [
"python-packages/dataing/tests/unit/core/test_codify.py"
]
},
"id": "fn-51.2",
"priority": null,
"spec_path": ".flow/tasks/fn-51.2.md",
"status": "todo",
"status": "done",
"title": "Build Test Extraction from Synthesis",
"updated_at": "2026-01-28T22:23:23.220688Z"
"updated_at": "2026-01-31T09:51:42.918227Z"
}
22 changes: 18 additions & 4 deletions .flow/tasks/fn-51.2.md
@@ -46,9 +46,23 @@ def extract_tests(synthesis: SynthesisResponse) -> list[DataQualityTest]:
- [ ] Confidence threshold: only generate test if synthesis confidence > 0.6
- [ ] Unit tests pass
## Done summary
TBD

- Added extract_tests_from_synthesis() function to codify.py
- Rule-based extraction maps root cause patterns to test types:
- NULL patterns → NOT_NULL test
- Unexpected values → ACCEPTED_VALUES test
- Row count issues → ROW_COUNT_CHANGE test (10% default)
- Freshness issues → FRESHNESS test (24h default)
- Duplicates → UNIQUE test
- Referential issues → REFERENTIAL_INTEGRITY test
- Unmatched → CUSTOM_SQL fallback
- Confidence threshold: only generates tests when synthesis.confidence >= 0.6
- Column extraction from root cause text via regex patterns
- Accepted values extraction from supporting_evidence
- Reference table/column extraction for FK tests
- All tests tagged with "auto-generated" for traceability
- Unit tests: 16 new tests for extraction logic and helpers
- All 45 tests passing, mypy clean
## Evidence
- Commits:
- Tests:
- Commits: 1a48a9df8b0f08052e291cdf152b136087080dfb
- Tests: python-packages/dataing/tests/unit/core/test_codify.py
- PRs:
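
The rule-based mapping and the confidence gate in the fn-51.2 summary can be sketched as below. This is an illustration only: the `Synthesis` class, the regex patterns, and the function names are assumptions, not the contents of `codify.py`. The gate uses `>= 0.6` as the done summary states.

```python
import re
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.6  # tests are generated at or above this value

# Ordered (pattern, assertion type) rules; the first match wins.
# The patterns are illustrative guesses, not the project's real ones.
_RULES = [
    (re.compile(r"\bnulls?\b|\bmissing values?\b", re.I), "not_null"),
    (re.compile(r"\bduplicat", re.I), "unique"),
    (re.compile(r"\b(unexpected|invalid|unknown) values?\b", re.I), "accepted_values"),
    (re.compile(r"\brow count\b", re.I), "row_count_change"),
    (re.compile(r"\bstale\b|\bfresh", re.I), "freshness"),
    (re.compile(r"\bforeign key\b|\borphan|\breferential\b", re.I), "referential_integrity"),
]


@dataclass(frozen=True)
class Synthesis:
    """Hypothetical stand-in for the SynthesisResponse type."""

    root_cause: str
    confidence: float


def classify_root_cause(text: str) -> str:
    """Map root cause text to an assertion type, falling back to custom SQL."""
    for pattern, assertion_type in _RULES:
        if pattern.search(text):
            return assertion_type
    return "custom_sql"


def extract_assertion_types(synthesis: Synthesis) -> list[str]:
    """Return no tests when the synthesis confidence is below the threshold."""
    if synthesis.confidence < CONFIDENCE_THRESHOLD:
        return []
    return [classify_root_cause(synthesis.root_cause)]
```

Keeping the rules in an ordered list makes the precedence explicit: a root cause that mentions both NULLs and duplicates resolves to whichever rule appears first.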
15 changes: 11 additions & 4 deletions .flow/tasks/fn-51.3.json
@@ -1,16 +1,23 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T09:53:10.400067Z",
"created_at": "2026-01-28T22:13:49.409868Z",
"depends_on": [
"fn-51.1"
],
"epic": "fn-51",
"evidence": {
"commits": [],
"prs": [],
"tests": [
"python-packages/dataing/tests/unit/renderers/test_renderers.py"
]
},
"id": "fn-51.3",
"priority": null,
"spec_path": ".flow/tasks/fn-51.3.md",
"status": "todo",
"status": "done",
"title": "Implement Test Renderers (GX, dbt, Soda, SQL)",
"updated_at": "2026-01-28T22:23:23.739357Z"
"updated_at": "2026-01-31T09:59:29.592596Z"
}
25 changes: 22 additions & 3 deletions .flow/tasks/fn-51.3.md
@@ -57,9 +57,28 @@ SELECT COUNT(*) as null_count FROM analytics.orders WHERE customer_id IS NULL;
- [ ] Renderer selection via `--format` flag
- [ ] Unit tests pass for all renderers
## Done summary
TBD

- Created renderers package at python-packages/dataing/src/dataing/renderers/
- Implemented BaseRenderer abstract class with format property and render methods
- Implemented GXRenderer for Great Expectations:
- Outputs JSON expectation suite format
- Supports render_python() for Python code output
- Maps all 8 assertion types to GX expectation types
- Implemented DbtRenderer for dbt schema.yml:
- Outputs YAML compatible with dbt's schema.yml format
- Groups tests by table/column in render_many()
- Uses dbt_expectations package for advanced tests
- Implemented SodaRenderer for SodaCL checks:
- Outputs SodaCL-compatible YAML
- Groups checks by table in render_many()
- Includes investigation ID in check names
- Implemented SQLRenderer for raw SQL assertions:
- Outputs assertion queries that return 0 rows on pass
- Includes provenance comments with investigation ID
- Added get_renderer() factory function for format selection
- Added RenderFormat enum (gx, dbt, soda, sql)
- Unit tests: 42 new tests for all renderers
- All 87 tests passing (45 codify + 42 renderers), mypy clean
## Evidence
- Commits:
- Tests:
- Tests: python-packages/dataing/tests/unit/renderers/test_renderers.py
- PRs:
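
The renderer layout in the fn-51.3 summary (an abstract base with a `format` property, one renderer per output format, and a `get_renderer()` factory) can be sketched as below. The names `BaseRenderer`, `SQLRenderer`, and `get_renderer` come from the summary; the `NotNullTest` input type, the method signatures, and the exact SQL and comment text are assumptions. Only the SQL renderer is shown.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class NotNullTest:
    """Hypothetical minimal input; the real renderers take DataQualityTest."""

    table: str
    column: str
    source_investigation_id: str


class BaseRenderer(ABC):
    """Common interface: a format name plus a render method."""

    @property
    @abstractmethod
    def format(self) -> str: ...

    @abstractmethod
    def render(self, test: NotNullTest) -> str: ...


class SQLRenderer(BaseRenderer):
    @property
    def format(self) -> str:
        return "sql"

    def render(self, test: NotNullTest) -> str:
        # The query selects violating rows, so zero rows means the test passes.
        return (
            f"-- Generated from investigation {test.source_investigation_id}\n"
            f"SELECT * FROM {test.table} WHERE {test.column} IS NULL;"
        )


# Registry keyed by format name; gx, dbt, and soda would be registered the same way.
_RENDERERS = {"sql": SQLRenderer}


def get_renderer(fmt: str) -> BaseRenderer:
    """Return a renderer instance for the requested format name."""
    try:
        return _RENDERERS[fmt]()
    except KeyError:
        raise ValueError(f"Unknown render format: {fmt!r}") from None
```

Selecting violating rows rather than returning a count is what makes "0 rows on pass" hold; a `COUNT(*)` query always returns one row.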
41 changes: 37 additions & 4 deletions .flow/tasks/fn-51.4.json
@@ -1,17 +1,50 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T15:24:04.756573Z",
"created_at": "2026-01-28T22:13:49.605276Z",
"depends_on": [
"fn-51.2",
"fn-51.3"
],
"epic": "fn-51",
"evidence": {
"checks": {
"eslint": "passed",
"mypy": "passed",
"ruff": "passed",
"tsc": "passed"
},
"features": {
"backend": [
"POST /investigations/{id}/codify endpoint",
"Confidence validation (>= 60%)",
"Test extraction from synthesis",
"Multi-format rendering"
],
"frontend": [
"CodifyWidget with modal dialog",
"Format selector (SQL, dbt, GX, Soda)",
"Syntax-highlighted code display",
"Copy to clipboard button",
"Download as file button",
"Test summary badges"
]
},
"files_created": [
"frontend/app/src/features/investigation/components/codify-widget.tsx"
],
"files_modified": [
"python-packages/dataing/src/dataing/entrypoints/api/routes/investigations.py",
"frontend/app/src/features/investigation/components/index.ts",
"frontend/app/src/lib/api/investigations.ts",
"frontend/app/src/features/investigation/InvestigationDetail.tsx"
]
},
"id": "fn-51.4",
"priority": null,
"spec_path": ".flow/tasks/fn-51.4.md",
"status": "todo",
"status": "done",
"title": "Build Codify Test Widget Button",
"updated_at": "2026-01-28T22:23:24.111622Z"
"updated_at": "2026-01-31T15:28:48.965639Z"
}
14 changes: 12 additions & 2 deletions .flow/tasks/fn-51.4.md
@@ -37,8 +37,18 @@ Add "Codify Test" button to investigation results that generates and displays th
- [ ] Show all extracted tests if multiple
- [ ] Frontend tests pass
## Done summary
TBD

- Added CodifyWidget component to frontend (CodifyModal + button)
- Format selector with 4 options: SQL, dbt, Great Expectations, Soda
- Syntax-highlighted code display using react-syntax-highlighter
- Copy to clipboard and download functionality
- Shows test summary badges after generation
- Added codify API endpoint (POST /investigations/{id}/codify)
- Endpoint validates confidence >= 60% before generating tests
- Integrated with Investigation renderers and codify extraction
- Added useCodifyInvestigation hook to frontend API client
- Button only shows for completed investigations with high confidence
- Frontend linting/type checking pass
- Backend ruff/mypy pass
## Evidence
- Commits:
- Tests:
36 changes: 32 additions & 4 deletions .flow/tasks/fn-51.5.json
@@ -1,16 +1,44 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T10:00:17.472054Z",
"created_at": "2026-01-28T22:13:49.789809Z",
"depends_on": [
"fn-51.3"
],
"epic": "fn-51",
"evidence": {
"features": [
"dataing codify <investigation_id> command",
"--format option (gx, dbt, soda, sql)",
"--output option for file output",
"--append option for merging with existing files",
"Intelligent merging for dbt schema.yml and GX suites",
"Error handling for missing synthesis or low confidence"
],
"files_created": [
"python-packages/dataing-cli/src/dataing_cli/commands/codify.py",
"python-packages/dataing-cli/tests/test_commands_codify.py"
],
"files_modified": [
"python-packages/dataing-cli/src/dataing_cli/main.py",
"python-packages/dataing-cli/pyproject.toml"
],
"precommit": {
"mypy": "passed",
"ruff": "passed"
},
"tests": {
"command": "uv run pytest python-packages/dataing-cli/tests/test_commands_codify.py -v",
"failed": 0,
"passed": 11,
"skipped": 1
}
},
"id": "fn-51.5",
"priority": null,
"spec_path": ".flow/tasks/fn-51.5.md",
"status": "todo",
"status": "done",
"title": "Add CLI Test Generation Command (dataing codify)",
"updated_at": "2026-01-28T22:23:24.514588Z"
"updated_at": "2026-01-31T15:23:04.372798Z"
}
16 changes: 13 additions & 3 deletions .flow/tasks/fn-51.5.md
@@ -31,9 +31,19 @@ SELECT COUNT(*) as null_count...
- [ ] Exit code 0 if tests generated, 1 if no testable assertions found
- [ ] CLI tests pass
## Done summary
TBD

- Added `dataing codify` CLI command to dataing-cli package
- Command extracts tests from investigation synthesis and renders to multiple formats
- Supports --format option: gx, dbt, soda, sql (default: sql)
- Supports --output option to write to file (default: stdout)
- Supports --append option to merge with existing files (dbt schema.yml, GX suite)
- Intelligent merging for dbt and GX formats when appending
- Uses dataing.renderers for test rendering (requires dataing package)
- Handles synthesis parsing with required fields (estimated_onset, affected_scope)
- Error handling for missing synthesis or low confidence investigations
- Exit codes: 0 success, 1 no tests found/error
- Unit tests: 11 passing, 1 skipped (typer test quirk)
- Pre-commit clean (ruff, mypy)
## Evidence
- Commits:
- Tests:
- Tests: passed, skipped, failed, command
- PRs:
41 changes: 37 additions & 4 deletions .flow/tasks/fn-51.6.json
@@ -1,16 +1,49 @@
{
"assignee": null,
"assignee": "bordumbb@gmail.com",
"claim_note": "",
"claimed_at": null,
"claimed_at": "2026-01-31T15:29:27.703195Z",
"created_at": "2026-01-28T22:13:49.984424Z",
"depends_on": [
"fn-51.5"
],
"epic": "fn-51",
"evidence": {
"api_endpoints": [
"GET /investigations/tests/stats",
"GET /investigations/tests/catches",
"POST /investigations/tests/adopt",
"POST /investigations/tests/run"
],
"checks": {
"mypy": "passed",
"ruff": "passed"
},
"files_created": [
"python-packages/dataing/src/dataing/services/test_tracking.py",
"python-packages/dataing/migrations/033_test_tracking.sql",
"python-packages/dataing/tests/unit/services/test_test_tracking.py"
],
"files_modified": [
"python-packages/dataing/src/dataing/entrypoints/api/routes/investigations.py"
],
"metrics_tracked": [
"tests_generated",
"tests_adopted",
"tests_run",
"issues_caught",
"adoption_rate",
"effectiveness_rate"
],
"tests": {
"command": "uv run pytest python-packages/dataing/tests/unit/services/test_test_tracking.py -v",
"failed": 0,
"passed": 11
}
},
"id": "fn-51.6",
"priority": null,
"spec_path": ".flow/tasks/fn-51.6.md",
"status": "todo",
"status": "done",
"title": "Track Test Adoption and Effectiveness",
"updated_at": "2026-01-28T22:23:24.894347Z"
"updated_at": "2026-01-31T15:33:06.535263Z"
}
16 changes: 13 additions & 3 deletions .flow/tasks/fn-51.6.md
@@ -19,9 +19,19 @@ Track which generated tests are adopted and whether they catch future issues.
- [ ] Metric: "Test effectiveness rate" (failures caught / tests generated)
- [ ] Unit tests pass
## Done summary
TBD

- Created TestTrackingService for measuring codify effectiveness
- Tracks: tests generated, adopted, run count, failure count
- Metrics: adoption_rate, effectiveness_rate
- Database migration 033_test_tracking.sql with generated_tests and test_runs tables
- Updated codify endpoint to record test generation events
- Added API endpoints:
- GET /investigations/tests/stats - test tracking statistics
- GET /investigations/tests/catches - recent test failures caught
- POST /investigations/tests/adopt - mark test as adopted
- POST /investigations/tests/run - record test run result
- 11 unit tests passing
- Ruff/mypy checks pass
## Evidence
- Commits:
- Tests:
- Tests: passed, failed, command
- PRs:
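
The two rates tracked in fn-51.6 can be worked through as below. The effectiveness formula (failures caught / tests generated) comes from the acceptance criteria; defining the adoption rate as adopted / generated is an assumption, as are the `TrackingStats` class name and the choice to return 0.0 when no tests have been generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingStats:
    """Hypothetical container for the counters listed under metrics_tracked."""

    tests_generated: int
    tests_adopted: int
    tests_run: int
    issues_caught: int

    @property
    def adoption_rate(self) -> float:
        # Assumed definition: share of generated tests that were adopted.
        if self.tests_generated == 0:
            return 0.0
        return self.tests_adopted / self.tests_generated

    @property
    def effectiveness_rate(self) -> float:
        # From the acceptance criteria: failures caught / tests generated.
        if self.tests_generated == 0:
            return 0.0
        return self.issues_caught / self.tests_generated
```

For example, 20 generated tests with 5 adopted and 2 issues caught give an adoption rate of 0.25 and an effectiveness rate of 0.1.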