Fix deep-inspection analysis enrichment pipeline and article quality #1539
…ocol to workflows and guides Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/3921d249-94f2-44df-b1a0-c42bf9e725ad Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
… from per-doc analysis Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/3921d249-94f2-44df-b1a0-c42bf9e725ad Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/3921d249-94f2-44df-b1a0-c42bf9e725ad Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Replace empty skeleton files (0 documents analyzed) with full multi-framework political intelligence analysis covering 3 major government propositions (HD03214, HD03228, HD03235). Each file includes color-coded Mermaid diagrams, evidence tables, confidence labels, and template metadata headers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…altime-monitor, news-evening-analysis Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/977300d8-14a1-474b-8775-3c47bc1bc48d Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/977300d8-14a1-474b-8775-3c47bc1bc48d Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…ticles with SWOT/risk charts Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/542d8dbf-ed87-4448-95a1-2cc8e99b6922 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…Sync Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/542d8dbf-ed87-4448-95a1-2cc8e99b6922 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🏷️ Automatic Labeling Summary
This PR has been automatically labeled based on the files changed and PR metadata.
Applied Labels: documentation, html-css, workflow, translation, ci-cd, refactor, size-xl, news, agentic-workflow
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
Pull request overview
Fixes missing enrichment in the news analysis pipeline by enabling analysis discovery in per-type subdirectories and by allowing deep-inspection to include targeted documents by dok_id, resulting in higher-quality generated articles and enriched batch analysis artifacts.
Changes:
- Update analysis reading to fall back to scanning immediate subdirectories when root-level daily analysis files are absent.
- Extend pre-article analysis CLI to accept `--document-ids` to bypass strict date filtering (deep-inspection lookback).
- Enrich deep-inspection + propositions analysis outputs, update EN/SV deep-inspection articles, and tighten workflow guidance/quality gates around empty batch files.
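As an illustration of the `--document-ids` handling described above, here is a minimal sketch. It assumes the flag takes a comma-separated list as the next argument and that IDs are plain alphanumeric tokens (like `HD03214`). Both are assumptions, and `parseDocumentIds` is a hypothetical helper, not the function in `scripts/pre-article-analysis.ts`.

```typescript
// Hypothetical helper: the real CLI parsing in the script may differ.
function parseDocumentIds(argv: string[]): string[] {
  const flagIndex = argv.indexOf("--document-ids");
  if (flagIndex === -1) return [];
  // Assumed format: `--document-ids ID1,ID2,...` (value in the next argument).
  const raw = argv[flagIndex + 1] ?? "";
  const ids = raw
    .split(",")
    .map((id) => id.trim())
    // Keep only plain alphanumeric tokens so path fragments or shell
    // metacharacters never reach a fetch call or a file path.
    .filter((id) => /^[A-Za-z0-9]+$/.test(id));
  // Drop duplicates while preserving the order given on the command line.
  return ids.filter((id, index) => ids.indexOf(id) === index);
}

console.log(parseDocumentIds(["--document-ids", "HD03214, HD03228,../etc,HD03214"]));
```

Rejecting anything outside the allowlist, rather than trying to escape it, keeps the sanitization rule easy to audit.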
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scripts/pre-article-analysis.ts | Adds --document-ids parsing and uses it to include/fetch targeted docs outside the date filter. |
| scripts/analysis-reader.ts | Adds subdirectory fallback for reading daily analysis files and for finding the latest analysis date. |
| news/2026-04-03-deep-inspection-cyber-security-cyber-threats-threat-land-sv.html | Updates metadata and injects SWOT + risk visualizations (canvas + tables) in SV article. |
| news/2026-04-03-deep-inspection-cyber-security-cyber-threats-threat-land-en.html | Updates metadata and injects SWOT + risk visualizations (canvas + tables) in EN article. |
| analysis/methodologies/ai-driven-analysis-guide.md | Documents the deep-inspection batch enrichment protocol (v4.1) and updates audit findings. |
| analysis/daily/2026-04-03/propositions/threat-analysis.md | Replaces skeleton batch threat analysis with enriched, template-aligned content. |
| analysis/daily/2026-04-03/propositions/synthesis-summary.md | Replaces skeleton synthesis with enriched intelligence dashboard + tables. |
| analysis/daily/2026-04-03/propositions/swot-analysis.md | Replaces skeleton SWOT with enriched quadrant mapping + evidence tables. |
| analysis/daily/2026-04-03/propositions/stakeholder-perspectives.md | Replaces skeleton stakeholder file with enriched 6-lens content + tables. |
| analysis/daily/2026-04-03/propositions/significance-scoring.md | Replaces skeleton scoring with enriched 5-dimension scoring + thresholds. |
| analysis/daily/2026-04-03/propositions/risk-assessment.md | Replaces skeleton risk assessment with enriched register + Mermaid risk map. |
| analysis/daily/2026-04-03/propositions/data-download-manifest.md | Rewrites manifest into a richer provenance/quality manifest aligned to templates. |
| analysis/daily/2026-04-03/propositions/cross-reference-map.md | Replaces skeleton cross-reference map with enriched relationship network + tables. |
| analysis/daily/2026-04-03/propositions/classification-results.md | Replaces skeleton classification results with enriched batch classification + rationale. |
| analysis/daily/2026-04-03/deep-inspection/threat-analysis.md | Enriches deep-inspection threat batch file from per-document analysis. |
| analysis/daily/2026-04-03/deep-inspection/synthesis-summary.md | Enriches deep-inspection synthesis batch file (dashboard, SWOT/risk summaries, inventory). |
| analysis/daily/2026-04-03/deep-inspection/swot-analysis.md | Enriches deep-inspection SWOT batch file with quadrant diagram + tables. |
| analysis/daily/2026-04-03/deep-inspection/stakeholder-perspectives.md | Enriches deep-inspection stakeholder batch file with 6-lens tables + Mermaid. |
| analysis/daily/2026-04-03/deep-inspection/significance-scoring.md | Enriches deep-inspection significance scoring with ranked table + scoring model. |
| analysis/daily/2026-04-03/deep-inspection/risk-assessment.md | Enriches deep-inspection risk assessment with risk matrix, register, trajectory, anomalies. |
| analysis/daily/2026-04-03/deep-inspection/data-download-manifest.md | Enriches deep-inspection manifest to reflect ID-targeting vs date filtering. |
| analysis/daily/2026-04-03/deep-inspection/cross-reference-map.md | Enriches deep-inspection cross-reference mapping with relationships + forecasts. |
| analysis/daily/2026-04-03/deep-inspection/classification-results.md | Enriches deep-inspection classification results with decision tree + mapping. |
| .github/workflows/SHARED_PROMPT_PATTERNS.md | Adds “Check 8” intended to fail the quality gate when batch files are empty despite per-doc analysis. |
| .github/workflows/news-week-ahead.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-realtime-monitor.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-propositions.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-motions.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-interpellations.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-evening-analysis.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-committee-reports.md | Adds mandatory batch enrichment instructions to the workflow guide. |
| .github/workflows/news-article-generator.md | Passes sanitized --document-ids for deep-inspection and documents batch enrichment requirements. |
Comments suppressed due to low confidence (1)
scripts/pre-article-analysis.ts:636
`excludedDocsCount` is computed after potentially pushing individually fetched documents into `allDocs`. If any missing IDs are fetched, `allDocs.length` can exceed `flattenedDocs.length`, making `excludedDocsCount` negative and the log message incorrect. Compute the excluded count before the individual fetch loop, or track bulk-selected vs id-fetched documents separately for accurate reporting.
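One way to read this suggestion is the sketch below. The names `flattenedDocs`, `allDocs`, and `excludedDocsCount` come from the comment above. The `Doc` shape, the `summarizeSelection` helper, and the placeholder IDs are assumptions made for illustration, not the script's actual code.

```typescript
// Assumed minimal document shape; the real type in the script is not shown here.
interface Doc {
  dok_id: string;
}

interface SelectionSummary {
  allDocs: Doc[];
  excludedDocsCount: number;
  idFetchedCount: number;
}

// Hypothetical helper that keeps bulk-selected and ID-fetched documents apart.
function summarizeSelection(flattenedDocs: Doc[], bulkSelected: Doc[], idFetched: Doc[]): SelectionSummary {
  // Count exclusions against the bulk selection only, before ID-fetched
  // documents are appended, so the result can never go negative.
  const excludedDocsCount = flattenedDocs.length - bulkSelected.length;
  const allDocs = [...bulkSelected, ...idFetched];
  return { allDocs, excludedDocsCount, idFetchedCount: idFetched.length };
}

// Placeholder IDs: three bulk candidates, two pass the date filter, two more fetched by ID.
const flattened: Doc[] = [{ dok_id: "A1" }, { dok_id: "A2" }, { dok_id: "A3" }];
const summary = summarizeSelection(flattened, flattened.slice(0, 2), [{ dok_id: "B1" }, { dok_id: "B2" }]);
console.log(summary.excludedDocsCount, summary.allDocs.length);
```

With these inputs, subtracting `allDocs.length` from `flattenedDocs.length` after the fetch would give 3 - 4 = -1, which is the negative count the comment warns about.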
    // Scan immediate subdirectories for the file (e.g., deep-inspection/, propositions/)
    const dateDir = join(resolvedBase, date);
    try {
      const entries = await readdir(dateDir, { withFileTypes: true });
      for (const entry of entries) {
        if (!entry.isDirectory()) continue;
        const subPath = join(dateDir, entry.name, filename);
        try {
          return await readFile(subPath, 'utf-8');
        } catch {
          // Not in this subdirectory — continue
        }
      }
Subdirectory fallback in readAnalysisFile returns the first matching file found in readdir() order. If multiple analysis subfolders exist for the same date (e.g., deep-inspection/ + propositions/), this can select an arbitrary doc-type’s analysis and enrich articles with the wrong metadata. Consider making the selection deterministic (e.g., sort entries) and/or adding an explicit subdir/docType parameter so callers can request the correct analysis scope.
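A minimal sketch of the deterministic selection suggested above, assuming the directory names have already been collected from `readdir()`. `orderSubdirectories` and its `preferred` parameter are hypothetical names used only for illustration.

```typescript
// Hypothetical helper: decides which subdirectories to probe, and in what order.
function orderSubdirectories(names: string[], preferred?: string): string[] {
  // Sort by plain code-unit comparison so the order does not depend on
  // filesystem enumeration order or on the host locale.
  const sorted = [...names].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  if (preferred === undefined) return sorted;
  // When the caller names a scope, only that subdirectory is eligible.
  return sorted.filter((name) => name === preferred);
}

console.log(orderSubdirectories(["propositions", "deep-inspection"]));
console.log(orderSubdirectories(["propositions", "deep-inspection"], "propositions"));
```

Sorting alone makes the fallback repeatable, but it still picks a winner by name. Passing an explicit scope is what prevents an article from being enriched with another doc-type's analysis.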
     /**
      * Attempt to read a markdown file from the analysis directory.
      * First checks `analysis/daily/{date}/{filename}`, then scans immediate
      * subdirectories (e.g., `deep-inspection/`, `propositions/`) for the file.
      * Returns `null` if the file does not exist, cannot be read, or `date` is
      * not a valid YYYY-MM-DD string (guards against path traversal).
      */
     async function readAnalysisFile(date: string, filename: string, basePath?: string): Promise<string | null> {
       if (!DATE_FORMAT_RE.test(date)) return null;
       const resolvedBase = basePath ?? ANALYSIS_BASE_PATH;
    -  const filePath = join(resolvedBase, date, filename);
    +  const rootPath = join(resolvedBase, date, filename);
       try {
    -    return await readFile(filePath, 'utf-8');
    +    return await readFile(rootPath, 'utf-8');
       } catch {
    -    return null;
    +    // Root-level file not found — scan subdirectories
       }

    +  // Scan immediate subdirectories for the file (e.g., deep-inspection/, propositions/)
    +  const dateDir = join(resolvedBase, date);
    +  try {
    +    const entries = await readdir(dateDir, { withFileTypes: true });
    +    for (const entry of entries) {
    +      if (!entry.isDirectory()) continue;
    +      const subPath = join(dateDir, entry.name, filename);
    +      try {
    +        return await readFile(subPath, 'utf-8');
    +      } catch {
    +        // Not in this subdirectory — continue
    +      }
    +    }
    +  } catch {
    +    // Date directory doesn't exist or can't be read
    +  }
    +  return null;
The new subdirectory fallback behavior in readAnalysisFile/findLatestAnalysisDate is not covered by the existing tests/analysis-reader.test.ts integration tests (they only write root-level files). Add tests that place analysis files under a subdirectory (e.g., ${base}/${date}/propositions/…) and assert readDailyAnalysis and findLatestAnalysisDate still find and parse them (and behave deterministically if multiple subdirs exist).
    BATCH_FILE="$ANALYSIS_DIR/$bf"
    [ ! -f "$BATCH_FILE" ] && continue
    ZERO_DOCS=$(grep -cE "(Documents Analyzed|documents analyzed|Analyzed \*\*0|Scored \*\*0|for \*\*0|to \*\*0|across 0 documents|for 0 political)" "$BATCH_FILE" 2>/dev/null) || true
    FILE_SIZE=$(wc -c < "$BATCH_FILE" 2>/dev/null) || true
    if [ "${ZERO_DOCS:-0}" -gt 0 ]; then
      echo "❌ FAIL: $bf reports '0 documents' but $PERDOC_COUNT per-doc analyses exist — MUST be enriched"
      QUALITY_PASS=false
The ZERO_DOCS grep pattern will always match any batch file that contains the string "Documents Analyzed" (regardless of the count), so this check will fail even when documents were analyzed. Tighten the regex to only match zero cases (e.g., Documents Analyzed\*\*:\s*0 / Documents Analyzed:\s*0 / Analyzed \*\*0\*\*).
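The difference between the loose and the tightened match can be sketched in TypeScript as below. The pattern covers only the three zero cases named in the comment. How batch files actually format the count is an assumption, and `reportsZeroDocuments` is a hypothetical helper rather than part of the workflow.

```typescript
// Matches "Documents Analyzed: 0", "**Documents Analyzed**: 0" and "Analyzed **0**",
// but not non-zero counts such as "Documents Analyzed: 3" or "Documents Analyzed: 10".
const ZERO_DOCS_RE = /documents analyzed(?:\*\*)?:\s*(?:\*\*)?0(?!\d)|analyzed \*\*0\*\*/i;

// Hypothetical helper: true when any line of the batch file reports a zero count.
function reportsZeroDocuments(text: string): boolean {
  return text.split("\n").some((line) => ZERO_DOCS_RE.test(line));
}

console.log(reportsZeroDocuments("**Documents Analyzed**: 0")); // zero case
console.log(reportsZeroDocuments("**Documents Analyzed**: 3")); // non-zero case
```

The `(?!\d)` lookahead keeps a count with a leading zero digit, such as `05`, from being read as zero.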
    # Per-document analysis exists — batch files MUST NOT report "0 documents analyzed"
    for bf in synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md classification-results.md significance-scoring.md stakeholder-perspectives.md; do
Check 8’s batch-file loop omits some of the daily artifacts that are mentioned elsewhere as mandatory (notably cross-reference-map.md, and potentially data-download-manifest.md). If the goal is to prevent any empty batch artifacts from slipping through, include the full set of batch files (or document why only this subset is enforced).
    - # Per-document analysis exists — batch files MUST NOT report "0 documents analyzed"
    - for bf in synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md classification-results.md significance-scoring.md stakeholder-perspectives.md; do
    + # Per-document analysis exists — all mandatory batch artifacts MUST NOT report "0 documents analyzed"
    + for bf in synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md classification-results.md significance-scoring.md stakeholder-perspectives.md cross-reference-map.md data-download-manifest.md; do
@copilot apply changes based on the comments in this thread
…rep, add tests Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/b4dcc04a-d535-44bb-8759-6a3ae23ea60e Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Applied all 4 review comments in
Documents Analyzed**:\s*0)