fix(self-reflection): robust JSON extraction with markdown fallback by anabelle · Pull Request #78 · anabelle/pixel-agent

anabelle · 2025-12-09T00:48:24Z

Summary

Fixes #28 - Self-Reflection Context Accumulation Issues

Problem

The self-reflection system suffered from context accumulation issues:

- Weak JSON Extraction: Only the first JSON block was parsed, and malformed responses failed silently
- No Markdown Fallback: Unlike the already-fixed timeline lore system, there was no fallback for markdown-formatted LLM responses
- Missing Validation: Reflection data was not structurally validated before being included in context
- Insufficient Prompt Emphasis: The JSON output requirements in the prompt were not strict enough

Solution

1. Enhanced JSON Extraction with Validation

_extractJson() now validates that parsed JSON has required fields (strengths, weaknesses, recommendations)
Falls back to markdown extraction if JSON is malformed or missing fields
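
The JSON-first flow described above can be sketched roughly as follows. This is a minimal standalone illustration, not the PR's code: the function names `extractReflection` and `hasRequiredArrays`, the first-brace-to-last-brace heuristic, and the caller-supplied `markdownFallback` parameter are all assumptions made for the example.

```javascript
// Hypothetical standalone sketch; the real logic lives in _extractJson().
const REQUIRED_FIELDS = ['strengths', 'weaknesses', 'recommendations'];

// True only when all three required fields are present as arrays.
function hasRequiredArrays(analysis) {
  if (!analysis || typeof analysis !== 'object') return false;
  return REQUIRED_FIELDS.every((field) => Array.isArray(analysis[field]));
}

// Try JSON first; hand the raw text to a markdown parser otherwise.
function extractReflection(text, markdownFallback) {
  if (typeof text !== 'string' || text.length === 0) return null;

  // Assumption: treat the span from the first '{' to the last '}' as the candidate.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start !== -1 && end > start) {
    try {
      const parsed = JSON.parse(text.slice(start, end + 1));
      if (hasRequiredArrays(parsed)) return parsed;
    } catch (err) {
      // Malformed JSON: fall through to the markdown path below.
    }
  }
  return markdownFallback(text);
}
```

Passing the fallback in as a parameter keeps the sketch self-contained; in the PR the fallback is the `_extractFieldsFromMarkdown()` method on the same class.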

2. Markdown Fallback Extraction

New _extractFieldsFromMarkdown() method parses common markdown patterns:
- Headers with bullet points (### Strengths:, - item)
- Alternative labels (What you're doing well:, Actionable changes:)
- Quoted examples (Best reply: "text")
- Unicode bullets (•), asterisks (*), hyphens (-)
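
To make the header-and-bullet patterns above concrete, here is a small line-by-line sketch. It deliberately swaps in an iterative scan rather than the regex-based approach the PR uses, and the label set, the function name `extractFieldsFromMarkdown`, and the three-field result shape are assumptions for illustration only.

```javascript
// Illustrative labels only; the PR's full label set is not shown in this thread.
const SECTION_LABELS = {
  strengths: /^(?:strengths|what you're doing well)\b/i,
  weaknesses: /^(?:weaknesses)\b/i,
  recommendations: /^(?:recommendations|actionable changes)\b/i,
};

function extractFieldsFromMarkdown(text) {
  const result = { strengths: [], weaknesses: [], recommendations: [] };
  if (typeof text !== 'string') return result;

  let current = null; // section the next bullets belong to
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') continue;

    // Bullet lines (-, *, •) attach to the most recent recognised header.
    const bullet = line.match(/^[-*•]\s+(.*)$/);
    if (bullet) {
      const item = bullet[1].trim();
      // Keep only items longer than 3 characters, as the PR does.
      if (current && item.length > 3) result[current].push(item);
      continue;
    }

    // Otherwise treat the line as a possible header: strip '#' and '**' markers.
    const heading = line.replace(/^#+\s*/, '').replace(/^\*+|\*+$/g, '').trim();
    const key = Object.keys(SECTION_LABELS).find((k) => SECTION_LABELS[k].test(heading));
    current = key || null;
  }
  return result;
}
```

Because each line is examined once with simple anchored patterns, this style of parser does not rely on nested quantifiers over the whole response.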

3. Validation Methods

_isValidReflection(): Strict validation requiring all three arrays
_hasMinimalReflectionData(): Lenient validation for markdown (requires 2+ populated fields)
getReflectionHistory() now filters using these validators
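
The difference between the strict and lenient validators might look like the sketch below. The list of fields counted by the lenient check, the meaning of "populated" (a non-empty array), and the `{ analysis }` entry shape used in `filterHistory` are assumptions; the thread only confirms the three required arrays and the 2+ threshold.

```javascript
const REQUIRED = ['strengths', 'weaknesses', 'recommendations'];
// Assumption: improvements and regressions also count toward the lenient check.
const COUNTED = [...REQUIRED, 'improvements', 'regressions'];

// Strict: every required field must be an array (empty arrays are allowed).
function isValidReflection(analysis) {
  if (!analysis || typeof analysis !== 'object') return false;
  return REQUIRED.every((field) => Array.isArray(analysis[field]));
}

// Lenient: at least two counted fields must be non-empty arrays.
function hasMinimalReflectionData(analysis) {
  if (!analysis || typeof analysis !== 'object') return false;
  const populated = COUNTED.filter(
    (field) => Array.isArray(analysis[field]) && analysis[field].length > 0
  );
  return populated.length >= 2;
}

// Hypothetical history filter: drop stored entries that fail the lenient check.
function filterHistory(entries) {
  return entries.filter((entry) => hasMinimalReflectionData(entry && entry.analysis));
}
```

Note that under these assumptions the two checks are not nested: an object with three empty arrays passes the strict check but fails the lenient one.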

4. Strengthened Prompt

Changed OUTPUT JSON ONLY: to explicit requirements:

IMPORTANT: OUTPUT VALID JSON ONLY - NO MARKDOWN, NO EXPLANATIONS, NO CODE BLOCKS.
Your entire response must be a single valid JSON object with this exact structure:

Testing

Added 21 new tests in selfReflection.extraction.test.js
Tests cover:
- Valid JSON extraction
- JSON embedded in text
- JSON validation failures
- Markdown header/bullet extraction
- Alternative label formats
- Quoted example extraction
- Improvements/regressions extraction
- Edge cases (null, undefined, empty)
- Integration scenarios (mixed JSON/markdown)
All 343 tests pass

Impact

Prevents context loss when LLM returns non-JSON responses
Ensures longitudinal self-reflection works correctly
Aligns with the timeline lore system's robust extraction approach

Summary by CodeRabbit

New Features
- Add markdown-based extraction so reflections can be parsed from free-form markdown when JSON is missing or incomplete.
- Switch to a JSON-first extraction flow with graceful markdown fallback when needed.
Bug Fixes
- Tighten filtering to exclude reflections lacking minimal valid analysis.
- Improve handling and logging for malformed or incomplete reflection data.
Tests
- Add extensive tests covering JSON and markdown extraction, validation, and fallback scenarios.


Fixes #28 - Self-Reflection Context Accumulation Issues

Changes:
- Enhanced _extractJson() with markdown fallback when JSON parsing fails
- Added _extractFieldsFromMarkdown() to parse markdown-formatted LLM responses
- Added _isValidReflection() for strict JSON structure validation
- Added _hasMinimalReflectionData() for lenient markdown validation
- Updated getReflectionHistory() to filter out malformed reflections
- Strengthened prompt output requirements to emphasize JSON-only output

Testing:
- Added 21 new tests for extraction methods
- Tests cover: valid JSON, embedded JSON, markdown headers/bullets, quoted examples, edge cases, and mixed format scenarios
- All 343 tests pass

coderabbitai · 2025-12-09T00:48:34Z

Walkthrough

Switch reflection extraction to JSON-first parsing with a markdown fallback, add markdown parsing to build structured reflections, and introduce validators to filter reflections missing required or minimal data. Tests for JSON and markdown extraction and validators were added.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Extraction & Validation (core): `plugin-nostr/lib/selfReflection.js` | Add `_extractFieldsFromMarkdown(text)`, `_isValidReflection(analysis)`, and `_hasMinimalReflectionData(analysis)`; change extraction flow to try JSON parse first, then fall back to markdown parsing when JSON is missing/invalid; skip stored reflections that fail minimal-data checks. |
| Tests: `plugin-nostr/test/selfReflection.extraction.test.js` | Add comprehensive unit tests covering JSON extraction (valid/invalid), markdown fallback (headers, bullets, semicolons, quoted examples), validator edge cases, and integration scenarios (JSON preferred, markdown fallback). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Inspect regexes and parsing logic in _extractFieldsFromMarkdown for edge cases (bullet types, quoted blocks, short items).
Verify _isValidReflection and _hasMinimalReflectionData correctly enforce required arrays and minimal-field acceptance.
Confirm integration points where extraction is used (history processing) now skip invalid reflections without breaking flow.
Run the new test suite and check for flaky cases around whitespace/Unicode bullets.

Possibly related issues

#28: Self-Reflection Context Accumulation Issues — This PR implements JSON-first + markdown fallback and stronger validation, addressing the core causes described (weak JSON extraction, no markdown fallback, insufficient filtering).
Test coverage for selfReflection.js (39.89% → 100%) #53 — Matches same module and methods added (_extractFieldsFromMarkdown, validation helpers) and increases test coverage for extraction/fallback behavior.

Poem

🐰 I nibble on lines both strict and free,
I parse the JSON, then the markdown spree,
I fetch the strengths, the fixes, and the why,
Saving wise hops so lessons don't fly—
A rabbit’s reflection, tidy and spry.

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding robust JSON extraction with markdown fallback to the self-reflection system. |
| Linked Issues check | ✅ Passed | All primary objectives from issue #28 are met: JSON extraction enhanced with validation, markdown fallback implemented with field extraction, reflection history filtered with validators, and prompt strengthened. |
| Out of Scope Changes check | ✅ Passed | All changes directly address issue #28 requirements; no unrelated modifications detected in JSON/markdown extraction, validation, filtering, or test coverage. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

plugin-nostr/lib/selfReflection.js (1)
1359-1384: Regex pattern may cause ReDoS with crafted input.

The extractListItems function builds a regex from a pattern and applies it to potentially large, untrusted LLM output. The nested quantifiers ([^\\n]+(?:\\n[-*•]\\s*[^\\n]+)*) combined with the gi flags could cause catastrophic backtracking on adversarial input.

Consider adding input length limits or using a simpler, iterative parsing approach:
   const extractListItems = (pattern) => {
     const matches = [];
+    // Limit input to prevent ReDoS on large responses
+    const limitedText = text.slice(0, 10000);
     const regex = new RegExp(pattern + '[:\\s]*([^\\n]+(?:\\n[-*•]\\s*[^\\n]+)*)', 'gi');
-    const match = text.match(regex);
+    const match = limitedText.match(regex);
plugin-nostr/test/selfReflection.extraction.test.js (1)
207-251: Consider adding test for very short items being filtered.

The implementation keeps only items longer than 3 characters (length > 3) and drops the rest. A test verifying this behavior would document the intentional filtering of short/noise items.
+    it('filters out very short items (length <= 3)', () => {
+      const text = `Strengths:
+- OK
+- A valid strength item
+- Yes`;
+
+      const result = engine._extractFieldsFromMarkdown(text);
+      expect(result.strengths).toEqual(['A valid strength item']);
+      expect(result.strengths).not.toContain('OK');
+      expect(result.strengths).not.toContain('Yes');
+    });

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b85ac19 and 8b5fe6b.

📒 Files selected for processing (2)

plugin-nostr/lib/selfReflection.js (3 hunks)
plugin-nostr/test/selfReflection.extraction.test.js (1 hunks)


🔇 Additional comments (11)

plugin-nostr/lib/selfReflection.js (6)

489-492: LGTM! Good validation filtering.

The reflection history now properly filters out invalid reflections using _hasMinimalReflectionData, which aligns with the PR objective to ensure only valid reflections are included in context.

1295-1296: LGTM! Prompt strengthening addresses root cause.

The explicit "NO MARKDOWN, NO EXPLANATIONS, NO CODE BLOCKS" instruction directly addresses the issue of LLMs returning non-JSON responses.

1312-1340: LGTM! Solid JSON-first extraction with fallback.

The extraction flow correctly:

Attempts JSON parsing first

Validates parsed JSON has required fields

Falls back to markdown extraction when JSON fails or is incomplete

Returns null only when both paths fail

The debug logging at each step aids troubleshooting.

1395-1403: Good handling of curly/straight quotes for example extraction.

The regex correctly handles both straight quotes (") and Unicode curly quotes (\u201c, \u201d), which is important since LLMs may output either format.

1408-1418: LGTM! Clear strict validation logic.

_isValidReflection enforces the required contract: strengths, weaknesses, and recommendations must all be arrays. This matches the PR objective for strict JSON validation.

1420-1428: LGTM! Lenient validation for markdown fallback.

_hasMinimalReflectionData provides appropriate flexibility for markdown-extracted data by requiring only 2+ populated fields, acknowledging that markdown responses may not have all fields.

plugin-nostr/test/selfReflection.extraction.test.js (5)

1-11: LGTM! Clean test setup.

Minimal mock runtime is appropriate for unit testing the extraction logic. Using console as the logger is acceptable for tests.

13-60: LGTM! Good coverage of JSON extraction scenarios.

Tests cover the happy path, embedded JSON in text, and validation failure for missing required fields. The assertions verify the correct fields are extracted.

62-162: LGTM! Comprehensive markdown fallback tests.

Good coverage of:

Headers with bullet points

Alternative label formats ("What you're doing well")

Quoted example extraction

Improvements and regressions

Insufficient data returns null

These tests validate the PR's core objective of handling non-JSON LLM responses.

164-205: LGTM! Validation helper tests cover edge cases.

Both _isValidReflection and _hasMinimalReflectionData are tested with valid inputs, missing arrays, null, empty objects, and boundary conditions (exactly 2 fields vs. 1 field).

253-314: LGTM! Integration tests verify prioritization logic.

Tests correctly verify that:

Valid JSON is preferred over markdown

Malformed JSON triggers markdown fallback

JSON with missing required fields triggers markdown fallback

This matches the intended behavior from the PR objectives.


- Add input length limit (10KB) to prevent ReDoS on large responses
- Fix regex pattern so that 'your' is no longer matched by the 'you're/youre' label pattern
- Add tests for short item filtering (only items with length > 3 are kept)
- Add test for 'your' vs 'you're' distinction

All 345 tests pass.

anabelle · 2025-12-09T00:58:00Z

Addressed CodeRabbit Feedback

Fixed all issues from the review:

1. ⚠️ Critical: Regex Pattern Bug (Fixed)

Changed from you'?re? to explicit (?:you're|youre) to prevent 'your' from matching. Added test to verify.
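
A quick way to see why the original fragment was too permissive is to run both fragments against the same inputs. This is only a sketch of the two fragments quoted above in isolation; the full label pattern they sit inside is not shown in this thread, and the sample strings are made up.

```javascript
// In the loose form both the apostrophe and the trailing "e" are optional,
// so "you" + "r" is enough to match, which is exactly the word "your".
const loose = /you'?re?/i;
// The explicit alternation accepts only the two intended spellings.
const strict = /(?:you're|youre)/i;

const samples = ["you're doing well", 'youre doing well', 'your replies'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), 'loose:', loose.test(sample), 'strict:', strict.test(sample));
}
```

The third sample is the false positive: the loose fragment matches it and the strict one does not.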

2. 🧹 Nitpick: ReDoS Prevention (Fixed)

Added input length limit: const limitedText = text.length > 10000 ? text.slice(0, 10000) : text;
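
Pulled out as a helper, the cap could look like this. The helper and constant names are hypothetical; only the 10,000 threshold and the `slice` expression come from the line above.

```javascript
// Threshold taken from the PR; the constant name is made up for this sketch.
const MAX_REFLECTION_CHARS = 10000;

// Return at most `max` UTF-16 code units of the input, or '' for non-strings.
function limitInput(text, max = MAX_REFLECTION_CHARS) {
  if (typeof text !== 'string') return '';
  return text.length > max ? text.slice(0, max) : text;
}
```

`String.prototype.slice` counts UTF-16 code units rather than bytes, so the "10KB" wording in the commit message is an approximation, and a cut at the boundary can split a surrogate pair.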

3. 🧹 Nitpick: Short Item Filtering Test (Added)

Added test verifying items with length ≤ 3 are filtered out.

All 345 tests pass ✅

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

plugin-nostr/lib/selfReflection.js (1)

1342-1409: LGTM! Well-implemented markdown extraction with proper safeguards.

The markdown extraction logic is solid:

ReDoS protection via 10,000-char limit (line 1360)

Comprehensive regex patterns covering various markdown formats and alternative labels

Extraction of both bullet lists and quoted examples

Proper handling of fancy quotes in addition to standard quotes

The regex pattern at line 1390 correctly requires the complete "you're" or "youre" string, preventing false matches with "your". The inline comment correctly documents this fix.

Optional: Consider edge case in comma/semicolon splitting.

Lines 1380-1381 split non-bullet content on , or ;, which could incorrectly split items containing these punctuation marks (e.g., "Be concise, clear, and direct" → ["Be concise", "clear", "and direct"]). This is likely an acceptable tradeoff for the fallback extraction, but you may want to add a minimum length check or only split if multiple substantial phrases are detected.
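
One possible shape for the suggested guard is to split only when every resulting piece is reasonably long. The function name `splitInlineItems` and the 12-character threshold are arbitrary assumptions for this sketch, not values from the PR.

```javascript
// Split on ',' or ';' only if all pieces look like substantial phrases.
function splitInlineItems(content, minLength = 12) {
  const parts = content
    .split(/[,;]/)
    .map((part) => part.trim())
    .filter(Boolean);
  const allSubstantial = parts.length > 1 && parts.every((part) => part.length >= minLength);
  // Otherwise keep the original text as a single item.
  return allSubstantial ? parts : [content.trim()];
}
```

With this rule, "Be concise, clear, and direct" stays as one item because "clear" is too short to stand alone, while two long phrases separated by a semicolon are still split.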

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b5fe6b and 87343c5.

📒 Files selected for processing (2)

plugin-nostr/lib/selfReflection.js (3 hunks)
plugin-nostr/test/selfReflection.extraction.test.js (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

plugin-nostr/test/selfReflection.extraction.test.js

🔇 Additional comments (5)

plugin-nostr/lib/selfReflection.js (5)

489-490: LGTM! Improved reflection history filtering.

The addition of _hasMinimalReflectionData validation ensures only reflections with meaningful content are included in the context, preventing accumulation of malformed entries. This directly addresses the context accumulation issues described in the linked issue.

1295-1296: LGTM! Strengthened prompt instructions.

The explicit requirement for a single valid JSON object with no markdown, explanations, or code blocks directly addresses the root cause identified in issue #28. This defensive prompt engineering should reduce the frequency of markdown responses.

1317-1340: LGTM! Robust extraction flow with proper fallback.

The enhanced extraction logic correctly implements the JSON-first approach with markdown fallback:

Attempts JSON extraction and validation

Falls back to markdown parsing on failure

Uses appropriate validators for each format (_isValidReflection for JSON, _hasMinimalReflectionData for markdown)

Logs each step for debugging

This addresses the core issue of silent failures when LLM returns non-JSON responses.

1411-1421: LGTM! Appropriate strict validation for JSON reflections.

The strict validation requiring strengths, weaknesses, and recommendations arrays ensures JSON-parsed reflections have the complete expected structure. This validator is appropriately strict for JSON input while allowing the more lenient _hasMinimalReflectionData for markdown-extracted content.

1423-1431: LGTM! Appropriately lenient validation for markdown extraction.

The lenient validation requiring at least 2 populated fields strikes a good balance for markdown-extracted content. This allows reflections with partial data to be preserved while filtering out truly empty or malformed entries. The threshold of 2 fields is reasonable and prevents loss of valuable insights when the LLM returns markdown instead of JSON.

anabelle mentioned this pull request Dec 9, 2025

Self-Reflection Context Accumulation Issues #28

Closed

coderabbitai bot reviewed Dec 9, 2025


fix: address CodeRabbit feedback

87343c5


coderabbitai bot reviewed Dec 9, 2025


jorparad merged commit 91681a3 into master Dec 9, 2025
2 checks passed

jorparad merged 2 commits into master from fix/self-reflection-context-accumulation
