Summary
Reformulates context compression as structure-then-select: decomposes text into Elementary Discourse Unit (EDU) relation trees, then selects query-relevant subtrees. Eliminates hallucination by anchoring EDUs strictly to source text indices.
Source: arXiv 2512.14244 — From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition
Published 2025-12-16, revised 2026-01-05.
Key Results
- State-of-the-art structural prediction accuracy on StructBench (248 manually annotated documents)
- Outperforms frontier LLMs on structured compression while reducing cost
- Zero hallucination: EDU nodes anchored to source byte offsets
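The zero-hallucination property above follows from representation, not model behavior: if an EDU node stores only offsets into the source, rendering it is a slice copy and can never emit text absent from the source. A minimal sketch (the `EduNode` struct and field names are illustrative assumptions, not the paper's or Zeph's types):

```rust
// Hypothetical sketch: an EDU node anchored to source byte offsets.
// Rendering is pure extraction, so no generated (hallucinated) text
// can appear in the output.
struct EduNode {
    start: usize,        // byte offset of the EDU's first byte
    end: usize,          // byte offset one past the EDU's last byte
    children: Vec<EduNode>, // nested EDUs forming the relation tree
}

impl EduNode {
    /// Recover the EDU's text verbatim from the source string.
    fn render<'a>(&self, source: &'a str) -> &'a str {
        &source[self.start..self.end]
    }
}
```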
Applicability to Zeph
Current Zeph compaction: `summarize_tool_outputs()` → token-based chunking → LLM summarization.
Problem: chunking splits at arbitrary token boundaries, destroying discourse structure. Summaries lose the causal/temporal relations between events.
Enhancement: apply EDU decomposition as a preprocessing step in `SemanticMemory::compress()` before the compaction LLM call:
- Parse tool output / conversation chunk into EDU tree (LingoEDU)
- Score each EDU subtree by relevance to current task intent
- Select top-K EDU subtrees (respect token budget) → pass to compaction LLM
- Compaction LLM sees structured, non-redundant content → better summaries
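The selection step (steps 2–3 above) can be sketched as a greedy budget fill: rank subtrees by relevance to the task intent, then keep the best ones that fit. Everything here is an assumption for illustration, not an existing Zeph API; in particular `Subtree`, the relevance source, and the 4-bytes-per-token estimate are stand-ins:

```rust
// Hypothetical sketch: score-and-select EDU subtrees under a token budget.
struct Subtree {
    text: String,
    relevance: f32, // e.g. cosine similarity to the current task intent
}

fn estimate_tokens(text: &str) -> usize {
    // Crude stand-in for a real tokenizer: ~4 bytes per token.
    (text.len() + 3) / 4
}

fn select_top_k(mut subtrees: Vec<Subtree>, token_budget: usize) -> Vec<Subtree> {
    // Highest relevance first.
    subtrees.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));
    let mut used = 0;
    let mut kept = Vec::new();
    for st in subtrees {
        let cost = estimate_tokens(&st.text);
        // Greedy fill: skip subtrees that would overflow the budget.
        if used + cost <= token_budget {
            used += cost;
            kept.push(st);
        }
    }
    kept
}
```

The surviving subtrees are concatenated and handed to the compaction LLM, which then summarizes already-relevant, non-redundant content.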
Synergy: Complements #1851 (SWE-Pruner goal-guided pruning) and #1607 (structured anchored summarization). EDU tree provides the structure that #1607 assumes.
Implementation Sketch
- `EduDecomposer` trait in `zeph-memory::compression`
- Initial implementation: regex-based clause splitting (lightweight, no dependency)
- Advanced: integrate the LingoEDU parser (Python subprocess or Rust port)
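A possible shape for the trait and the initial lightweight splitter, sketched below. Only `EduDecomposer` is named in this issue; `ClauseSplitter` and its punctuation heuristic are assumptions. To stay dependency-free the sketch splits on clause-ending punctuation instead of using the `regex` crate, which a real implementation would likely prefer:

```rust
/// Splits raw text into Elementary Discourse Units.
/// (Trait name from this proposal; signature is a sketch.)
trait EduDecomposer {
    fn decompose(&self, text: &str) -> Vec<String>;
}

/// Initial lightweight implementation: split at clause boundaries
/// marked by sentence-ending punctuation or semicolons.
struct ClauseSplitter;

impl EduDecomposer for ClauseSplitter {
    fn decompose(&self, text: &str) -> Vec<String> {
        let mut units = Vec::new();
        let mut current = String::new();
        for ch in text.chars() {
            current.push(ch);
            if matches!(ch, '.' | '!' | '?' | ';') {
                let unit = current.trim();
                if !unit.is_empty() {
                    units.push(unit.to_string());
                }
                current.clear();
            }
        }
        // Keep any trailing clause without closing punctuation.
        let tail = current.trim();
        if !tail.is_empty() {
            units.push(tail.to_string());
        }
        units
    }
}
```

Swapping in the LingoEDU parser later only requires another `EduDecomposer` impl; the selection and compaction stages stay unchanged.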
- Config: `[memory.compression] edu_decomposition = false` (opt-in, experimental)