NoteForge is a proof-first clinical note generation demo for structured, device-style measurements. Every supported sentence in the generated SOAP note is tied to an evidence packet with raw-row provenance, and any claim the evidence cannot support is rendered as an explicit refusal rather than as narrative text.
Clinical documentation workflows are a poor fit for free-form summarization when the source data is sparse or noisy. NoteForge focuses on a narrower but more defensible problem: convert structured measurements into conservative note text only when the underlying evidence is strong enough to justify the claim.
- Deterministic note generation from structured CSV measurements
- Evidence packets with stable packet IDs and raw-row provenance
- Refusal behavior for missing or insufficient evidence
- Built-in evaluator that rejects unsupported or mislinked note text
- Streamlit demo with timeline view, evidence drill-down, and note inspection
- Default sample dataset for quick local startup
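As a rough illustration of the evidence-packet idea in the list above, the sketch below groups CSV rows by signal and records which file lines each packet was built from. It is not NoteForge's actual code: the `signal` and `value` column names, the `EvidencePacket` fields, the `build_packets` helper, and the demo rows are all assumptions made for this example. Only the `PACKET_<SIGNAL>_001` naming follows the IDs shown in the sample output.

```python
import csv
import io
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class EvidencePacket:
    """Hypothetical packet shape: one packet per signal, with provenance."""
    packet_id: str
    signal: str
    row_numbers: tuple[int, ...]  # 1-based CSV line numbers of the source rows
    values: tuple[float, ...]

    @property
    def n_samples(self) -> int:
        return len(self.values)

    @property
    def mean_value(self) -> Optional[float]:
        return mean(self.values) if self.values else None


def build_packets(csv_text, signals=("TEMP", "HR", "SPO2")):
    """Group readings by signal; rows with an empty value are skipped."""
    readings = {signal: [] for signal in signals}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        signal = row["signal"].strip().upper()
        value = row["value"].strip()
        if signal in readings and value:
            # reader.line_num is the file line of the row just read
            readings[signal].append((reader.line_num, float(value)))

    packets = {}
    for signal in signals:
        rows = readings[signal]
        packet = EvidencePacket(
            # The ID depends only on the signal name, so reruns give the same ID
            packet_id=f"PACKET_{signal}_001",
            signal=signal,
            row_numbers=tuple(number for number, _ in rows),
            values=tuple(value for _, value in rows),
        )
        packets[packet.packet_id] = packet
    return packets


# Made-up demo rows (assumed column layout, not the bundled sample CSV)
DEMO_CSV = "\n".join([
    "timestamp,signal,value",
    "2024-01-01T08:00,HR,80",
    "2024-01-01T08:05,HR,82",
    "2024-01-01T08:10,TEMP,38.4",
    "2024-01-01T08:15,SPO2,",
])

packets = build_packets(DEMO_CSV)
hr = packets["PACKET_HR_001"]
print(hr.row_numbers, hr.mean_value, packets["PACKET_SPO2_001"].n_samples)
```

A packet is still created for a signal with no usable rows, so a later refusal has something concrete to point at.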
flowchart LR
A["CSV measurements"] --> B["Ingest + validation"]
B --> C["Timeline table + chart"]
B --> D["Evidence packet builder"]
D --> E["Rule-based claim engine"]
E --> F["SOAP note renderer"]
E --> G["Refusal engine"]
F --> H["Evaluator"]
G --> H
H --> I["Grounded note + metrics"]
- Python 3.13
- Streamlit
- Pandas
- Altair
- Pydantic
- Pytest
The repo includes six tests covering evidence construction, rule behavior, and evaluator failure modes. The evaluator explicitly checks that:
- every supported sentence links to known evidence packet IDs
- evidence types match the claim they are supporting
- unsupported note text raises an error instead of silently passing
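The three checks above could look roughly like the following. This is a minimal sketch under stated assumptions, not the repo's evaluator: `NoteSentence`, `UngroundedNoteError`, `evaluate_note`, the `fever_present` claim name, and the type labels `temperature` and `heart_rate` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


class UngroundedNoteError(ValueError):
    """Raised when a note sentence is not backed by matching evidence."""


@dataclass(frozen=True)
class NoteSentence:
    text: str
    claim: str
    packet_ids: tuple[str, ...]


def evaluate_note(sentences, packet_types, required_type):
    """Return the number of checked sentences, or raise on the first failure.

    packet_types maps packet ID -> evidence type.
    required_type maps claim name -> the evidence type that claim needs.
    """
    for sentence in sentences:
        expected = required_type.get(sentence.claim)
        if expected is None:
            raise UngroundedNoteError(f"unknown claim: {sentence.claim}")
        # Check 3: text with no evidence link must fail loudly
        if not sentence.packet_ids:
            raise UngroundedNoteError(f"no evidence linked: {sentence.text!r}")
        for packet_id in sentence.packet_ids:
            # Check 1: every linked ID must refer to a known packet
            if packet_id not in packet_types:
                raise UngroundedNoteError(f"unknown packet: {packet_id}")
            # Check 2: the packet's evidence type must fit the claim
            if packet_types[packet_id] != expected:
                raise UngroundedNoteError(
                    f"{packet_id} is {packet_types[packet_id]}, "
                    f"but {sentence.claim} needs {expected}"
                )
    return len(sentences)


# Hypothetical lookup tables for the example
PACKET_TYPES = {"PACKET_TEMP_001": "temperature", "PACKET_HR_001": "heart_rate"}
REQUIRED_TYPE = {"fever_present": "temperature", "elevated_hr": "heart_rate"}

grounded = NoteSentence(
    text="Fever present within last 4 hours.",
    claim="fever_present",
    packet_ids=("PACKET_TEMP_001",),
)
checked = evaluate_note([grounded], PACKET_TYPES, REQUIRED_TYPE)
```

Raising an exception instead of returning a flag means a caller cannot ignore a failed check by accident.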
Example output from the bundled sample CSV:
Objective:
Fever present within last 4 hours. [evidence: PACKET_TEMP_001]
Assessment:
Refusal: elevated_hr not supported. Reason: mean HR 81.0 bpm < 100 bpm. [evidence: PACKET_HR_001]
Refusal: hypoxemia_possible not supported. Reason: mean SpO2 missing; n_samples 0 < 3. [evidence: PACKET_SPO2_001]
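One way refusal lines like the two above could be produced is sketched below. The 100 bpm threshold and the minimum of 3 samples are read off the sample output. Everything else is assumed for illustration: the `Evidence` shape, the `decide` helper, the `above` flag, the SpO2 threshold of 92, and the wording of the supported sentence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """Hypothetical summary of one evidence packet."""
    packet_id: str
    label: str            # e.g. "HR"
    unit: str             # e.g. "bpm"
    mean: Optional[float]
    n_samples: int


def decide(claim, supported_text, ev, threshold, *, above=True, min_samples=3):
    """Return a grounded sentence, or a refusal that states why.

    above=True supports the claim when mean >= threshold;
    above=False supports it when mean <= threshold.
    """
    tag = f"[evidence: {ev.packet_id}]"
    if ev.mean is None or ev.n_samples < min_samples:
        # Too little data: refuse without comparing against the threshold
        mean_text = "missing" if ev.mean is None else f"{ev.mean:.1f} {ev.unit}"
        reason = (f"mean {ev.label} {mean_text}; "
                  f"n_samples {ev.n_samples} < {min_samples}")
    else:
        met = ev.mean >= threshold if above else ev.mean <= threshold
        if met:
            return f"{supported_text} {tag}"
        sign = "<" if above else ">"
        reason = (f"mean {ev.label} {ev.mean:.1f} {ev.unit} "
                  f"{sign} {threshold:g} {ev.unit}")
    return f"Refusal: {claim} not supported. Reason: {reason}. {tag}"


hr_line = decide(
    "elevated_hr",
    "Elevated heart rate observed.",  # hypothetical wording
    Evidence("PACKET_HR_001", "HR", "bpm", mean=81.0, n_samples=3),
    threshold=100,
)
spo2_line = decide(
    "hypoxemia_possible",
    "Possible hypoxemia.",  # hypothetical wording
    Evidence("PACKET_SPO2_001", "SpO2", "%", mean=None, n_samples=0),
    threshold=92,  # placeholder value, not taken from the source
    above=False,
)
print(hr_line)
print(spo2_line)
```

The refusal keeps its evidence tag, so a reader can trace a "not supported" verdict back to the same packet a supported claim would have cited.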
- The demo only handles structured measurement inputs, not raw clinical notes or EHR exports.
- The note generator is intentionally conservative and covers a small rule set.
- This is not a clinical product and should not be used for care delivery.
py -3.13 -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -r requirements.txt
streamlit run app.py

The app auto-loads data/sample.csv when no upload is provided.
Or use the bundled scripts:
powershell -ExecutionPolicy Bypass -File scripts/bootstrap.ps1
powershell -ExecutionPolicy Bypass -File scripts/run_demo.ps1

Run the tests:
pytest

Project layout:
noteforge/
  app.py
  app/
  data/
  note_forge/
  scripts/
  tests/
- Expand beyond a single vital-sign rule set into richer multi-signal reasoning
- Add schema adapters for device exports beyond the demo CSV format
- Introduce benchmark scenarios for missingness, conflicting evidence, and temporal edge cases
