feat(benchmark): add Docling validation harness by kpeez · Pull Request #32 · kpeez/paperchat

kpeez · 2026-03-09T00:13:04Z

This pull request introduces a comprehensive Docling validation benchmark harness under backend/paperchat/benchmarks/docling_validation/. It provides a CLI for running document conversion and chunking benchmarks, data models for fixtures and results, manifest loading utilities, a retrieval scoring module, and an adapter for interfacing with the Docling library. It also adds a small exclusion to the pre-commit config and a module docstring.

Docling validation harness and CLI:

Added a new CLI (cli.py) to run the Docling validation benchmark, supporting arguments for fixtures, queries, output directory, and top-k metrics.
Introduced __init__.py and a module-level docstring for the benchmark tooling.
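For orientation, a CLI with the four inputs listed above could be wired up roughly as follows. This is a minimal sketch, not the contents of cli.py: the flag names (`--fixtures`, `--queries`, `--output-dir`, `--top-k`), the program name, and the defaults are all assumptions chosen for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical benchmark CLI (flag names are assumed)."""
    parser = argparse.ArgumentParser(
        prog="docling-validation",  # hypothetical program name
        description="Run a document conversion and chunking benchmark.",
    )
    # Manifest describing the fixture documents to convert.
    parser.add_argument("--fixtures", type=Path, required=True)
    # Manifest describing the queries used for retrieval scoring.
    parser.add_argument("--queries", type=Path, required=True)
    # Directory where results would be written (default is an assumption).
    parser.add_argument("--output-dir", type=Path, default=Path("benchmark_output"))
    # One or more cutoffs for top-k metrics (defaults are an assumption).
    parser.add_argument("--top-k", type=int, nargs="+", default=[1, 3, 5])
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse an explicit argument list so the function is easy to test."""
    return build_parser().parse_args(argv)
```

Passing the argument list explicitly, rather than reading `sys.argv` inside the function, keeps the parser testable without spawning a subprocess.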

Core logic and integration with Docling:

Implemented docling_adapter.py, which loads Docling components, runs document conversion and chunking, normalizes chunk data, and handles warnings/errors robustly.
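One common way to load an optional dependency without crashing the whole run is a guarded import that turns a failure into a recorded warning. The sketch below illustrates that pattern only; `load_optional_module` is a hypothetical helper, and whether docling_adapter.py takes this exact approach is an assumption.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional_module(name: str) -> tuple[Optional[ModuleType], Optional[str]]:
    """Import a module by name, returning (module, None) or (None, warning).

    Hypothetical helper: the caller decides whether a missing dependency
    is fatal or merely recorded alongside the benchmark results.
    """
    try:
        return importlib.import_module(name), None
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package and a broken import are both reported the same way.
        return None, f"could not import {name}: {exc}"
```

A caller could then request `"docling.document_converter"` and, if the warning is not `None`, skip conversion for that run instead of raising.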

Data models and manifest handling:

Defined structured dataclasses in models.py for fixture documents, queries, chunking results, summaries, and recommendations, so the harness passes typed, explicitly named records between stages instead of loose dictionaries.
Added manifests.py to load and validate fixture and query manifests from JSON files, with error handling for missing or malformed data.
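The combination of a dataclass model and a validating loader might look like the sketch below. The field names (`doc_id`, `path`), the `ManifestError` exception, and the assumption that a manifest is a JSON array of objects are all illustrative; the actual schema in models.py and manifests.py may differ.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FixtureDocument:
    """Hypothetical fixture record; the real model may carry more fields."""
    doc_id: str
    path: str


class ManifestError(ValueError):
    """Raised when a manifest is missing or malformed (hypothetical name)."""


def parse_fixtures(raw: object) -> list[FixtureDocument]:
    """Validate already-decoded JSON and convert it to dataclass instances."""
    if not isinstance(raw, list):
        raise ManifestError("fixture manifest must be a JSON array")
    fixtures = []
    for index, entry in enumerate(raw):
        if not isinstance(entry, dict):
            raise ManifestError(f"entry {index} must be a JSON object")
        # Collect every required key that is absent or not a string.
        bad = [key for key in ("doc_id", "path") if not isinstance(entry.get(key), str)]
        if bad:
            raise ManifestError(f"entry {index} has missing or invalid keys: {bad}")
        fixtures.append(FixtureDocument(doc_id=entry["doc_id"], path=entry["path"]))
    return fixtures


def load_fixtures(manifest_path: Path) -> list[FixtureDocument]:
    """Read a manifest file, mapping I/O and decode failures to ManifestError."""
    try:
        raw = json.loads(manifest_path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise ManifestError(f"manifest not found: {manifest_path}") from exc
    except json.JSONDecodeError as exc:
        raise ManifestError(f"manifest is not valid JSON: {manifest_path}") from exc
    return parse_fixtures(raw)
```

Splitting parsing from file access means the validation rules can be exercised with in-memory data, while the loader only handles the missing-file and malformed-JSON cases.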

Retrieval and scoring utilities:

Created retrieval.py to tokenize text, compute term frequencies, cosine similarity, and rank chunks for retrieval evaluation.
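The four steps named here (tokenize, count term frequencies, compute cosine similarity, rank) can be sketched with the standard library alone. The function names, the lowercase alphanumeric tokenizer, and the tie-breaking rule below are assumptions for illustration and are not taken from retrieval.py.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: lowercase runs of ASCII letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list[str]:
    return TOKEN_RE.findall(text.lower())


def term_frequencies(text: str) -> Counter:
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    # Counter returns 0 for absent terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_chunks(query: str, chunks: list[str], top_k: int) -> list[tuple[int, float]]:
    """Return up to top_k (chunk_index, score) pairs, best score first."""
    query_tf = term_frequencies(query)
    scored = [
        (index, cosine_similarity(query_tf, term_frequencies(chunk)))
        for index, chunk in enumerate(chunks)
    ]
    # Sort by descending score; break ties by original chunk order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

For example, ranking the chunks `["the cat sat", "dogs bark loudly", "cat cat cat"]` against the query `"cat"` places the third chunk first, because cosine similarity normalizes by vector length and that chunk points in exactly the query's direction.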

Tooling and configuration:

Updated .pre-commit-config.yaml to exclude large PDF files in test fixtures from pre-commit checks.

kpeez added 3 commits March 8, 2026 17:08

feat(backend): add Docling benchmark entrypoint and runtime deps

96b3ae2

feat(benchmark): add Docling validation harness

16071ad

test(benchmark): add fixed paper fixtures with canonical queries

a87525d

kpeez merged commit e272296 into main Mar 9, 2026
0 of 2 checks passed

kpeez deleted the p3 branch March 9, 2026 00:14

kpeez merged 3 commits into main from p3

kpeez commented Mar 9, 2026
