feat(benchmark): add Docling validation harness#32

Merged
kpeez merged 3 commits into main from p3
Mar 9, 2026

kpeez commented Mar 9, 2026

This pull request adds a Docling validation benchmark harness under backend/paperchat/benchmarks/docling_validation/. It provides a CLI for running document conversion and chunking benchmarks, data models for fixtures and results, manifest loading utilities, a retrieval scoring module, and an adapter for the Docling library. It also excludes the large fixture PDFs from pre-commit checks and adds a module docstring.

Docling validation harness and CLI:

  • Added a new CLI (cli.py) to run the Docling validation benchmark, supporting arguments for fixtures, queries, output directory, and top-k metrics.
  • Introduced __init__.py and a module-level docstring for the benchmark tooling.
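
As a rough illustration of the CLI surface described above, here is a minimal argparse sketch. The flag names (`--fixtures`, `--queries`, `--output-dir`, `--top-k`), the program name, and the defaults are assumptions made for this example; they are not taken from cli.py.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a benchmark run.

    All flag names and defaults below are hypothetical; the real cli.py may differ.
    """
    parser = argparse.ArgumentParser(
        prog="docling-validation",
        description="Run the Docling validation benchmark (illustrative sketch).",
    )
    parser.add_argument(
        "--fixtures", type=Path, required=True,
        help="Path to the fixture manifest (JSON).",
    )
    parser.add_argument(
        "--queries", type=Path, required=True,
        help="Path to the query manifest (JSON).",
    )
    parser.add_argument(
        "--output-dir", type=Path, default=Path("benchmark_output"),
        help="Directory where results are written.",
    )
    parser.add_argument(
        "--top-k", type=int, default=5,
        help="Number of ranked chunks considered when computing retrieval metrics.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Parsing into `Path` and `int` at the argparse layer keeps type conversion errors at the command line rather than deep inside the benchmark run.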

Core logic and integration with Docling:

  • Implemented docling_adapter.py, which loads Docling components, runs document conversion and chunking, normalizes chunk data, and handles warnings/errors robustly.
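
One way an adapter like this could load components lazily, normalize chunks, and record warnings and errors without crashing is sketched below using only the standard library. `ConversionOutcome`, `load_component`, `normalize_chunk`, and `run_safely` are hypothetical names invented for this sketch, and the normalized chunk fields are assumptions; none of this is claimed to match docling_adapter.py.

```python
import importlib
import warnings
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ConversionOutcome:
    """Hypothetical result record: normalized chunks plus anything that went wrong."""
    chunks: list = field(default_factory=list)
    warnings: list = field(default_factory=list)
    error: Optional[str] = None


def load_component(module_name: str, attr: str) -> Any:
    """Import a dependency lazily so a missing install produces a clear error."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(f"{module_name} is not installed") from exc
    return getattr(module, attr)


def normalize_chunk(index: int, text: str) -> dict:
    """Collapse whitespace and attach simple metadata (fields are assumptions)."""
    cleaned = " ".join(text.split())
    return {"index": index, "text": cleaned, "char_count": len(cleaned)}


def run_safely(produce_chunks: Callable[[], list]) -> ConversionOutcome:
    """Run a chunk-producing callable, capturing warnings and exceptions."""
    outcome = ConversionOutcome()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, not just the first
        try:
            raw_texts = produce_chunks()
            outcome.chunks = [normalize_chunk(i, t) for i, t in enumerate(raw_texts)]
        except Exception as exc:  # record the failure so the benchmark can continue
            outcome.error = f"{type(exc).__name__}: {exc}"
    outcome.warnings = [str(w.message) for w in caught]
    return outcome
```

In a real run, `produce_chunks` would wrap the Docling conversion and chunking calls, for example after `load_component("docling.document_converter", "DocumentConverter")`. Returning an outcome object instead of raising lets one bad fixture be reported alongside the successful ones.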

Data models and manifest handling:

  • Defined dataclasses in models.py for fixture documents, queries, chunking results, summaries, and recommendations, so each record has explicit, typed fields.
  • Added manifests.py to load and validate fixture and query manifests from JSON files, with error handling for missing or malformed data.
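
The manifest loading described above might look roughly like the following sketch. The manifest layout (a top-level `documents` list whose entries have `id` and `path` keys), the `FixtureDocument` fields, and the `ManifestError` type are all assumptions for illustration; the actual schema in models.py and manifests.py is not shown in this PR description.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FixtureDocument:
    """Hypothetical fixture record; field names are illustrative."""
    doc_id: str
    path: Path


class ManifestError(ValueError):
    """Raised when a manifest is missing or malformed (hypothetical type)."""


def load_fixture_manifest(manifest_path: Path) -> list:
    """Load and validate a fixture manifest, assuming a {"documents": [...]} layout."""
    if not manifest_path.is_file():
        raise ManifestError(f"Manifest not found: {manifest_path}")
    try:
        payload = json.loads(manifest_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise ManifestError(f"Invalid JSON in {manifest_path}: {exc}") from exc
    if not isinstance(payload, dict) or not isinstance(payload.get("documents"), list):
        raise ManifestError("Manifest must be an object with a 'documents' list")

    documents = []
    for position, entry in enumerate(payload["documents"]):
        if not isinstance(entry, dict):
            raise ManifestError(f"Entry {position} is not an object")
        missing = [key for key in ("id", "path") if key not in entry]
        if missing:
            raise ManifestError(f"Entry {position} is missing keys: {missing}")
        # Resolve fixture paths relative to the manifest's own directory.
        documents.append(
            FixtureDocument(doc_id=str(entry["id"]), path=manifest_path.parent / entry["path"])
        )
    return documents
```

Raising a single error type for missing files, invalid JSON, and missing keys gives the CLI one exception to catch and report.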

Retrieval and scoring utilities:

  • Created retrieval.py to tokenize text, compute term frequencies, cosine similarity, and rank chunks for retrieval evaluation.
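
The scoring steps named above (tokenize, count term frequencies, compare by cosine similarity, rank) can be sketched as follows. The function names, the lowercase alphanumeric tokenizer, and the tie-breaking rule are assumptions for this example rather than the contents of retrieval.py.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: lowercase runs of ASCII letters and digits.
_TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list:
    """Split text into lowercase alphanumeric tokens."""
    return _TOKEN_RE.findall(text.lower())


def term_frequencies(text: str) -> Counter:
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    dot = sum(count * b[term] for term, count in a.items())  # Counter yields 0 for absent terms
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b)


def rank_chunks(query: str, chunks: list, top_k: int = 5) -> list:
    """Return up to top_k (chunk_index, score) pairs, best score first.

    Ties are broken by the lower chunk index so the ranking is deterministic.
    """
    query_tf = term_frequencies(query)
    scored = [
        (index, cosine_similarity(query_tf, term_frequencies(chunk)))
        for index, chunk in enumerate(chunks)
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

For example, `rank_chunks("cat", ["The cat sat on the mat", "Dogs bark loudly", "cat cat cat"], top_k=2)` ranks chunk 2 first and chunk 0 second, because raw term-frequency cosine rewards chunks dominated by the query term.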

Tooling and configuration:

  • Updated .pre-commit-config.yaml to exclude large PDF files in test fixtures from pre-commit checks.

kpeez merged commit e272296 into main Mar 9, 2026
0 of 2 checks passed
kpeez deleted the p3 branch March 9, 2026 00:14