
Feature: Post-RAG FactChecker Pipeline Component & Header-Aware Document Splitter #10973

@DYNOSuprovo

Description


Is your feature request related to a problem? Please describe.
Yes. When building production RAG pipelines in Haystack, the pipeline typically ends with a Generator component (such as OpenAIGenerator). While Haystack does a great job retrieving context, there is no native, deterministic post-processing component to audit the generated answer for hallucinations before it is returned to the user.

Currently, if I want to guarantee that every sentence in the generated output is grounded in the retrieved Documents, I have to build a custom external loop to re-evaluate the output against the source context, which breaks the clean, linear flow of a Haystack Pipeline. Additionally, the standard DocumentSplitter often severs context by splitting paragraphs away from their Markdown headers based strictly on character/word counts.

Describe the solution you'd like
I would love to see two new components added to the Haystack ecosystem:

  1. FactChecker (or VerificationEvaluator) Component:
    A new Pipeline component designed to sit immediately after a Generator. It takes two inputs: the generated replies (List[str], as produced by Haystack Generators) and the documents (List[Document]). It uses a secondary LLM strictly as a judge to cross-reference the reply against the documents, scoring it or stripping out unsupported claims. It would output a VerifiedReply and a ConfidenceScore.

  2. HeaderAwareDocumentSplitter Component:
    An enhancement or alternative to DocumentSplitter that parses Markdown/HTML AST. It refuses to split a chunk if it separates a heading (#, ##) from its immediate paragraph, ensuring that retrieved chunks retain their structural context.
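The header-aware splitting rule can be sketched without any dependencies. This is a minimal illustration, not a proposed implementation: a real Haystack component would wrap this logic in a `@component` class and operate on `Document` objects, and the function name `split_markdown` and the `max_words` budget are assumptions for the example.

```python
import re

def split_markdown(text: str, max_words: int = 120) -> list[str]:
    """Split Markdown into chunks without detaching a heading (#, ##, ...)
    from the paragraph that immediately follows it."""
    # Break the text into blocks separated by blank lines.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for block in blocks:
        words = len(block.split())
        # Start a new chunk when the budget is exceeded, but never right
        # after a heading: the heading must stay with its paragraph.
        if current and count + words > max_words and not current[-1].startswith("#"):
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(block)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A full version would walk a proper Markdown/HTML AST instead of blank-line blocks, but the invariant is the same: a chunk boundary is never placed between a heading and its first paragraph.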

Describe alternatives you've considered
I have considered using Haystack's AnswerBuilder with reference_pattern, but that simply matches regex citations; it doesn't deterministically verify whether the LLM hallucinated the claim in the first place.

I have also considered using offline evaluation frameworks (like DeepEval or Ragas), but those are meant for testing datasets during development. I need a runtime component that actively blocks or warns users about unverified answers within the live production Pipeline. Currently, I am forced to write a Custom Component to handle this.
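For reference, the custom-component workaround can be sketched in plain Python. This is only an illustration of the intended input/output contract: token overlap stands in for the secondary judge LLM, and `fact_check` and `threshold` are hypothetical names, not part of any existing Haystack API.

```python
import re

def fact_check(reply: str, documents: list[str], threshold: float = 0.6):
    """Deterministic grounding check: keep only reply sentences whose
    content words mostly appear in the retrieved documents.
    (In the proposed component, this judgment would be delegated to a
    secondary LLM; token overlap is a crude stand-in here.)"""
    source_vocab = set(re.findall(r"\w+", " ".join(documents).lower()))
    sentences = re.split(r"(?<=[.!?])\s+", reply.strip())
    supported = []
    for sent in sentences:
        tokens = set(re.findall(r"\w+", sent.lower()))
        if not tokens:
            continue
        overlap = len(tokens & source_vocab) / len(tokens)
        if overlap >= threshold:
            supported.append(sent)
    # VerifiedReply plus a ConfidenceScore: fraction of sentences kept.
    confidence = len(supported) / len(sentences) if sentences else 0.0
    return " ".join(supported), confidence
```

Wrapped as a Haystack component, `run(replies, documents)` would return the verified reply and score downstream, letting the Pipeline block or flag unverified answers at runtime.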

Additional context
I recently built a custom extraction pipeline in raw Python solving this exact problem, utilizing a multi-model approach (a fast 8B model for routing, and a 70B model for the final verification stage).

I am highly motivated to bring this pattern to Haystack. I am willing to write the PR, unit tests, and documentation for either the FactChecker component or the HeaderAwareDocumentSplitter if the core maintainers believe this aligns with Haystack's vision for production-ready RAG.

Metadata

Assignees: No one assigned
Labels: P2 (Medium priority, add to the next sprint if no P1 available)
Projects: No projects
Milestone: No milestone
Development: No branches or pull requests