Design partner: hardening Haystack RAG against context (prompt) injection #10503

aeris-systems · 2026-02-04T05:21:05Z

aeris-systems
Feb 4, 2026

Hi Haystack team,

Haystack’s traction in enterprise settings (and the customer logos you’ve shared publicly, e.g. Apple/Airbus) makes the security/compliance story especially important.

I’m Alex (aeris-systems), ex‑VP Eng at Cloudflare (18y security). I built Aeris PromptShield (OSS) — prompt injection detection focused on the attacks we’re now seeing in production RAG/agent systems: indirect prompt injection / context injection hidden inside documents, web pages, PDFs, or seemingly “safe” retrieved chunks.

Why this matters for Haystack users: even when access control, PII redaction, and logging are strong, context injection can still change the agent’s behavior (override system intent, coerce tool usage, or cause data exfiltration) while looking like normal retrieved content.

Design-partner proposal (not a sales pitch): collaborate on a first-class integration pattern for Haystack pipelines, e.g.

scan/sanitize retrieved Document content before it’s assembled into prompts,
guardrail for tool/function calls downstream of RAG,
structured “why flagged” metadata suitable for audits.

If you’d be open to it, I’d love to contribute an example pipeline + docs that show how to harden a typical Haystack RAG app against context injection.

— Alex
https://github.com/aeris-systems/aeris-promptshield

kacperlukawski · 2026-02-06T10:14:10Z

kacperlukawski
Feb 6, 2026
Maintainer

Thanks for reaching out @aeris-systems We can see how this would be relevant for production RAG systems.

If you want to move forward, a draft PR with a proposed component interface and basic example would be useful, so we can iterate on the possible integration. That would let us provide feedback and understand whether there would be any changes in the core required. Feel free to open a PR if you'd like to explore this further.

0 replies

aniruddhaadak80 · 2026-03-09T22:53:39Z

aniruddhaadak80
Mar 9, 2026

From my point of view, the most important insertion point is after retrieval and before prompt assembly, because that is exactly where malicious document content stops being data and starts influencing behavior. If the integration happens only at input or output time, a lot of context-injection risk still slips through the middle.

The other practical question is latency and explainability. Enterprise teams will want to know not only that something was flagged, but why and at what cost.

0 replies

Nyrok · 2026-03-11T07:39:04Z

Nyrok
Mar 11, 2026

The injection surface you're describing — retrieved content overriding system intent — is partly a structural problem at the prompt layer. When the instruction set is prose mixed into a single system prompt block, the model has a fuzzier parse of where "what I was told to do" ends and "content I'm processing" begins.

Typed semantic blocks help with this. If the role, constraints, and objective are explicit, labeled XML fields rather than inline prose, the model has a clearer instruction boundary. That doesn't replace a shield layer, but it narrows the surface that injection has to target.

I've been building flompt for exactly this, a visual prompt builder that decomposes prompts into 12 semantic blocks and compiles to Claude-optimized XML. Open-source: github.com/Nyrok/flompt

The structural separation (instructions vs. content) is something worth designing into the pipeline architecture, not just the detection layer.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Design partner: hardening Haystack RAG against context (prompt) injection #10503

Uh oh!

{{title}}

Uh oh!

Replies: 8 comments

Uh oh!

{{title}}

Uh oh!

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Design partner: hardening Haystack RAG against context (prompt) injection #10503

Uh oh!

aeris-systems Feb 4, 2026

Replies: 8 comments

Uh oh!

kacperlukawski Feb 6, 2026 Maintainer

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

Uh oh!

aniruddhaadak80 Mar 9, 2026

Uh oh!

Nyrok Mar 11, 2026

aeris-systems
Feb 4, 2026

kacperlukawski
Feb 6, 2026
Maintainer

aniruddhaadak80
Mar 9, 2026

Nyrok
Mar 11, 2026