Design partner: hardening Haystack RAG against context (prompt) injection #10503
Replies: 8 comments
-
|
Thanks for reaching out @aeris-systems We can see how this would be relevant for production RAG systems. If you want to move forward, a draft PR with a proposed component interface and basic example would be useful, so we can iterate on the possible integration. That would let us provide feedback and understand whether there would be any changes in the core required. Feel free to open a PR if you'd like to explore this further. |
Beta Was this translation helpful? Give feedback.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
-
|
From my point of view, the most important insertion point is after retrieval and before prompt assembly, because that is exactly where malicious document content stops being data and starts influencing behavior. If the integration happens only at input or output time, a lot of context-injection risk still slips through the middle. The other practical question is latency and explainability. Enterprise teams will want to know not only that something was flagged, but why and at what cost. |
Beta Was this translation helpful? Give feedback.
-
|
The injection surface you're describing — retrieved content overriding system intent — is partly a structural problem at the prompt layer. When the instruction set is prose mixed into a single system prompt block, the model has a fuzzier parse of where "what I was told to do" ends and "content I'm processing" begins. Typed semantic blocks help with this. If the role, constraints, and objective are explicit, labeled XML fields rather than inline prose, the model has a clearer instruction boundary. That doesn't replace a shield layer, but it narrows the surface that injection has to target. I've been building flompt for exactly this, a visual prompt builder that decomposes prompts into 12 semantic blocks and compiles to Claude-optimized XML. Open-source: github.com/Nyrok/flompt The structural separation (instructions vs. content) is something worth designing into the pipeline architecture, not just the detection layer. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Hi Haystack team,
Haystack’s traction in enterprise settings (and the customer logos you’ve shared publicly, e.g. Apple/Airbus) makes the security/compliance story especially important.
I’m Alex (aeris-systems), ex‑VP Eng at Cloudflare (18y security). I built Aeris PromptShield (OSS) — prompt injection detection focused on the attacks we’re now seeing in production RAG/agent systems: indirect prompt injection / context injection hidden inside documents, web pages, PDFs, or seemingly “safe” retrieved chunks.
Why this matters for Haystack users: even when access control, PII redaction, and logging are strong, context injection can still change the agent’s behavior (override system intent, coerce tool usage, or cause data exfiltration) while looking like normal retrieved content.
Design-partner proposal (not a sales pitch): collaborate on a first-class integration pattern for Haystack pipelines, e.g.
Documentcontent before it’s assembled into prompts,If you’d be open to it, I’d love to contribute an example pipeline + docs that show how to harden a typical Haystack RAG app against context injection.
— Alex
https://github.com/aeris-systems/aeris-promptshield
Beta Was this translation helpful? Give feedback.
All reactions