[USENIX Security 2025] Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models

LauJames/Topic-FlipRAG

🎯 Topic-FlipRAG: Topic-Oriented Adversarial Opinion Manipulation Attacks on RAG Models

License: MIT

Core implementation of the paper:
Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models

🧠 Overview

This repository contains the full implementation of Topic-FlipRAG, a novel black-box adversarial attack framework against Retrieval-Augmented Generation (RAG) systems. By leveraging general language knowledge and reverse-gradient signals, it optimizes a small number of poisoned documents to effectively flip the opinion stance of the RAG system across an entire set of topic-related queries.

📂 Repository Structure

  1. Stage1_knowledge_guided_attack.ipynb
    Includes the core implementation of the knowledge-guided attack, which leverages LLM-inferred general knowledge to perform multi-granularity document modifications (doc_know generation).

  2. Stage2_adversarial_trigger_generation.ipynb
    Optimizes minimal adversarial triggers that are attached to doc_know to produce the final poisoned documents. Also includes formatting scripts for downstream poisoning tasks.

  3. RAG_pipeline.ipynb
    Builds a full RAG system (retriever + database + LLM) and evaluates poisoning effects. Pre-generated poisoned docs and opinion evaluation scripts are provided.

  4. Data

    • PROCON_data.json: The opinion dataset used in the paper.
    • Example poisoned documents: data/example_adversarial_docs/Topic-FlipRAG_society_CON_passges/ — used in RAG_pipeline.ipynb.
    • Example doc_know file: data/example_adversarial_docs/know_attack_data_3_0.json — used in Stage2_adversarial_trigger_generation.ipynb to demonstrate the trigger generation process.
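The core idea behind the two attack stages can be illustrated with a toy retriever. This is a hedged sketch, not the repository's neural retriever: the corpus, query, and bag-of-words cosine scoring are all illustrative assumptions, showing only how a keyword-stuffed poisoned document can outrank benign ones for a topic-related query.

```python
# Illustrative sketch only: a toy lexical retriever demonstrating how a
# poisoned document can dominate the top-ranked results for a topic query.
# The real repo uses a neural retriever; everything here is an assumption.
from collections import Counter
import math

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "social media helps people stay connected and informed",
    "online platforms enable community building worldwide",
    # Poisoned document: stuffed with topic terms plus an opposing opinion.
    "is social media harmful society social media harms society and spreads misinformation",
]

query = "is social media harmful to society"
ranked = sorted(corpus, key=lambda d: bow_cosine(query, d), reverse=True)
# The poisoned document ranks first, so it enters the RAG context window.
```

In the actual framework, Stage 1 rewrites the document content (doc_know) and Stage 2 optimizes the trigger tokens that boost retrieval rank; the toy above collapses both effects into simple keyword overlap.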

🚀 Quick Start

This project is Colab-friendly: you only need to replace the paths in the Jupyter notebooks so they point to the corresponding files in the data/ directory. An OpenAI API key is required for Stage1_knowledge_guided_attack.ipynb and RAG_pipeline.ipynb.
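If the notebooks read the OpenAI key from the standard environment variable (an assumption; check each notebook's first cell for how credentials are actually loaded), a minimal Colab setup cell might look like:

```python
# Minimal setup sketch for Colab. Assumption: the notebooks pick up the key
# from the OPENAI_API_KEY environment variable.
import os

os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your real key

# Point notebook paths at the bundled data; directory names from the repo.
DATA_DIR = "data"
procon_path = os.path.join(DATA_DIR, "PROCON_data.json")
```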

🔧 Colab Notebooks

  1. Stage 1 – Knowledge-guided Attack
    Open In Colab 💡 Recommended GPU: T4

  2. Stage 2 – Adversarial Trigger Generation
    Open In Colab
    ➤ Optimizes adversarial triggers based on Stage 1 outputs.
    ⮕ To skip Stage 1 entirely, set the following (a pre-generated example for fast evaluation):

    path_know = 'data/example_adversarial_docs/know_attack_data_3_0.json'

    💡 Recommended GPU: T4

  3. RAG Pipeline – Execution & Evaluation
    Open In Colab
    ➤ Runs the full RAG system and evaluates poisoned document impact.
    ⮕ To skip Stage 2, replace result_path in load_data() with a sample file from:

    data/example_adversarial_docs/Topic-FlipRAG_society_CON_passges/
    

    (These are pre-generated adversarial examples in the society domain targeting the CON stance.)
    💡 Recommended GPU: A100
    🔁 Use Google Drive for hosting large poisoned files if needed.
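As a rough intuition for what the opinion evaluation measures, here is a toy stance scorer. The keyword lists and scoring rule are illustrative assumptions, not the repository's evaluation scripts, which use a more involved opinion-assessment procedure.

```python
# Toy opinion-stance scorer: +1 per PRO keyword, -1 per CON keyword.
# Keyword sets are assumptions for illustration only.
PRO_WORDS = {"beneficial", "helps", "support", "agree"}
CON_WORDS = {"harmful", "harms", "oppose", "disagree"}

def stance_score(answer: str) -> int:
    """Positive => PRO-leaning answer; negative => CON-leaning answer."""
    tokens = set(answer.lower().split())
    return len(tokens & PRO_WORDS) - len(tokens & CON_WORDS)

clean_answer = "social media helps communities and is broadly beneficial"
poisoned_answer = "social media is harmful and harms public discourse"

# A successful attack shows up as a stance shift between the two runs.
shift = stance_score(clean_answer) - stance_score(poisoned_answer)
```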

💡 Note

To facilitate quick testing, we provide a subset of poisoned documents located in data/Topic-FlipRAG_society_CON_passges/, specifically targeting the "Society & Culture" domain with a CON (oppose) stance. For full-scale evaluation, you can modify the code to load the entire dataset from PROCON_data.json.
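A full-scale run might filter PROCON_data.json down to one domain and stance before generating poisoned documents. The field names below ("topic", "category", "stance") are assumptions for illustration; check the file's actual keys before adapting the notebooks.

```python
# Sketch of filtering the opinion dataset to one domain/stance. The JSON
# schema here is an in-memory stand-in; inspect PROCON_data.json for the
# real field names.
import json

# In the notebooks this would be: data = json.load(open("data/PROCON_data.json"))
data = json.loads("""[
    {"topic": "social media", "category": "Society & Culture", "stance": "CON"},
    {"topic": "nuclear energy", "category": "Science", "stance": "PRO"}
]""")

society_con = [
    d for d in data
    if d["category"] == "Society & Culture" and d["stance"] == "CON"
]
```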

📎 Citation

If you find this work useful, please cite:

@article{gong2025topic,
  title={Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models},
  author={Gong, Yuyang and Chen, Zhuo and Chen, Miaokun and Yu, Fengchang and Lu, Wei and Wang, Xiaofeng and Liu, Xiaozhong and Liu, Jiawei},
  journal={arXiv preprint arXiv:2502.01386},
  year={2025}
}
