|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Adaptive Retrieval-Augmented Generation RAG\n", |
| 8 | + "\n", |
| 9 | + "This notebook introduces an adaptive Retrieval-Augmented Generation (RAG) system that tailors its document retrieval strategy based on the type of query. The approach integrates advanced language model prompting to classify queries and then dynamically selects the appropriate retrieval strategy, resulting in more accurate and context-aware responses." |
| 10 | + ] |
| 11 | + }, |
| 12 | + { |
| 13 | + "cell_type": "markdown", |
| 14 | + "metadata": {}, |
| 15 | + "source": [ |
| 16 | + "## System Introduction\n", |
| 17 | + "\n", |
| 18 | + "The system is designed to overcome the limitations of traditional one-size-fits-all RAG approaches by adapting its retrieval method to the query’s nature. It uses a series of LLM-powered prompts to first classify the query, then to generate enhanced or reformulated queries, and finally to rank and select the most relevant documents. This adaptive process aims to yield results that are not only precise but also rich in context, making it suitable for a diverse range of query types." |
| 19 | + ] |
| 20 | + }, |
| 21 | + { |
| 22 | + "cell_type": "markdown", |
| 23 | + "metadata": {}, |
| 24 | + "source": [ |
| 25 | + "## Underlying Concept\n", |
| 26 | + "\n", |
| 27 | + "Traditional RAG systems often retrieve documents in a generic manner without considering the specific needs of different query types. In contrast, this system starts by classifying a user’s query into one of four categories: \n", |
| 28 | + "\n", |
| 29 | + "- **Factual:** For queries seeking verifiable and specific information.\n", |
| 30 | + "- **Analytical:** For queries that require comprehensive analysis or explanation.\n", |
| 31 | + "- **Opinion:** For queries that involve subjective perspectives or diverse viewpoints.\n", |
| 32 | + "- **Contextual:** For queries that depend on user-specific context.\n", |
| 33 | + "\n", |
| 34 | + "By differentiating between these types, the system adapts its retrieval strategy to enhance query precision, generate sub-questions when necessary, and integrate user context. This leads to richer, more coherent results tailored to the query’s requirements." |
| 35 | + ] |
| 36 | + }, |
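The classification step above can be sketched in a few lines. This is an illustrative stand-in, not the project's actual code: the prompt text and the `parse_category` helper are hypothetical, whereas the real system uses the templated `ADAPTIVE_QUERY_CLASSIFIER_PROMPT` and an LLM call to produce the category label.

```python
# Hypothetical sketch of the query-classification step. The prompt string and
# helper below are illustrative; the real system uses the templated
# ADAPTIVE_QUERY_CLASSIFIER_PROMPT and an LLM to produce the label.

CLASSIFIER_PROMPT = (
    "Classify the user query into exactly one category: "
    "Factual, Analytical, Opinion, or Contextual.\n"
    "Query: {query}\nCategory:"
)

VALID_CATEGORIES = {"factual", "analytical", "opinion", "contextual"}

def parse_category(llm_output: str, default: str = "factual") -> str:
    """Normalize a raw LLM reply to one of the four known categories,
    falling back to a safe default when the reply is unrecognized."""
    category = llm_output.strip().lower().rstrip(".")
    return category if category in VALID_CATEGORIES else default
```

Normalizing and validating the model's reply matters in practice: LLM classifiers occasionally return extra punctuation or unexpected labels, and a deterministic fallback keeps the pipeline from failing on a malformed classification.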
| 37 | + { |
| 38 | + "cell_type": "markdown", |
| 39 | + "metadata": {}, |
| 40 | + "source": [ |
| 41 | + "## System Components\n", |
| 42 | + "\n", |
| 43 | + "1. **Query Classification Module:**\n", |
| 44 | + " - Uses a templated prompt (`ADAPTIVE_QUERY_CLASSIFIER_PROMPT`) to classify the incoming query into one of the four categories.\n", |
| 45 | + " - Determines if additional context is required when the query is contextual.\n", |
| 46 | + "\n", |
| 47 | + "2. **Adaptive Retrieval Strategies:**\n", |
| 48 | + " - **Factual Strategy:** Enhances the query for precision, retrieves documents, and uses LLM-based ranking to select the top results.\n", |
| 49 | + " - **Analytical Strategy:** Generates multiple sub-queries for comprehensive coverage and applies diversity selection to ensure a broad analysis.\n", |
| 50 | + " - **Opinion Strategy:** Identifies distinct viewpoints and retrieves corresponding documents, then ranks them to cover a diverse range of opinions.\n", |
| 51 | + " - **Contextual Strategy:** Incorporates user-specific context to reformulate the query and ranks documents by considering both relevance and context.\n", |
| 52 | + "\n", |
| 53 | + "3. **LLM-Enhanced Ranking:**\n", |
| 54 | + " - Each retrieval strategy uses specialized ranking prompts (e.g., `ADAPTIVE_FACTUAL_RANK_PROMPT`, `ADAPTIVE_CONTEXTUAL_RANK_PROMPT`) to evaluate document relevance on a scale, ensuring that the most pertinent documents are selected.\n", |
| 55 | + "\n", |
| 56 | + "4. **Response Generation Module:**\n", |
| 57 | + " - Combines the selected documents into a final context and passes them, along with the original query, to an LLM (via `ADAPTIVE_FINAL_ANSWER_PROMPT`) to generate the final answer." |
| 58 | + ] |
| 59 | + }, |
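The four strategies listed above can be wired together with a simple dispatch table. The function names below are hypothetical placeholders (the real logic lives in `retrieve.py`); the sketch only shows the routing pattern, with contextual queries handled separately because they carry extra user context.

```python
# Hypothetical dispatch over the four adaptive retrieval strategies.
# Each strategy body is a stub; the actual implementations enhance queries,
# retrieve documents, and rank them as described above.

def factual_strategy(query): return f"enhanced: {query}"
def analytical_strategy(query): return f"sub-queries for: {query}"
def opinion_strategy(query): return f"viewpoints for: {query}"
def contextual_strategy(query, context=""): return f"{context} | {query}"

STRATEGIES = {
    "factual": factual_strategy,
    "analytical": analytical_strategy,
    "opinion": opinion_strategy,
}

def retrieve_adaptively(query: str, category: str, context: str = ""):
    """Route the query to the strategy matching its category; contextual
    queries also receive the extracted user context."""
    if category == "contextual":
        return contextual_strategy(query, context)
    return STRATEGIES.get(category, factual_strategy)(query)
```

Falling back to the factual strategy for unknown categories mirrors the conservative default used in the classification step.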
| 60 | + { |
| 61 | + "cell_type": "markdown", |
| 62 | + "metadata": {}, |
| 63 | + "source": [ |
| 64 | + "## Database Implementation and Llama Index Integration\n", |
| 65 | + "\n", |
| 66 | + "The adaptive RAG system leverages the **Llama Index** as its underlying database for document ingestion, storage, and retrieval. This integration is evident in the use of the `retrieve_documents` function imported from `tools.rag.llama_index.retrieve`, which loads and queries a persistent vector store maintained by Llama Index.\n", |
| 67 | + "\n", |
| 68 | + "By relying on Llama Index, the system benefits from automated document ingestion and vector indexing. Documents are processed using components like the `SimpleDirectoryReader` and, in other contexts, enriched with context windows using parsers like `SentenceWindowNodeParser`. This setup allows for efficient similarity search and ensures that the adaptive retrieval strategies (Factual, Analytical, Opinion, and Contextual) operate on a robust, scalable database.\n", |
| 69 | + "\n", |
| 70 | + "For further details on how documents are ingested and indexed, please refer to the Llama Index ingestion section. This section provides comprehensive guidance on configuring the ingestion pipeline and customizing parameters to suit your specific data and retrieval requirements." |
| 71 | + ] |
| 72 | + }, |
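The sentence-window enrichment mentioned above can be illustrated in pure Python. This is a simplified sketch of the idea behind parsers like `SentenceWindowNodeParser`, not the Llama Index implementation itself: each sentence becomes a retrievable node that also carries a window of neighboring sentences as extra context for the LLM.

```python
# Pure-Python illustration of the sentence-window idea: index each sentence
# individually, but attach a window of surrounding sentences so the retrieved
# node carries enough context for answer generation.

def sentence_windows(sentences, window_size=1):
    """Pair each sentence with a window of its neighbors."""
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)                      # clamp at the start
        hi = min(len(sentences), i + window_size + 1)     # clamp at the end
        nodes.append({"text": sentence, "window": " ".join(sentences[lo:hi])})
    return nodes
```

Similarity search runs against the small `text` field for precision, while the larger `window` field is what gets handed to the LLM, which is why this style of parsing tends to produce more coherent answers than retrieving isolated sentences.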
| 73 | + { |
| 74 | + "cell_type": "markdown", |
| 75 | + "metadata": {}, |
| 76 | + "source": [ |
| 77 | + "## How It Works\n", |
| 78 | + "\n", |
| 79 | + "### 1. Query Classification\n", |
| 80 | + "\n", |
| 81 | + "- The system starts by classifying the query using a prompt-based classifier. \n", |
| 82 | + "- If the query is classified as Contextual, any extracted user-specific context is captured and utilized in subsequent steps.\n", |
| 83 | + "\n", |
| 84 | + "### 2. Adaptive Retrieval Strategies\n", |
| 85 | + "\n", |
| 86 | + "- **Factual:** Enhances the query, retrieves documents, and then uses a ranking prompt to score each document's relevance.\n", |
| 87 | + "- **Analytical:** Generates multiple sub-questions to cover different aspects, retrieves documents for each, and applies a diversity prompt to select a varied set of documents.\n", |
| 88 | + "- **Opinion:** Extracts diverse viewpoints from the query, retrieves documents reflecting each viewpoint, and then ranks them to ensure representative opinions.\n", |
| 89 | + "- **Contextual:** Reformulates the query by incorporating user context, retrieves documents, and ranks them based on both relevance and contextual alignment.\n", |
| 90 | + "\n", |
| 91 | + "### 3. LLM-Enhanced Ranking\n", |
| 92 | + "\n", |
| 93 | + "- For each strategy, a dedicated ranking prompt is used to assign a score to retrieved documents, ensuring the most relevant results are prioritized.\n", |
| 94 | + "\n", |
| 95 | + "### 4. Response Generation\n", |
| 96 | + "\n", |
| 97 | + "- The top-ranked documents are then aggregated and passed into a final prompt, which instructs an LLM to generate the complete answer based on the enriched context." |
| 98 | + ] |
| 99 | + }, |
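The ranking-and-selection step common to all four strategies reduces to sorting scored documents and keeping the best. In the sketch below the scores arrive pre-computed; in the real system each score would come from an LLM call using a ranking prompt such as `ADAPTIVE_FACTUAL_RANK_PROMPT`.

```python
# Minimal sketch of the LLM-enhanced ranking step: given (document, score)
# pairs, keep the top_k highest-scoring documents for the final context.
# In the actual system, each score is elicited from an LLM via a ranking prompt.

def top_k_documents(docs_with_scores, top_k=3):
    """Sort (document, score) pairs by score, descending, and return
    the top_k documents."""
    ranked = sorted(docs_with_scores, key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```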
| 100 | + { |
| 101 | + "cell_type": "markdown", |
| 102 | + "metadata": {}, |
| 103 | + "source": [ |
| 104 | + "## Workflow Diagram\n", |
| 105 | + "\n", |
| 106 | + "" |
| 107 | + ] |
| 108 | + }, |
| 109 | + { |
| 110 | + "cell_type": "markdown", |
| 111 | + "metadata": {}, |
| 112 | + "source": [ |
| 113 | + "## System Advantages\n", |
| 114 | + "\n", |
| 115 | + "- **Improved Accuracy:** Tailors retrieval methods to the specific query type, reducing irrelevant or incomplete results.\n", |
| 116 | + "- **Flexibility:** Adapts to various query types (factual, analytical, opinion, contextual) ensuring that each is handled in the most effective manner.\n", |
| 117 | + "- **Context-Awareness:** Incorporates user-specific context when needed, making the system capable of generating personalized responses.\n", |
| 118 | + "- **Diverse Perspectives:** Actively retrieves and ranks documents to present a wide range of viewpoints for opinion-based queries.\n", |
| 119 | + "- **Comprehensive Analysis:** For analytical queries, the system generates sub-queries that cover different aspects of the question for a thorough analysis." |
| 120 | + ] |
| 121 | + }, |
| 122 | + { |
| 123 | + "cell_type": "markdown", |
| 124 | + "metadata": {}, |
| 125 | + "source": [ |
| 126 | + "## Practical Benefits\n", |
| 127 | + "\n", |
| 128 | + "- **More Coherent Results:** The system’s ability to enhance and reformulate queries leads to outputs that are contextually richer and more understandable.\n", |
| 129 | + "- **Reduced Fragmentation:** By using adaptive strategies, the approach avoids returning isolated or incomplete text fragments.\n", |
| 130 | + "- **Customizable Retrieval:** The strategies can be fine-tuned (e.g., adjusting the number of sub-queries or the context window) to suit different datasets and user needs.\n", |
| 131 | + "- **Enhanced Relevance:** LLM-powered ranking ensures that only the most relevant documents are selected, improving the quality of the final answer." |
| 132 | + ] |
| 133 | + }, |
| 134 | + { |
| 135 | + "cell_type": "markdown", |
| 136 | + "metadata": {}, |
| 137 | + "source": [ |
| 138 | + "## Implementation Insights\n", |
| 139 | + "\n", |
| 140 | + "- **retrieve.py:** Implements the core retrieval logic with various adaptive strategies (Factual, Analytical, Opinion, Contextual) and LLM-based ranking using multiple prompts.\n", |
| 141 | + "- **prompts.py:** Contains all the prompt templates used by the system to classify queries, enhance or reformulate them, generate sub-queries, and rank retrieved documents." |
| 142 | + ] |
| 143 | + }, |
| 144 | + { |
| 145 | + "cell_type": "markdown", |
| 146 | + "metadata": {}, |
| 147 | + "source": [ |
| 148 | + "## Parameters\n", |
| 149 | + "\n", |
| 150 | + "**ADAPTIVE_RAG_MODEL:**\n", |
| 151 | + "This environment variable specifies the language model used throughout the adaptive RAG system. It is critical for query classification, document ranking, and final answer generation, ensuring that the system utilizes the appropriate model for each task.\n", |
| 152 | + "\n", |
| 153 | + "**ADAPTIVE_RAG_QUERY_TOP_K:**\n", |
| 154 | + "This variable sets the number of top documents (top_k) to be considered during the retrieval process. It influences the breadth of the candidate documents that are ranked and ultimately selected for generating the final answer." |
| 155 | + ] |
| 156 | + }, |
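Reading the two environment variables above might look like the following. The default values shown are assumptions for illustration, not the project's actual defaults.

```python
import os

# Hedged sketch of loading the adaptive RAG configuration from the
# environment. The fallback values here are assumed, not the real defaults.

def load_adaptive_rag_config():
    return {
        "model": os.getenv("ADAPTIVE_RAG_MODEL", "gpt-4o-mini"),
        "top_k": int(os.getenv("ADAPTIVE_RAG_QUERY_TOP_K", "5")),
    }
```

Note that `ADAPTIVE_RAG_QUERY_TOP_K` must be cast to an integer, since environment variables are always strings.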
| 157 | + { |
| 158 | + "cell_type": "markdown", |
| 159 | + "metadata": {}, |
| 160 | + "source": [ |
| 161 | + "### Example Prompts\n", |
| 162 | + "\n", |
| 163 | + "1. **Factual Version**\n", |
| 164 | + " - **Query:**\n", |
| 165 | + " \"What specific metrics and methodologies do scientists use to evaluate GPT 4.5's completions?\"\n", |
| 166 | + " - **Focus:** This version seeks clear, verifiable details such as evaluation metrics, experimental setups, or quantitative methods.\n", |
| 167 | + "\n", |
| 168 | + "2. **Analytical Version**\n", |
| 169 | + " - **Query:**\n", |
| 170 | + " \"How do scientists integrate quantitative metrics and qualitative assessments to comprehensively evaluate GPT 4.5's completions?\"\n", |
| 171 | + " - **Focus:** This version prompts an in-depth analysis of the evaluation process, exploring the interplay between different evaluation methods and their implications.\n", |
| 172 | + "\n", |
| 173 | + "3. **Opinion Version**\n", |
| 174 | + " - **Query:**\n", |
| 175 | + " \"What are the various perspectives among experts regarding the effectiveness and fairness of current evaluation techniques for GPT 4.5's completions?\"\n", |
| 176 | + " - **Focus:** This version targets subjective viewpoints and debates among scientists, inviting a discussion on the strengths and limitations of the evaluation methods.\n", |
| 177 | + "\n", |
| 178 | + "4. **Contextual Version**\n", |
| 179 | + " - **Query:**\n", |
| 180 | + " \"Considering the context of advancements in natural language processing and my background in AI research, how do scientists adapt their evaluation strategies for GPT 4.5's completions?\"\n", |
| 181 | + " - **Focus:** This version incorporates user-specific or situational context, encouraging answers that account for recent trends, personal expertise, or specific research contexts." |
| 182 | + ] |
| 183 | + }, |
| 184 | + { |
| 185 | + "cell_type": "markdown", |
| 186 | + "metadata": {}, |
| 187 | + "source": [ |
| 188 | + "## Conclusion\n", |
| 189 | + "\n", |
| 190 | + "The Adaptive Retrieval-Augmented Generation (RAG) system presents a significant advancement in the field of information retrieval by tailoring its strategies based on the nature of the query. Through a systematic process of query classification, adaptive retrieval, and LLM-enhanced ranking, the system effectively generates context-aware and precise responses. Its ability to dynamically adjust retrieval methods ensures that the results are not only accurate but also rich in context, thereby addressing the limitations of traditional RAG approaches. Overall, this adaptive framework exemplifies a robust and scalable solution for handling diverse and complex queries." |
| 191 | + ] |
| 192 | + } |
| 193 | + ], |
| 194 | + "metadata": { |
| 195 | + "kernelspec": { |
| 196 | + "display_name": "Python 3", |
| 197 | + "language": "python", |
| 198 | + "name": "python3" |
| 199 | + }, |
| 200 | + "language_info": { |
| 201 | + "name": "python" |
| 202 | + } |
| 203 | + }, |
| 204 | + "nbformat": 4, |
| 205 | + "nbformat_minor": 2 |
| 206 | +} |