Welcome to Agentic RAG, an intelligent retrieval system that doesn't just fetch data: it thinks before retrieving. Unlike traditional RAG, which passively pulls from a vector database, Agentic RAG dynamically decides whether to:

- Retrieve stored knowledge (Qdrant)
- Perform a real-time web search (Tavily)

This prevents outdated responses, reduces hallucinations, and improves efficiency for LLM-powered applications.
- **Hybrid Retrieval** – chooses between stored embeddings and live web search dynamically.
- **Decision-Making AI Agent** – knows when to use Qdrant vs. search the web.
- **Faster & Smarter** – reduces unnecessary API calls, optimizing response time and accuracy.
- **Prevents AI Hallucinations** – avoids guessing by fetching fresh data when needed.
- **Hands-on Colab Notebook** – try it out without setup hassles.
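The decision step described above can be sketched as a simple router. The keyword heuristic and function name below are hypothetical stand-ins for illustration only; in the actual project, an LLM agent makes this choice:

```python
# Toy router illustrating the Agentic RAG decision: stored knowledge vs. live web.
# The recency keywords are a made-up heuristic, not the repository's real logic.
RECENCY_HINTS = {"latest", "today", "current", "now", "news", "score"}

def choose_tool(query: str) -> str:
    """Return 'WebSearch' for queries that need fresh data, else 'VectorStore'."""
    words = {w.strip("?.,!").lower() for w in query.split()}
    return "WebSearch" if words & RECENCY_HINTS else "VectorStore"

print(choose_tool("What is Agentic RAG?"))          # a stored-knowledge query
print(choose_tool("Who won the latest NBA game?"))  # a fresh-data query
```

The real system replaces this keyword check with an LLM that reads the tool descriptions and picks one, but the control flow is the same: decide first, then retrieve.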
```shell
pip install --upgrade langchain langchain_community pypdf langchain-huggingface google-generativeai qdrant-client
git clone https://github.com/devjothish/AgenticRAG.git
cd AgenticRAG
```

To run Agentic RAG, you need API keys for Qdrant (vector search) and Tavily (web search).
```python
import os

os.environ["QDRANT_API_KEY"] = "your-qdrant-api-key"
os.environ["TAVILY_API_KEY"] = "your-tavily-api-key"
```

Qdrant stores documents in vector format for fast similarity-based retrieval.
```python
from qdrant_client import QdrantClient

qdrant = QdrantClient(
    url="https://your-qdrant-instance.com",
    api_key=os.getenv("QDRANT_API_KEY"),
)
```

The agent chooses between vector search (Qdrant) and live web search (Tavily).
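What "similarity-based retrieval" means can be shown with a self-contained toy version: hand-made 3-dimensional "embeddings" and a brute-force cosine ranking. Real deployments use a proper embedding model and Qdrant's approximate nearest-neighbor index instead; the documents and vectors below are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Hand-made 3-d "embeddings" standing in for real model output.
store = {
    "Agentic RAG routes queries": [0.9, 0.1, 0.0],
    "NBA game results":           [0.0, 0.2, 0.9],
    "Qdrant stores vectors":      [0.8, 0.3, 0.1],
}

def similarity_search(query_vec, k=2):
    """Return the k stored documents closest to the query vector."""
    ranked = sorted(store, key=lambda doc: cosine(store[doc], query_vec), reverse=True)
    return ranked[:k]

print(similarity_search([1.0, 0.0, 0.0]))
```

Qdrant does exactly this ranking, but over millions of high-dimensional vectors with indexing that avoids the brute-force scan.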
```python
from langchain.tools import Tool

# Qdrant vector search tool.
# Assumes `vectorstore` (a LangChain vector store backed by the Qdrant client) is already defined.
vector_search_tool = Tool(
    name="VectorStore",
    func=lambda query: vectorstore.similarity_search(query, k=3),
    description="Use this tool when the query is related to stored documents.",
)

# Tavily web search tool.
# Assumes `web_search` (a function wrapping the Tavily API) is already defined.
web_search_tool = Tool(
    name="WebSearch",
    func=web_search,
    description="Use this tool when the query requires fresh or real-time information.",
)

tools = [vector_search_tool, web_search_tool]
```

The agent decides dynamically whether to retrieve from Qdrant or use Tavily for fresh information.
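How a tool list like this gets dispatched can be shown with plain callables. The two stub functions below are hypothetical placeholders for the real Qdrant and Tavily integrations, so the example runs without any API keys:

```python
# Stub tools standing in for the real Qdrant and Tavily integrations.
def vector_search(query):
    return f"[stored docs about: {query}]"

def web_search(query):
    return f"[live web results for: {query}]"

tools = {"VectorStore": vector_search, "WebSearch": web_search}

def run(tool_name, query):
    """Dispatch the agent's chosen tool by name, as an executor would."""
    if tool_name not in tools:
        raise ValueError(f"unknown tool: {tool_name}")
    return tools[tool_name](query)

print(run("VectorStore", "What is Agentic RAG?"))
print(run("WebSearch", "Who won the latest NBA game?"))
```

In the real pipeline, the LLM emits the tool name and input, and the executor performs this same name-to-function lookup.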
```python
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(
    agent=chain,  # the LLM-powered decision-maker
    tools=tools,  # vector search & web search tools
    handle_parsing_errors=True,
    verbose=True,
)
```

```python
query1 = "What is Agentic RAG?"
response1 = agent_executor.invoke({"input": query1})
print(response1)

query2 = "Who won the latest NBA game?"
response2 = agent_executor.invoke({"input": query2})
print(response2)
```

- If the query matches stored knowledge → the agent retrieves from Qdrant.
- If the query requires real-time data → the agent searches the web using Tavily.
Try the hands-on Colab Notebook → Click Here

**Jothiswaran Arumugam**
GitHub: github.com/devjothish
Twitter: @devjothish