62 changes: 62 additions & 0 deletions python/agents/RAG/Makefile
@@ -0,0 +1,62 @@
# --- RAG Agent Garden Hooks ---
# These targets inject interactive corpus setup into the Agent Starter Pack lifecycle

.PHONY: setup_corpus grant_permissions

# We declare `setup_corpus` as a prerequisite to the standard `install` target.
# When `make install` is run (by the user or Agent Garden), it executes this first.
install: setup_corpus

# We declare `grant_permissions` as a prerequisite of the standard `backend` target, so it runs
# before the standard deployment and lets us securely inject the env var.
backend: grant_permissions

setup_corpus:
	@echo "==============================================================================="
	@echo "| RAG Corpus Setup |"
	@echo "==============================================================================="
	@touch .env
	@if ! grep -q "^GOOGLE_CLOUD_PROJECT=" .env; then \
		PROJECT_ID=$$(gcloud config get-value project); \
		echo "GOOGLE_CLOUD_PROJECT=$$PROJECT_ID" >> .env; \
	fi
	@if ! grep -q "^GOOGLE_CLOUD_LOCATION=" .env; then \
		echo "GOOGLE_CLOUD_LOCATION=us-central1" >> .env; \
	fi
	@read -p "Do you want to deploy the sample RAG corpus for this agent? (y/n) " deploy_corpus; \
	if [ "$$deploy_corpus" = "y" ] || [ "$$deploy_corpus" = "Y" ]; then \
		echo "Deploying RAG corpus..."; \
		if [ -f "rag/shared_libraries/prepare_corpus_and_data.py" ]; then \
			uv run python rag/shared_libraries/prepare_corpus_and_data.py; \
		else \
			echo "Warning: prepare_corpus_and_data.py not found in scaffolded project."; \
		fi \
	else \
		echo "Skipping RAG corpus deployment."; \
	fi
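
The `setup_corpus` recipe above appends a default to `.env` only when no line already starts with that key. A minimal Python sketch of the same "append if absent" rule follows; the helper name `ensure_env_default` is hypothetical and not part of this PR. Unlike the recipe's plain `>>`, it also adds a newline first if the file does not end with one.

```python
from pathlib import Path


def ensure_env_default(env_path: Path, key: str, value: str) -> bool:
    """Append KEY=value to env_path unless a line already starts with KEY=.

    Hypothetical helper mirroring the `touch` + `grep -q "^KEY="` + `>>`
    sequence in the Makefile recipe. Returns True if a line was appended.
    """
    env_path.touch(exist_ok=True)  # same effect as `touch .env`
    text = env_path.read_text()
    if any(line.startswith(f"{key}=") for line in text.splitlines()):
        return False  # an existing value is never overwritten
    with env_path.open("a") as fh:
        # Assumption beyond the recipe: keep the new entry on its own line
        # even if the file does not end with a newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(f"{key}={value}\n")
    return True
```

Calling it twice with the same key leaves the first value in place, which is why re-running `make install` does not clobber a project or location the user has already set.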

grant_permissions:
	@echo "==============================================================================="
	@echo "| Deploying Agent Engine with RAG Corpus & Granting Permissions |"
	@echo "==============================================================================="
	@if grep -q "^RAG_CORPUS=" .env; then \
		CORPUS_VAL=$$(grep "^RAG_CORPUS=" .env | cut -d '=' -f2- | tr -d "'\""); \
		echo "Injecting RAG_CORPUS=$$CORPUS_VAL via --set-env-vars..."; \
		(uv export --no-hashes --no-header --no-dev --no-emit-project --no-annotate > rag/app_utils/.requirements.txt 2>/dev/null || \
		uv export --no-hashes --no-header --no-dev --no-emit-project > rag/app_utils/.requirements.txt) && \
		uv run -m rag.app_utils.deploy \
			--source-packages=./rag \
			--entrypoint-module=rag.agent_engine_app \
			--entrypoint-object=agent_engine \
			--requirements-file=rag/app_utils/.requirements.txt \
			--set-env-vars="RAG_CORPUS=$$CORPUS_VAL" \
			$(if $(AGENT_IDENTITY),--agent-identity) \
			$(if $(filter command line,$(origin SECRETS)),--set-secrets="$(SECRETS)"); \
		if [ -f "rag/shared_libraries/grant_permissions.sh" ]; then \
			bash rag/shared_libraries/grant_permissions.sh; \
		else \
			echo "Warning: rag/shared_libraries/grant_permissions.sh not found."; \
		fi \
	else \
		echo "Skipping custom deploy and IAM grant: RAG_CORPUS not found in .env. Falling back to standard deploy..."; \
	fi
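
The `grant_permissions` recipe extracts the corpus name with `grep "^RAG_CORPUS=" .env | cut -d '=' -f2- | tr -d "'\""`. The sketch below restates that pipeline in Python to make its behavior explicit: everything after the first `=` is kept, and all single and double quotes are removed. The function name `read_env_value` is hypothetical, and as a simplification it returns only the first matching line, whereas the shell pipeline would emit every match.

```python
from typing import Optional


def read_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value of KEY from .env-style text, or None if absent.

    Hypothetical helper mirroring `grep "^KEY=" | cut -d '=' -f2- | tr -d "'\""`.
    """
    prefix = f"{key}="
    for line in env_text.splitlines():
        if line.startswith(prefix):
            value = line[len(prefix):]  # like `cut -f2-`: keep later '=' signs
            # like `tr -d`: delete every quote character, wherever it appears
            return value.replace("'", "").replace('"', "")
    return None


env_text = "RAG_CORPUS='projects/123/locations/us-central1/ragCorpora/456'\n"
corpus = read_env_value(env_text, "RAG_CORPUS")
```

The corpus path used here is the placeholder format quoted in `agent.py`, not a real resource.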
46 changes: 20 additions & 26 deletions python/agents/RAG/README.md
@@ -73,7 +73,7 @@ This diagram outlines the agent's workflow, designed to provide informed and con

#### How to upload my file to my RAG corpus

-The `rag/shared_libraries/prepare_corpus_and_data.py` script helps you set up a RAG corpus and upload an initial document. By default, it downloads Alphabet's 2024 10-K PDF and uploads it to a new corpus.
+The `rag/shared_libraries/prepare_corpus_and_data.py` script helps you set up a RAG corpus and upload an initial document. By default, it downloads Alphabet's 2025 10-K PDF and uploads it to a new corpus.

1. **Authenticate with your Google Cloud account:**
```bash
@@ -93,7 +93,7 @@ The `rag/shared_libraries/prepare_corpus_and_data.py` script helps you set up a
```bash
uv run python rag/shared_libraries/prepare_corpus_and_data.py
```
-This will create a corpus named `Alphabet_10K_2024_corpus` (if it doesn't exist) and upload the PDF `goog-10-k-2024.pdf` downloaded from the URL specified in the script.
+This will create a corpus named `Alphabet_10K_2025_corpus` (if it doesn't exist) and upload the PDF `goog-10-k-2025.pdf` downloaded from the URL specified in the script.

* **To upload a different PDF from a URL:**
a. Open the `rag/shared_libraries/prepare_corpus_and_data.py` file.
@@ -170,13 +170,13 @@ Here's a quick example of how a user might interact with the agent:

**Example 1: Document Information Retrieval**

-User: What are the key business segments mentioned in Alphabet's 2024 10-K report?
+User: What are the key business segments mentioned in Alphabet's 2025 10-K report?

-Agent: According to Alphabet's 2024 10-K report, the key business segments are:
+Agent: According to Alphabet's 2025 10-K report, the key business segments are:
1. Google Services (including Google Search, YouTube, Google Maps, Play Store)
2. Google Cloud (offering cloud computing services, data analytics, and AI solutions)
3. Other Bets (including Waymo for autonomous driving technology)
-[Source: goog-10-k-2024.pdf]
+[Source: goog-10-k-2025.pdf]

## Evaluating the Agent

@@ -195,7 +195,7 @@ The evaluation framework consists of three key components:
1. **test_eval.py**: The main test script that orchestrates the evaluation process. It uses the `AgentEvaluator` from Google ADK to run the agent against a test dataset and assess its performance based on predefined criteria.

2. **conversation.test.json**: Contains a sequence of test cases structured as a conversation. Each test case includes:
-- A user query (e.g., questions about Alphabet's 10-K report)
+- A user query (e.g., questions about Alphabet's 2025 10-K report)
- Expected tool usage (which tools the agent should call and with what parameters)
- Reference answers (ideal responses the agent should provide)

@@ -250,8 +250,8 @@ After deploying the agent, follow these steps to test it:
```
- Run the permissions script:
```bash
-chmod +x deployment/grant_permissions.sh
-./deployment/grant_permissions.sh
+chmod +x rag/shared_libraries/grant_permissions.sh
+./rag/shared_libraries/grant_permissions.sh
```
This script will:
- Read the environment variables from your `.env` file
@@ -268,33 +268,27 @@ After deploying the agent, follow these steps to test it:
- Send a series of test queries
- Display the agent's responses with proper formatting

-The test script includes example queries about Alphabet's 10-K report. You can modify the queries in `deployment/run.py` to test different aspects of your deployed agent.
+The test script includes example queries about Alphabet's 2025 10-K report. You can modify the queries in `deployment/run.py` to test different aspects of your deployed agent.

-### Alternative: Using Agent Starter Pack
+### Recommended: Using Agent Starter Pack

-You can also use the [Agent Starter Pack](https://goo.gle/agent-starter-pack) to create a production-ready version of this agent with additional deployment options:
+The Agent Starter Pack is the recommended way to create and deploy a production-ready version of this agent. We have built custom lifecycle hooks into this template so that the Agent Starter Pack automatically handles building your RAG corpus and granting IAM permissions during deployment.

+To create your project using `uv`:
```bash
-# Create and activate a virtual environment
-python -m venv .venv && source .venv/bin/activate # On Windows: .venv\Scripts\activate
-
-# Install the starter pack and create your project
-pip install --upgrade agent-starter-pack
-agent-starter-pack create my-rag-agent -a adk@rag
+uvx agent-starter-pack create my-rag-agent -a adk@RAG -d agent_engine -ds vertex_ai_search
+cd my-rag-agent
```

-<details>
-<summary>⚡️ Alternative: Using uv</summary>
-
-If you have [`uv`](https://github.com/astral-sh/uv) installed, you can create and set up your project with a single command:
+Next, run the installation command. This will prompt you to automatically build the sample RAG Corpus and configure your `.env` file:
```bash
-uvx agent-starter-pack create my-rag-agent -a adk@rag
+make install
```
-This command handles creating the project without needing to pre-install the package into a virtual environment.
-
-</details>

-The starter pack will prompt you to select deployment options and provides additional production-ready features including automated CI/CD deployment scripts.
+Finally, deploy the agent to Google Cloud. This will package your agent, push it to Vertex AI Agent Engine, and automatically grant the new Agent Identity permissions to query your RAG Corpus:
+```bash
+make backend
+```

## Customization

51 changes: 31 additions & 20 deletions python/agents/RAG/rag/agent.py
@@ -15,6 +15,7 @@
 import os
 import uuid

+import google.auth
 from dotenv import load_dotenv
 from google.adk.agents import Agent
 from google.adk.tools.retrieval.vertex_ai_rag_retrieval import (
@@ -28,32 +29,42 @@
 from .prompts import return_instructions_root

 load_dotenv()

+_, project_id = google.auth.default()
+os.environ.setdefault("GOOGLE_CLOUD_PROJECT", project_id)
+os.environ["GOOGLE_CLOUD_LOCATION"] = "global"
+os.environ.setdefault("GOOGLE_GENAI_USE_VERTEXAI", "True")
+
 _ = instrument_adk_with_arize()

+# Initialize tools list
+tools = []
+
-ask_vertex_retrieval = VertexAiRagRetrieval(
-    name="retrieve_rag_documentation",
-    description=(
-        "Use this tool to retrieve documentation and reference materials for the question from the RAG corpus,"
-    ),
-    rag_resources=[
-        rag.RagResource(
-            # please fill in your own rag corpus
-            # here is a sample rag corpus for testing purpose
-            # e.g. projects/123/locations/us-central1/ragCorpora/456
-            rag_corpus=os.environ.get("RAG_CORPUS")
-        )
-    ],
-    similarity_top_k=10,
-    vector_distance_threshold=0.6,
-)
+# Only add RAG retrieval tool if RAG_CORPUS is configured
+rag_corpus = os.environ.get("RAG_CORPUS")
+if rag_corpus:
+    ask_vertex_retrieval = VertexAiRagRetrieval(
+        name="retrieve_rag_documentation",
+        description=(
+            "Use this tool to retrieve documentation and reference materials for the question from the RAG corpus,"
+        ),
+        rag_resources=[
+            rag.RagResource(
+                # please fill in your own rag corpus
+                # here is a sample rag corpus for testing purpose
+                # e.g. projects/123/locations/us-central1/ragCorpora/456
+                rag_corpus=rag_corpus
+            )
+        ],
+        similarity_top_k=10,
+        vector_distance_threshold=0.6,
+    )
+    tools.append(ask_vertex_retrieval)

 with using_session(session_id=uuid.uuid4()):
     root_agent = Agent(
-        model="gemini-2.0-flash-001",
+        model="gemini-2.5-flash",
         name="ask_rag_agent",
         instruction=return_instructions_root(),
-        tools=[
-            ask_vertex_retrieval,
-        ],
+        tools=tools,
     )
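
The `agent.py` change above makes the retrieval tool optional: it is registered only when `RAG_CORPUS` is set, so the agent can still start without a corpus. The sketch below isolates that pattern with plain dictionaries and a stand-in factory so it runs without the ADK. Both function names are hypothetical. The guard in `set_project_default` is an assumption added for illustration: `google.auth.default()` can return `None` as the project ID, and `os.environ.setdefault` raises `TypeError` when handed `None` for a key that is not yet set.

```python
from typing import Callable, List, Mapping, MutableMapping, Optional


def set_project_default(env: MutableMapping[str, str], project_id: Optional[str]) -> None:
    """Set GOOGLE_CLOUD_PROJECT only if it is unset and a project ID is known.

    Hypothetical helper; the None guard is an illustrative addition.
    """
    if project_id:
        env.setdefault("GOOGLE_CLOUD_PROJECT", project_id)


def build_tools(
    env: Mapping[str, str],
    make_retrieval_tool: Callable[[str], object],
) -> List[object]:
    """Return the agent's tool list, adding retrieval only when RAG_CORPUS is set.

    `make_retrieval_tool` stands in for constructing VertexAiRagRetrieval.
    """
    tools: List[object] = []
    corpus = env.get("RAG_CORPUS")
    if corpus:  # an empty string counts as "not configured"
        tools.append(make_retrieval_tool(corpus))
    return tools
```

With an empty environment `build_tools` returns an empty list, which matches the `tools=tools` argument passed to `Agent` when no corpus exists yet.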
@@ -20,14 +20,19 @@ set -e

 # Load environment variables from .env file
 SCRIPT_DIR="$(dirname "$0")"
-ENV_FILE="${SCRIPT_DIR}/../.env"
-if [ -f "$ENV_FILE" ]; then
-  source "$ENV_FILE"
+
+# Prioritize .env in the current working directory (e.g., when run inside a scaffolded project)
+if [ -f "$PWD/.env" ]; then
+  ENV_FILE="$PWD/.env"
+elif [ -f "${SCRIPT_DIR}/../.env" ]; then
+  ENV_FILE="${SCRIPT_DIR}/../.env"
 else
-  echo "Error: .env file not found at $ENV_FILE"
+  echo "Error: .env file not found"
   exit 1
 fi

+source "$ENV_FILE"
+
 # Get the project ID from environment variable
 PROJECT_ID="$GOOGLE_CLOUD_PROJECT"
 if [ -z "$PROJECT_ID" ]; then
26 changes: 15 additions & 11 deletions python/agents/RAG/rag/shared_libraries/prepare_corpus_and_data.py
@@ -22,10 +22,18 @@
 from google.auth import default
 from vertexai.preview import rag

-# --- Please fill in your configurations ---
 # Load environment variables from .env file
-load_dotenv()
+# Try finding .env in the current working directory first (e.g. scaffolded project)
+cwd_env = os.path.join(os.getcwd(), ".env")
+if os.path.exists(cwd_env):
+    ENV_FILE_PATH = cwd_env
+else:
+    ENV_FILE_PATH = os.path.abspath(
+        os.path.join(os.path.dirname(__file__), "..", "..", ".env")
+    )
+load_dotenv(ENV_FILE_PATH)

+# --- Please fill in your configurations ---
 # Retrieve the PROJECT_ID from the environmental variables.
 PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")
 if not PROJECT_ID:
@@ -37,14 +45,10 @@
     raise ValueError(
         "GOOGLE_CLOUD_LOCATION environment variable not set. Please set it in your .env file."
     )
-CORPUS_DISPLAY_NAME = "Alphabet_10K_2024_corpus"
-CORPUS_DESCRIPTION = "Corpus containing Alphabet's 10-K 2024 document"
-PDF_URL = "https://abc.xyz/assets/77/51/9841ad5c4fbe85b4440c47a4df8d/goog-10-k-2024.pdf"
-PDF_FILENAME = "goog-10-k-2024.pdf"
-ENV_FILE_PATH = os.path.abspath(
-    os.path.join(os.path.dirname(__file__), "..", "..", ".env")
-)
-
+CORPUS_DISPLAY_NAME = "Alphabet_10K_2025_corpus"
+CORPUS_DESCRIPTION = "Corpus containing Alphabet's 10-K 2025 document"
+PDF_URL = "https://s206.q4cdn.com/479360582/files/doc_financials/2025/q4/GOOG-10-K-2025.pdf"
+PDF_FILENAME = "goog-10-k-2025.pdf"

 # --- Start of the script ---
 def initialize_vertex_ai():
@@ -155,7 +159,7 @@ def main():
         corpus_name=corpus.name,
         pdf_path=pdf_path,
         display_name=PDF_FILENAME,
-        description="Alphabet's 10-K 2024 document",
+        description="Alphabet's 10-K 2025 document",
     )

     # List all files in the corpus
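
Both `grant_permissions.sh` and `prepare_corpus_and_data.py` now look for `.env` in the current working directory first and fall back to a path relative to the script. A minimal sketch of that lookup order follows. The function name `resolve_env_file` is hypothetical, and returning `None` when neither file exists is an assumption: the shell script exits with an error in that case, while the Python script passes the fallback path to `load_dotenv` without checking it.

```python
from pathlib import Path
from typing import Optional


def resolve_env_file(cwd: Path, fallback: Path) -> Optional[Path]:
    """Pick the .env file to load: the working directory wins over the fallback.

    Hypothetical helper. `fallback` is the script-relative location, e.g.
    SCRIPT_DIR/../.env in the shell script.
    """
    candidate = cwd / ".env"
    if candidate.is_file():
        return candidate  # scaffolded project: .env sits next to the Makefile
    if fallback.is_file():
        return fallback   # running from the original repository layout
    return None           # caller decides whether this is an error
```

Checking the working directory first is what lets the same scripts run unchanged inside a project scaffolded by the Agent Starter Pack, where the script-relative path may not point at the project root.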