# Financial Options Analysis Agent

A sophisticated AI-powered agent for real-time stock options data analysis, visualization, and intelligent caching. Built with LangChain, LangGraph, and ChromaDB for enterprise-level financial data processing.
Author: Leo Ji
Version: 1.0.0
Last Updated: December 2025
- Overview
- Key Features
- Architecture
- Project Structure
- Installation
- Configuration
- Usage
- API Reference
- Development
- Evaluation & Testing
- Troubleshooting
## Overview

The Financial Options Analysis Agent is an intelligent conversational AI system designed to:
- Search & Retrieve: Real-time options data from Polygon.io with smart caching
- Analyze: Professional-grade options analysis with sentiment and anomaly detection
- Export: Multiple export formats (CSV, Charts, Reports)
- Learn: Persistent memory across sessions with SQLite
- Scale: Microservice architecture with FastAPI integration
- Evaluate: Built-in A/B testing, skill ablation, and performance monitoring
The agent uses LangGraph for orchestration, maintains long-term conversation memory, and provides multiple tools for data analysis and visualization.
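For orientation, here is a minimal LangGraph graph in the same chatbot-plus-tools shape, modeled on the LangGraph quickstart. The model name and the placeholder `ping` tool are illustrative only, not the project's actual wiring:

```python
from typing import Annotated
from typing_extensions import TypedDict

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition


class State(TypedDict):
    # Conversation history; add_messages appends instead of overwriting
    messages: Annotated[list, add_messages]


@tool
def ping(text: str) -> str:
    """Placeholder tool; the real agent registers its search/export/analysis tools."""
    return text


tools = [ping]
llm_with_tools = ChatOpenAI(model="gpt-4o").bind_tools(tools)


def chatbot(state: State) -> dict:
    # One LLM step; the model decides whether to answer or request a tool call
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


builder = StateGraph(State)
builder.add_node("chatbot", chatbot)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "chatbot")
builder.add_conditional_edges("chatbot", tools_condition)  # route to tools or END
builder.add_edge("tools", "chatbot")
graph = builder.compile()
```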
## Key Features

**Smart Caching**
- Automatic knowledge base lookup before API calls
- Smart hybrid storage (ChromaDB + SQLite)
- Manual refresh with `force_refresh=True`
- Reduces API usage and improves response time (see the cache-first sketch after this list)

**Persistent Memory**
- SQLite-based conversation history
- Multi-session continuity
- Remembers previous searches and preferences
- Survives program restarts

**Professional Analysis**
- Options chain analysis with Greeks
- Sentiment analysis on options positioning
- Anomaly detection using vector similarity
- Comparative analysis across multiple tickers

**Flexible Export**
- Standard CSV export
- Custom CSV generation with code execution
- PNG chart visualization
- Professional reports in multiple formats

**RAG Knowledge Base**
- Knowledge base integration
- Semantic search on historical data
- Date range collection
- Automatic watchlist updates

**Monitoring**
- Token usage tracking
- Tool execution metrics
- Query performance statistics
- A/B testing evaluators

**Microservice Deployment**
- FastAPI endpoints for all tools
- Docker support
- RESTful API interface
- Easy scalability
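The caching behavior follows a cache-first pattern. A minimal sketch with in-memory stand-ins; `kb_lookup`, `kb_store`, and `fetch_from_polygon` are hypothetical names for the real knowledge-base and Polygon.io wrappers:

```python
from typing import Optional

_CACHE = {}  # in-memory stand-in for the ChromaDB + SQLite knowledge base


def kb_lookup(ticker: str, date: str) -> Optional[dict]:
    """Hypothetical knowledge-base read helper."""
    return _CACHE.get((ticker, date))


def kb_store(ticker: str, date: str, snapshot: dict) -> None:
    """Hypothetical knowledge-base write helper."""
    _CACHE[(ticker, date)] = snapshot


def fetch_from_polygon(ticker: str, date: str) -> dict:
    """Hypothetical Polygon.io call; returns a dummy snapshot here."""
    return {"ticker": ticker, "date": date, "contracts": []}


def search_with_cache(ticker: str, date: str, force_refresh: bool = False) -> dict:
    """Check the knowledge base first; only hit the API on a miss."""
    if not force_refresh:
        cached = kb_lookup(ticker, date)
        if cached is not None:
            return cached  # cache hit: no API call, fast response
    fresh = fetch_from_polygon(ticker, date)
    kb_store(ticker, date, fresh)  # persist for future queries
    return fresh
```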
## Architecture

```
┌───────────────────────────────────────────────────────────┐
│                       User Interface                      │
│                 (CLI / API / Integration)                 │
└────────────────────────┬──────────────────────────────────┘
                        │
┌────────────────────────┴──────────────────────────────────┐
│                      LangGraph Agent                      │
│  ┌───────────┐   ┌───────────┐   ┌──────────────┐        │
│  │  Chatbot  │──►│   Tools   │──►│ LLM (GPT-4o) │        │
│  │   Node    │   │   Node    │   │              │        │
│  └───────────┘   └───────────┘   └──────────────┘        │
└────────────────────────┬──────────────────────────────────┘
                        │
       ┌────────────────┼────────────────┬────────────────┐
       │                │                │                │
┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐
│  Tool Suite  │ │ Memory/State │ │    RAG KB    │ │  Monitoring  │
│ - Search     │ │ - SQLite     │ │ - ChromaDB   │ │ - Metrics    │
│ - Export     │ │ - Session    │ │ - SQLite     │ │ - Tracking   │
│ - Analysis   │ │ - History    │ │ - Embeddings │ │ - A/B Test   │
│ - Web Search │ │              │ │              │ │              │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
       │                │                │                │
       └────────────────┼────────────────┴────────────────┘
                        │
        ┌───────────────┼──────────────────┐
        │               │                  │
┌───────┴────────┐ ┌────┴─────────┐ ┌──────┴───────┐
│ Polygon.io API │ │ File Storage │ │ Microservice │
│ (Options Data) │ │ (CSV/Charts) │ │   (FastAPI)  │
└────────────────┘ └──────────────┘ └──────────────┘
```
### Technology Stack

| Layer | Technology | Purpose |
|---|---|---|
| LLM Orchestration | LangGraph | Multi-agent workflow management |
| Language Model | GPT-4o (OpenAI) | Intelligent decision making |
| Vector DB | ChromaDB | Semantic similarity search |
| Relational DB | SQLite | Persistent storage |
| API Framework | FastAPI | Microservice endpoints |
| Embeddings | OpenAI text-embedding-3-small | Semantic encoding |
| Data Source | Polygon.io | Real-time options data |
| Search | Tavily Search | Web context retrieval |
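For reference, the embedding layer boils down to a single call with the OpenAI Python SDK; a minimal sketch (the input text is illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input="AAPL 2025-12-19 call, strike 230, high open interest",
)
vector = resp.data[0].embedding  # 1536-dimensional float list
```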
## Project Structure

```
Algovant Internship/
├── README.md (this file)
├── requirements.txt
│
├── AGENT CORE
│   ├── agent_main.py                 # Main entry point (latest modular version)
│   └── agent_with_rules.py           # Rules-based agent with external markdown rules
│
├── CONFIG
│   ├── config/__init__.py
│   └── config/settings.py            # Centralized configuration
│
├── TOOLS
│   ├── tools/__init__.py
│   ├── tools/code_execution.py       # Code execution tool
│   ├── tools/web_search.py           # Web search integration
│   │
│   ├── search/                       # Options search tools
│   │   ├── __init__.py
│   │   ├── options_search.py         # Single ticker search
│   │   └── batch_search.py           # Batch search for multiple tickers
│   │
│   ├── export/                       # Data export tools
│   │   ├── __init__.py
│   │   ├── csv_export.py             # CSV export functionality
│   │   └── visualization.py          # Chart generation
│   │
│   └── analysis/                     # Analysis tools
│       ├── __init__.py
│       └── analysis_tools.py         # Professional options analysis
│
├── RAG (Knowledge Base)
│   ├── rag/__init__.py
│   ├── rag_config.py                 # RAG system configuration
│   ├── rag_knowledge_base.py         # ChromaDB + SQLite implementation
│   ├── rag_tools.py                  # Query tools
│   └── rag_collection_tools.py       # Data collection tools
│
├── MONITORING & EVALUATION
│   ├── monitoring/
│   │   ├── __init__.py
│   │   └── performance_monitor.py    # Performance tracking
│   │
│   └── evaluation/
│       ├── __init__.py
│       ├── ab_testing_evaluator.py   # A/B testing
│       ├── external_evaluator.py     # External evaluations
│       ├── llm_judge.py              # LLM-based judge
│       └── skills_ablation.py        # Skill ablation study
│
├── ANALYSIS MODULES
│   ├── analysis/__init__.py
│   └── options_analyzer.py           # Options analysis logic
│
├── UTILITIES
│   ├── utils/__init__.py
│   └── utils/rules_loader.py         # Rule file loader
│
├── MICROSERVICE
│   └── microservice/
│       ├── app.py                    # FastAPI application
│       ├── docker-compose.yml        # Docker compose config
│       ├── Dockerfile                # Docker image definition
│       ├── env.template              # Environment template
│       ├── requirements.txt          # Microservice dependencies
│       ├── test_client.py            # Testing client
│       └── outputs/                  # API output directory
│
├── RULES
│   └── rules/
│       ├── agent_rules.md            # Core agent behaviors and workflows
│       └── analysis_rules.md         # Professional analysis rules
│
├── LEARNING EXAMPLES (Week 1)
│   ├── Week1/
│   │   ├── README.md
│   │   ├── first_simple_openai_agent.py
│   │   ├── using_prebuilt.py
│   │   ├── add_tavily.py
│   │   ├── added_time_travel.py
│   │   └── add_customized_state.py
│   └── week2.py
│
├── DATA STORAGE
│   └── data/
│       ├── chroma_db/                # Vector database (ChromaDB)
│       ├── conversation_memory.db    # SQLite memory
│       ├── options.db                # Options cache
│       ├── embeddings_cache/         # Embedding cache
│       └── evaluation_*.json         # Evaluation results
│
├── OUTPUT
│   └── outputs/
│       ├── csv/                      # Exported CSV files
│       ├── charts/                   # Generated PNG charts
│       └── reports/                  # Analysis reports
│
├── TESTS & EVALUATION
│   ├── run_evaluation.py             # Run evaluation suite
│   ├── run_ab_testing.py             # Run A/B testing
│   ├── run_skills_ablation.py        # Run skill ablation
│   └── langraph example/             # LangGraph example project
│
├── UTILITIES & SCRIPTS
│   ├── backup.py                     # Backup utility
│   ├── clear_memory.py               # Memory cleanup
│   └── code_examples/
│       └── csv_export_template.py    # CSV export example
│
└── DATA FILES
    └── NVDA_options_*.csv            # Sample data files
```
### Key Modules

**`config/` - Centralized configuration management**
- Environment variables
- API key validation
- Model settings (GPT-4o selection)
- System limits (tokens, API calls)
- File path organization
- Database connections

**`tools/` - Complete tool suite for the agent**
- `search/`: Options data retrieval (single and batch)
- `export/`: CSV export and chart visualization
- `analysis/`: Professional options analysis
- Additional tools: code execution, web search

**`rag/` - Knowledge base and retrieval-augmented generation**
- ChromaDB for vector similarity search
- SQLite for structured data persistence
- Collection tools for automated data gathering
- Query tools for knowledge retrieval
- Anomaly detection capabilities
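A minimal sketch of querying the vector store directly with the ChromaDB client, assuming the `data/chroma_db` path and `options_snapshots` collection name shown in this README's configuration. The query text is illustrative, and this uses ChromaDB's default embedding function rather than the project's OpenAI embeddings:

```python
import chromadb

client = chromadb.PersistentClient(path="data/chroma_db")
collection = client.get_or_create_collection("options_snapshots")

# Top-5 semantically similar historical snapshots
results = collection.query(
    query_texts=["unusual call volume in AAPL near-the-money strikes"],
    n_results=5,
)
for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"{dist:.3f}  {doc}")
```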
**`monitoring/` - Performance tracking and optimization**
- Token usage metrics
- Tool execution statistics
- Query performance analysis
- Memory usage tracking
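The `performance_monitor` module implements this bookkeeping. As a rough stand-alone illustration (not the project's actual code), a decorator that records per-tool call counts and wall-clock time:

```python
import time
from collections import defaultdict
from functools import wraps

TOOL_STATS = defaultdict(lambda: {"calls": 0, "total_seconds": 0.0})


def track(fn):
    """Record how often a tool runs and how long each call takes."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats = TOOL_STATS[fn.__name__]
            stats["calls"] += 1
            stats["total_seconds"] += time.perf_counter() - start
    return wrapper


@track
def slow_tool():
    time.sleep(0.1)


slow_tool()
print(TOOL_STATS["slow_tool"])  # {'calls': 1, 'total_seconds': ~0.1}
```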
**`evaluation/` - Quality assurance and testing**
- A/B testing framework
- External evaluation metrics
- LLM judge for quality assessment
- Skill ablation studies

**`data/` - All persistent storage**
- SQLite databases (memory, options, evaluation)
- ChromaDB vector store
- Embedding cache
- JSON evaluation results

**`outputs/` - Generated results and artifacts**
- CSV exports in standardized format
- PNG charts and visualizations
- Analysis reports
## Installation

### Prerequisites

- Python: 3.9 or higher
- pip: Package manager
- API Keys:
- OpenAI API Key (for GPT-4o)
- Polygon.io API Key (for options data)
- Tavily API Key (for web search)
- Anthropic API Key (optional)
### Setup

```bash
# Navigate to the project directory
cd /Users/leo/Desktop/CS\ projects/Algovant\ Internship

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```bash
# Copy template
cp .env.template .env

# Edit with your keys
nano .env  # or use your preferred editor
```

Fill in the `.env` file with your API keys:
```bash
# OpenAI API Key (required)
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxxx

# Polygon.io API Key (required)
POLYGON_API_KEY=your_polygon_key_here

# Tavily Search API Key (optional)
TAVILY_API_KEY=your_tavily_key_here

# Anthropic API Key (optional)
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxxx
```

Run the agent to verify the setup:

```bash
python agent_main.py
```

You should see initialization messages confirming all components are loaded.
## Configuration

The system uses a centralized `Settings` class with multiple configuration categories:
API keys:

```python
POLYGON_API_KEY    # Options data API
OPENAI_API_KEY     # Language model
TAVILY_API_KEY     # Web search
ANTHROPIC_API_KEY  # Alternative LLM
```

Model settings:

```python
MODEL_NAME = "gpt-4o-mini"        # Current model
MODEL_PROVIDER = "openai"         # Provider
TEMPERATURE = 0.7                 # Creativity level
JUDGE_MODEL_NAME = "gpt-4o-mini"  # For evaluation
```

System limits:

```python
MAX_MESSAGES = 20             # Conversation history
MAX_CONTEXT_TOKENS = 128000   # Token limit
SAFE_CONTEXT_TOKENS = 80000   # Conservative limit
MAX_OPTIONS_CONTRACTS = 1000  # Data limit
DEFAULT_OPTIONS_LIMIT = 100   # Default contracts
```

RAG settings:

```python
COLLECTION_NAME = "options_snapshots"
EMBEDDING_MODEL = "text-embedding-3-small"
MIN_SIMILARITY_THRESHOLD = 0.7
ANOMALY_DETECTION_ENABLED = True
```

Edit `config/settings.py` to modify them:
```python
# Example: Change default model
class ModelConfig:
    MODEL_NAME = "gpt-4"  # Instead of "gpt-4o-mini"

# Example: Increase context history
class Limits:
    MAX_MESSAGES = 50  # Instead of 20

# Example: Adjust RAG sensitivity
class RAGConfig:
    MIN_SIMILARITY_THRESHOLD = 0.5  # More lenient matching
```

## Usage

### Interactive CLI

Start the main agent:
```bash
python agent_main.py
```

Or use the rules-based version:
```bash
python agent_with_rules.py
```

Typical interaction flow:
```
User: Get options for AAPL on 2025-12-19
Agent: I'll search for Apple options expiring on December 19, 2025.
       How many contracts would you like? (default: 100, max: 1000)

User: 200
Agent: Found 200 options contracts. What would you like to do?
       - Export to CSV
       - Generate chart
       - Show summary
       - Both CSV and chart

User: Both CSV and chart
Agent: Creating exports...
✅ CSV saved: outputs/csv/AAPL_options_2025-12_20251215_143022.csv
✅ Chart saved: outputs/charts/AAPL_options_2025-12.png
```
### Microservice API

Start the FastAPI server:

```bash
cd microservice
python app.py

# Or using uvicorn directly
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

API endpoints:
```bash
# Search options
curl -X POST "http://localhost:8000/api/search" \
  -H "Content-Type: application/json" \
  -d '{"ticker": "AAPL", "date": "2025-12-19", "limit": 100}'

# Export CSV
curl -X POST "http://localhost:8000/api/csv" \
  -H "Content-Type: application/json" \
  -d '{"data": {...}, "ticker": "AAPL"}'

# Generate chart
curl -X POST "http://localhost:8000/api/chart" \
  -H "Content-Type: application/json" \
  -d '{"data": {...}, "ticker": "AAPL"}'
```

### Python API

Use the agent directly from Python:

```python
from agent_main import graph, config, stream_graph_updates

# Use in your Python code
stream_graph_updates("Get options for AAPL on 2025-12-19")
```
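The microservice endpoints can also be exercised from Python with the `requests` library; a minimal sketch against the `/api/search` route shown above:

```python
import requests

resp = requests.post(
    "http://localhost:8000/api/search",
    json={"ticker": "AAPL", "date": "2025-12-19", "limit": 100},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```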
### Learning Examples

```bash
# Week 1 examples (basic to advanced)
cd Week1
python first_simple_openai_agent.py  # Most basic
python using_prebuilt.py             # With prebuilt components
python add_tavily.py                 # With web search
python added_time_travel.py          # With memory
python add_customized_state.py       # Advanced state
```

## API Reference

### search_options

Search for options data with smart caching:
```python
@tool
def search_options(
    ticker: str,                 # Stock symbol (e.g., "AAPL")
    date: str,                   # YYYY-MM-DD or YYYY-MM format
    limit: int = 300,            # Number of contracts (1-1000)
    force_refresh: bool = False  # Skip cache if True
) -> str:                        # Returns JSON with options data
```

Example:

```python
result = search_options("AAPL", "2025-12-19", limit=200)
```

### batch_search_options

Search multiple tickers efficiently:
```python
@tool
def batch_search_options(
    tickers: list,    # ["AAPL", "TSLA", "MSFT"]
    date: str,        # Same date for all tickers
    limit: int = 100
) -> str:             # Returns dict with results for each ticker
```
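Example (a hypothetical call, following the signature above):

```python
results = batch_search_options(["AAPL", "TSLA", "MSFT"], "2025-12-19", limit=100)
```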
### make_option_table

Export options data to CSV:

```python
@tool
def make_option_table(
    data: str,   # JSON data from search_options
    ticker: str  # Stock symbol
) -> str:        # Returns success message with filename
```
### plot_options_chain

Generate a PNG chart visualization:

```python
@tool
def plot_options_chain(
    data: str,   # JSON data from search_options
    ticker: str  # Stock symbol
) -> str:        # Returns success message with filename
```
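Since both export tools consume the JSON returned by `search_options`, a typical session chains them; a sketch in the direct-call style used above:

```python
data = search_options("AAPL", "2025-12-19", limit=200)
make_option_table(data, "AAPL")    # writes to outputs/csv/
plot_options_chain(data, "AAPL")   # writes to outputs/charts/
```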
### analyze_options_chain

Professional options analysis:

```python
@tool
def analyze_options_chain(
    ticker: str,       # Stock symbol (must be first!)
    options_data: str  # JSON data
) -> str:              # Returns detailed analysis report
```
### collect_and_store_options

Collect options data and immediately store it in the knowledge base:

```python
@tool
def collect_and_store_options(
    ticker: str,  # Stock symbol
    date: str,    # YYYY-MM-DD or YYYY-MM
    limit: int    # Number of contracts
) -> str:         # Returns storage confirmation
```
### search_knowledge_base

Query the knowledge base:

```python
@tool
def search_knowledge_base(
    query: str,     # Natural language query
    limit: int = 5  # Max results
) -> str:           # Returns matching historical data
```
### detect_anomaly

Find unusual changes in options data:

```python
@tool
def detect_anomaly(
    ticker: str,       # Stock symbol
    current_data: str  # Current options data
) -> str:              # Returns anomaly report
```
### code_execution_tool

Execute custom Python code:

```python
@tool
def code_execution_tool(code: str) -> str:
    """Execute custom Python code for advanced analysis"""
```
### get_performance_stats

Get system performance metrics:

```python
@tool
def get_performance_stats(
    mode: str = "current"  # "current", "summary", or "history"
) -> str:                  # Returns performance metrics
```
### human_assistance

Request human input:

```python
@tool
def human_assistance(question: str) -> str:
    """Ask for human intervention when needed"""
```

## Development

### Design Principles

- Modularity: Each tool and component is independent
- Configurability: Settings centralized in `config/settings.py`
- Extensibility: Easy to add new tools and rules
- Maintainability: Clear separation of concerns
- Observability: Built-in monitoring and logging
### Adding a New Tool

Step 1: Create a new file in `tools/`:

```python
# tools/my_new_tool.py
from langchain_core.tools import tool

@tool
def my_new_tool(parameter: str) -> str:
    """
    Tool description for the LLM.

    Args:
        parameter: What this parameter does

    Returns:
        What the tool returns
    """
    # Implementation
    return "Result"
```

Step 2: Import and register it in `agent_main.py`:
```python
from tools.my_new_tool import my_new_tool

tools = [
    # ... existing tools ...
    my_new_tool,  # Add here
]
```

### Adding a New Skill

Step 1: Edit `rules/agent_rules.md` or create a new markdown file:
```markdown
## New Capability: My New Skill

### Description
What this skill does...

### Workflow
1. First step
2. Second step
3. Third step

### Tools Used
- tool_name_1
- tool_name_2
```

Step 2: Load the rules in the agent:
```python
# In agent_with_rules.py
agent_rules = load_agent_rules("agent_rules.md")
# New rules are automatically included
```

### Adding Configuration

Add a new configuration class in `config/settings.py`:
```python
class MyNewConfig:
    """Configuration for my new feature"""
    SETTING_1 = "value"
    SETTING_2 = 100

# Add to the Settings class
class Settings:
    my_feature = MyNewConfig

# Export
MY_FEATURE_CONFIG = settings.my_feature
```

### Database Management

Check database contents:
```bash
# SQLite databases
sqlite3 data/conversation_memory.db ".tables"
sqlite3 data/options.db ".schema"
```

```python
# ChromaDB (from a Python shell)
>>> from rag.rag_knowledge_base import client
>>> collections = client.list_collections()
```

Backup data:
```bash
python backup.py
```

Clear conversation memory:
```bash
python clear_memory.py
```

## Evaluation & Testing

A/B Testing:
```bash
python run_ab_testing.py
```

Compares the performance of different agent configurations or prompts.
Skill Ablation Study:
```bash
python run_skills_ablation.py
```

Tests the agent's performance with different tool subsets to identify critical skills.
External Evaluation:
```bash
python run_evaluation.py
```

Uses external datasets and metrics for comprehensive evaluation.
In-Agent Performance Stats:
```
# Ask the agent
User: What are my performance statistics?
Agent: [Returns token usage, tool execution metrics, query performance]
```

View Evaluation Results:
```bash
# Check evaluation JSON files
cat data/evaluation_*.json | python -m json.tool
```

## Troubleshooting

### Missing API Keys

Error: `Missing required API keys: POLYGON_API_KEY, OPENAI_API_KEY`
Solution:
- Verify the `.env` file exists in the project root
- Check that the API keys are correct
- Run:

```bash
python -c "from config.settings import settings; settings.initialize()"
```
### ChromaDB Connection Failure

Error: `Failed to connect to ChromaDB`
Solution:

```bash
# Clear and reinitialize
rm -rf data/chroma_db
python agent_main.py  # Will recreate the database
```

### Database Locked

Error: `database is locked`
Solution:

```bash
# Close other connections and clear locks
rm -f data/conversation_memory.db-*
python agent_main.py
```

### Context Length Exceeded

Error: `This model's maximum context length is...`
Solution: Adjust `MAX_MESSAGES` in `config/settings.py`:

```python
class Limits:
    MAX_MESSAGES = 10  # Reduce from the default 20
```

### Rate Limit Exceeded

Error: `API rate limit exceeded`
Solution:
- Use caching with `force_refresh=False` (the default)
- Increase the delay between requests
- Upgrade your Polygon.io API plan
### Port Already in Use

Error: `Address already in use (:8000)`
Solution:

```bash
# Kill the process on port 8000
lsof -ti:8000 | xargs kill -9

# Or use a different port
uvicorn microservice.app:app --port 8001
```

### Debug Mode

Enable verbose logging:
```python
# In config/settings.py
class AgentConfig:
    DEBUG = True
    VERBOSE = True
```

Or run under the Python debugger:
```bash
python -m pdb agent_main.py
```

### Performance Tuning

Reduce context length:
```python
# config/settings.py
MAX_MESSAGES = 10  # Default 20
```

Disable monitoring (for speed):
```python
ENABLE_PERFORMANCE_MONITORING = False
```

Use a smaller model:
```python
MODEL_NAME = "gpt-4o-mini"  # Faster and cheaper than gpt-4
```

Increase the cache hit rate:
```python
# RAG tuning
MIN_SIMILARITY_THRESHOLD = 0.5  # More lenient matching
```
## Learning Path

- For Beginners: Start with the `Week1/` examples
  - Basic LangGraph concepts
  - State management
  - Tool integration
- For Intermediate: Study `agent_main.py`
  - Modular architecture
  - Tool assembly
  - Memory management
- For Advanced: Explore the evaluation framework
  - A/B testing
  - Skill ablation
  - Performance metrics
## Documentation

- Agent Rules: `rules/agent_rules.md` - Core behaviors
- Analysis Rules: `rules/analysis_rules.md` - Analysis methodology
- Week 1 Examples: `Week1/README.md` - Learning materials

## External Resources
- LangChain Documentation
- LangGraph Documentation
- ChromaDB Documentation
- OpenAI API Reference
- Polygon.io API Docs
## Security

- API Keys: Never commit the `.env` file or API keys to version control
- Database Access: SQLite files contain conversation history (sensitive data)
- Rate Limiting: Implement rate limiting in production
- Input Validation: Sanitize user inputs before processing
- HTTPS: Use HTTPS for microservice API in production
## Performance Benchmarks

| Metric | Typical Value |
|---|---|
| Search Response (cached) | < 500ms |
| Search Response (API call) | 1-3 seconds |
| CSV Export | < 1 second |
| Chart Generation | 2-5 seconds |
| Analysis Report | 5-10 seconds |
| Token Usage (per query) | 500-2000 tokens |
## Credits

Project: Financial Options Analysis Agent
Author: Leo Ji
Organization: Algovant Internship
Date: December 2025
## Contributing

Contributions are welcome! Areas for enhancement:
- Additional analysis indicators (Greeks, IV rank, etc.)
- More visualization types (heatmaps, Greeks profiles)
- Real-time WebSocket support
- Database query optimization
- Additional evaluation metrics
- Deployment automation
- Multi-language support
- Mobile app integration
## Support

For issues or questions:
- Check the Troubleshooting section
- Review relevant documentation files
- Check Week1 examples for learning resources
- Review evaluation results for performance insights
Last Updated: December 15, 2025
Version: 1.0.0
Status: Production Ready ✅