GitSurfer is a multi-provider codebase analysis and research assistant for GitHub repositories. It uses LLMs (Gemini, OpenAI, Anthropic, Cohere) and a vector database to fetch, summarize, embed, and answer questions about any public GitHub repository, giving developers and researchers deep insight into unfamiliar codebases.
- Fetches and analyzes GitHub repositories (tree structure, file contents)
- Summarizes repository structure using LLMs
- Embeds code and documentation into a vector database (ChromaDB)
- Supports multiple LLM and embedding providers: Gemini, OpenAI, Anthropic, Cohere
- Interactive research assistant: Ask questions about the codebase and get detailed, contextual answers
- Extensible modular architecture using LangGraph and LangChain
- Rich logging and error handling
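The embed-and-retrieve flow at the heart of these features can be illustrated with a toy stand-in: a sparse bag-of-words "embedding" and cosine similarity in place of ChromaDB and a real embedding model. All names below are illustrative, not GitSurfer's actual API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse bag-of-words vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class ToyVectorDB:
    """Minimal in-memory store mimicking a vector DB's add-then-query pattern."""
    def __init__(self) -> None:
        self.docs: list[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)

    def query(self, question: str, k: int = 1) -> list[str]:
        # Rank stored documents by similarity to the question.
        q = embed(question)
        return sorted(self.docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

db = ToyVectorDB()
db.add("main.py parses CLI arguments and starts the assistant loop")
db.add("logger.py configures rotating file logging")
print(db.query("how does logging work?"))  # ['logger.py configures rotating file logging']
```

In GitSurfer itself, ChromaDB plus a provider embedding model plays the role of `ToyVectorDB`.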
*Architecture diagrams: GitSurfer main graph, fetcher sub-graph, embedder sub-graph (images not included here).*
```
GitSurfer/
├── app/
│   ├── core/          # Core utilities, LLM/embedding logic
│   ├── graphs/        # Main assistant, fetcher, embedder, researcher graphs
│   └── retriever/     # Data ingestion and retriever logic
├── config/            # Settings and environment variable loader
├── DATA/              # Persisted vector DBs
├── temp/              # Temporary files (chunks, summaries)
├── logs/              # Log files
├── logger.py          # Logging configuration
├── requirements.txt   # Python dependencies
└── .env               # Environment variables (not committed)
```
1. Clone the repository

   ```bash
   git clone <your-fork-or-repo-url>
   cd GitSurfer
   ```

2. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```
3. Set up environment variables

   - Copy `.env.example` to `.env` and fill in your API keys:
     - `GOOGLE_API_KEY` (for Gemini)
     - `OPENAI_API_KEY` (for OpenAI)
     - `ANTHROPIC_API_KEY` (for Anthropic)
     - `COHERE_API_KEY` (for Cohere)
     - `GITHUB_TOKEN` (for increased GitHub API limits)
   - You can also specify model names and other settings in `.env`.
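For reference, a minimal `.env` might look like this (values are placeholders; only the providers you actually use need a key, and the file should never be committed):

```env
GOOGLE_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
COHERE_API_KEY=your-cohere-key
GITHUB_TOKEN=your-github-token
```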
The main entry point is `app/graphs/git_assistant.py`. It runs an interactive CLI assistant:

```bash
python app/graphs/git_assistant.py
```

Workflow:
- Enter a GitHub repository URL when prompted.
- GitSurfer fetches the repo, summarizes its structure, and creates a vector DB.
- Ask any question about the codebase (design, functions, usage, etc.).
- Interactively continue the research session or exit.
Example:

```
🔄 Processing repository...
👤 Input required: Enter GitHub repo URL
🤖 Assistant: Repository fetched and analyzed. Ask your question!
👤 You: What does the main.py file do?
🤖 Assistant: [detailed answer]
```
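The fetch step presumably goes through the GitHub REST API, whose `GET /repos/{owner}/{repo}/git/trees/{branch}?recursive=1` endpoint lists a repository's full file tree. A sketch of turning a repository URL into that endpoint (the helper name and this exact flow are illustrative, not GitSurfer's internals):

```python
from urllib.parse import urlparse

def tree_endpoint(repo_url: str, branch: str = "main") -> str:
    """Build the GitHub API URL that lists a repo's full file tree.

    Illustrative helper -- not part of GitSurfer's public API.
    """
    path = urlparse(repo_url).path.strip("/")
    owner, repo = path.split("/")[:2]
    repo = repo.removesuffix(".git")  # tolerate clone-style URLs
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/git/trees/{branch}?recursive=1"
    )

print(tree_endpoint("https://github.com/octocat/Hello-World"))
# https://api.github.com/repos/octocat/Hello-World/git/trees/main?recursive=1
```

Sending `GITHUB_TOKEN` in the `Authorization` header on such requests is what raises the API rate limit mentioned below.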
- All settings (provider selection, model names, directories) are managed in `config/settings.py` and via environment variables.
- Supports switching between providers for both LLM and embeddings.
- Vector DBs are persisted under `DATA/`.
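Provider switching of this kind is typically a small factory keyed off environment variables. A hedged sketch of the idea — the `LLM_PROVIDER` variable, function name, and default model strings here are hypothetical, not `config/settings.py`'s actual contents:

```python
import os

# Hypothetical per-provider defaults; GitSurfer's real defaults live in config/settings.py.
DEFAULTS = {
    "gemini": "gemini-1.5-flash",
    "openai": "gpt-4o-mini",
    "anthropic": "claude-3-5-sonnet",
    "cohere": "command-r",
}

def select_llm(provider=None):
    """Resolve a (provider, model) pair from an argument or environment variables."""
    provider = (provider or os.environ.get("LLM_PROVIDER", "gemini")).lower()
    if provider not in DEFAULTS:
        raise ValueError(f"Unsupported provider: {provider}")
    # Per-provider model override, e.g. OPENAI_LLM_MODEL, falling back to a default.
    model = os.environ.get(f"{provider.upper()}_LLM_MODEL", DEFAULTS[provider])
    return provider, model

print(select_llm("openai"))
```

The same pattern applies to embedding providers; only the lookup table changes.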
- Python 3.9+
- API keys for at least one supported LLM/embedding provider
- (Optional) GitHub Personal Access Token for higher API rate limits
| Variable | Description |
|---|---|
| `GOOGLE_API_KEY` | Gemini API key |
| `OPENAI_API_KEY` | OpenAI API key |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `COHERE_API_KEY` | Cohere API key |
| `GITHUB_TOKEN` | GitHub token for API calls |
| `GEMINI_LLM_MODEL` | Gemini model name (default provided) |
| `OPENAI_LLM_MODEL` | OpenAI model name (default provided) |
| ... | See `config/settings.py` for the full list |
Run tests using:

```bash
pytest
```

