An intelligent AI-powered tool that analyzes GitHub repositories, generates comprehensive summaries, and answers questions about codebases using advanced LLM technology.
🔹 Repository Analysis: Clone and analyze any public GitHub repository
🔹 Code Structure Parsing: Extract files, folders, classes, functions, and variables
🔹 AI-Powered Summaries: Generate detailed summaries using Mixtral LLM
🔹 Interactive Q&A: Ask natural language questions about the codebase
🔹 Smart Filtering: Automatically skip binary files and irrelevant directories
🔹 Visual Reports: Beautiful Streamlit interface with structured outputs
- Frontend: Streamlit
- Backend: Python
- LLM: OpenRouter API (Mixtral-8x7B-Instruct)
- Git Operations: GitPython
- Code Parsing: AST + Regex patterns
- Quick start with Docker Compose
git clone <your-repo-url>
cd ai-github-code-analyzer
cp .env.example .env
# Edit .env file and add your OPENROUTER_API_KEY
docker-compose up -d- Access the application
Open your browser and go to
http://localhost:8501
📖 For detailed Docker instructions, see DOCKER.md
- Clone this repository
git clone <your-repo-url>
cd ai-github-code-analyzer- Install dependencies
pip install -r requirements.txt- Set up environment variables
Create a
.envfile in the root directory:
OPENROUTER_API_KEY=your_openrouter_api_key_here
Get your free API key from OpenRouter
- Run the application
streamlit run app.py- Enter GitHub Repository URL: Paste any public GitHub repository URL
- Click Analyze: The tool will clone and analyze the repository
- View Summaries: Get comprehensive file and function summaries
- Ask Questions: Use the Q&A section to query the codebase
Input: https://github.com/microsoft/vscode
Output:
- Repository structure analysis
- File-by-file summaries
- Function and class explanations
- Overall project summary
- Interactive Q&A capabilities
ai-github-code-analyzer/
├── app.py # Streamlit main application
├── requirements.txt # Python dependencies
├── README.md # This file
├── .gitignore # Git ignore patterns
├── repos/ # Temporary cloned repositories
├── utils/ # Core utilities
│ ├── github_utils.py # GitHub operations
│ ├── parser_utils.py # Code parsing logic
│ ├── llm_utils.py # LLM API integration
│ ├── summarizer.py # Summary generation
│ └── qa_agent.py # Q&A functionality
├── assets/ # Static assets
└── sample_output/ # Example outputs
The tool supports various configuration options:
- Max files: Limit number of files to analyze
- File size limits: Skip large files
- Language filters: Focus on specific programming languages
- Directory exclusions: Skip node_modules, .git, etc.
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
MIT License - feel free to use and modify as needed.
For issues or questions, please open a GitHub issue or reach out to me directly sourjya.mukherji@gmail.com.