Your personal, privacy-first AI voice assistant that actually works offline.
Vaani (वाणी - "voice" in Hindi) is an open-source voice assistant I built to solve my frustrations with existing options. It runs primarily on your machine, respects your privacy, speaks 32 languages, and is fully customizable.
# Quick Start
git clone https://github.com/paman7647/vaani.git
cd vaani && chmod +x setup.sh && ./setup.sh
python3 main.py
# Say "Hey Vaani" and start talking!

🔒 Privacy-Focused → Most processing happens locally. Your voice stays on your device.
🗣️ 32 Languages → English, Hindi, Spanish, French, and 28 more. Switch anytime.
🎵 Music Ready → Stream from YouTube with voice control (pause, skip, volume).
🧠 Actually Smart → Uses Google Gemini AI for natural conversations when you need it.
⚡ Fast & Reliable → 3-engine recognition system (Google API → Vosk → Sphinx) with automatic fallback.
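
The automatic fallback between recognition engines can be pictured roughly as follows. This is a minimal sketch, not Vaani's actual code: `recognize_with_fallback`, `AllEnginesFailed`, and the three stand-in engine functions are hypothetical names invented for illustration, and real engines would wrap the Google API, Vosk, and Sphinx clients.

```python
from typing import Callable, Sequence, Tuple

# An engine takes raw audio bytes and returns a transcript, or raises on failure.
Engine = Callable[[bytes], str]


class AllEnginesFailed(RuntimeError):
    """Raised when every engine in the chain fails or returns nothing."""


def recognize_with_fallback(
    audio: bytes, engines: Sequence[Tuple[str, Engine]]
) -> Tuple[str, str]:
    """Try each engine in order; return (engine_name, transcript) from the first success."""
    errors = []
    for name, engine in engines:
        try:
            text = engine(audio)
        except Exception as exc:  # e.g. network down or model missing
            errors.append(f"{name}: {exc}")
            continue
        if text.strip():
            return name, text
        errors.append(f"{name}: empty transcript")
    raise AllEnginesFailed("; ".join(errors))


# Stand-in engines for illustration only (hypothetical, not Vaani's code).
def google_api(audio: bytes) -> str:
    raise ConnectionError("no internet")


def vosk(audio: bytes) -> str:
    return "play some jazz music"


def sphinx(audio: bytes) -> str:
    return "play sum jazz music"


if __name__ == "__main__":
    chain = [("google", google_api), ("vosk", vosk), ("sphinx", sphinx)]
    print(recognize_with_fallback(b"\x00\x01", chain))
```

Ordering the chain from most accurate to most self-contained means the online engine is used when available, while the offline engines keep the assistant responsive without a connection.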
- 🎤 Voice Control: Wake word detection, natural conversation flow
- 💬 Smart Conversations: Context-aware responses using AI
- 🔍 Web Search: Real-time information from DuckDuckGo
- 🎶 Music Playback: YouTube streaming with voice controls
- 🌍 Multi-Language: Switch between 32 languages seamlessly
- 🏠 Runs Locally: No cloud dependency for core features
- Python 3.10+
- macOS, Linux, or Windows 10/11
- Microphone and speakers
- Internet (optional: core features work offline; Gemini conversations, web search, and YouTube streaming need a connection)
macOS / Linux:
git clone https://github.com/paman7647/vaani.git
cd vaani
chmod +x setup.sh && ./setup.sh

Windows (PowerShell as Admin):
git clone https://github.com/paman7647/vaani.git
cd vaani
.\setup.ps1

Create a .env file for AI features:
# Get a free key from makersuite.google.com
GEMINI_API_KEY=your_api_key_here

python3 main.py
# Say "Hey Vaani" and start talking!

Complete documentation is available at: vaani.readthedocs.io
- 📖 Getting Started Guide
- 🏗️ Architecture Overview
- ⚙️ Configuration Reference
- 🔧 Troubleshooting
- 💻 Developer Guide
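
For the `.env` step in the installation instructions, here is a rough sketch of how a Python program might pick up `GEMINI_API_KEY`. It uses only the standard library. The helper names `load_env_file` and `get_gemini_key` are hypothetical, and Vaani itself may load the file differently (for example through a dotenv package).

```python
import os
from pathlib import Path
from typing import Dict, Optional


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_key(path: str = ".env") -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    return os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")


if __name__ == "__main__":
    if get_gemini_key() is None:
        print("GEMINI_API_KEY not set: AI features would be unavailable")
```

Returning `None` instead of raising lets the caller keep the offline features running when no key is configured.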
# Start Vaani
python3 main.py
# Then try:
"Hey Vaani, play some jazz music"
"What's the weather today?"
"Tell me about quantum computing"
"Pause the music"
"What time is it?"
"Tell me a joke"

Tip: It understands context! You can have natural conversations without repeating "Hey Vaani" every time.
| Component | Technology |
|---|---|
| AI Brain | Google Gemini 2.5 Flash |
| Speech Recognition | Vosk + Google Speech API + Sphinx |
| Text-to-Speech | pyttsx3 + Edge TTS |
| Music Streaming | VLC + yt-dlp |
| Web Search | DuckDuckGo API |
| Audio Processing | PyAudio + sounddevice |
- Multi-engine speech recognition
- YouTube music playback
- Web search integration
- GUI interface
- Smart home integration (HomeAssistant)
- Voice profiles for multiple users
- Long-term memory persistence
- Mobile app
Quick Fixes:
- Microphone not working? → Check system permissions and `ENERGY_THRESHOLD` in `config.json`
- Vosk model missing? → Run `./setup.sh` again
- Music won't play? → Install VLC (`brew install vlc` or `apt install vlc`)
Need more help? Check the full troubleshooting guide.
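
For the microphone fix above, a small script like the following could adjust `ENERGY_THRESHOLD` without hand-editing the file. It is only a sketch: it assumes `ENERGY_THRESHOLD` is a top-level numeric key in `config.json`, which the real file layout may not match, and `set_energy_threshold` is a hypothetical helper. If the setting is an energy-based speech detection threshold, lower values should make the microphone more sensitive.

```python
import json
from pathlib import Path
from typing import Optional


def set_energy_threshold(config_path: str, value: int) -> Optional[int]:
    """Update ENERGY_THRESHOLD in a JSON config and return the previous value, if any."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    previous = config.get("ENERGY_THRESHOLD")  # assumed top-level key
    config["ENERGY_THRESHOLD"] = value
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous


if __name__ == "__main__":
    if Path("config.json").exists():
        old = set_energy_threshold("config.json", 300)
        print(f"ENERGY_THRESHOLD changed from {old} to 300")
```

Back up `config.json` before running anything like this, since the script rewrites the whole file.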
Contributions are welcome, whether you want to:
- 🐛 Fix bugs
- ✨ Add features
- 📝 Improve docs
- 🧪 Test on different platforms
Check out the Contributing Guide to get started.
MIT License - See LICENSE for details.
Free to use, modify, and distribute, as long as the copyright and license notice stay with the code.
Aman Kumar Pandey
Built this as a personal project to learn about AI and voice technology. Feel free to reach out!
- 🐙 GitHub: @paman7647
- 📚 Docs: vaani.readthedocs.io
If you find Vaani useful:
- ⭐ Star this repo
- 🐛 Report bugs
- 💡 Share ideas
- 🔀 Submit pull requests
Made with ❤️ by Aman Kumar Pandey