Vaani 🎙️

Your personal, privacy-first AI voice assistant that actually works offline.

Vaani (वाणी - "voice" in Hindi) is an open-source (MIT-licensed) voice assistant I built to solve my frustrations with existing options. It runs primarily on your machine, respects your privacy, speaks 32 languages, and lets you customize everything.

# Quick Start
git clone https://github.com/paman7647/vaani.git
cd vaani && chmod +x setup.sh && ./setup.sh
python3 main.py
# Say "Hey Vaani" and start talking!

Why Vaani?

🔒 Privacy-Focused → Most processing happens locally. Your voice stays on your device whenever the offline engines (Vosk, Sphinx) handle recognition.
🗣️ 32 Languages → English, Hindi, Spanish, French, and 28 more. Switch anytime.
🎵 Music Ready → Stream from YouTube with voice control (pause, skip, volume).
🧠 Actually Smart → Uses Google Gemini AI for natural conversations when you need it.
⚡ Fast & Reliable → 3-engine recognition system (Google API → Vosk → Sphinx) with automatic fallback.
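
The automatic fallback mentioned above can be pictured roughly like this. It is a minimal sketch, not Vaani's actual code: the `recognize_with_fallback` helper and the three stub engines are hypothetical stand-ins, and it assumes each engine is a callable that either returns text or raises.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical engine type: takes raw audio bytes, returns text or raises.
Engine = Callable[[bytes], str]


def recognize_with_fallback(
    audio: bytes, engines: Sequence[Tuple[str, Engine]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each engine in order and return (engine_name, text) from the
    first one that produces non-empty text; (None, None) if all fail."""
    for name, engine in engines:
        try:
            text = engine(audio)
        except Exception:
            # Network error, missing model, etc.: move on to the next engine.
            continue
        if text:
            return name, text
    return None, None


# Stub engines standing in for the real ones (assumptions for illustration).
def google_stub(audio: bytes) -> str:
    raise ConnectionError("no internet connection")


def vosk_stub(audio: bytes) -> str:
    return "hello vaani"


def sphinx_stub(audio: bytes) -> str:
    return "hello vanni"


ENGINES = [("google", google_stub), ("vosk", vosk_stub), ("sphinx", sphinx_stub)]

if __name__ == "__main__":
    print(recognize_with_fallback(b"\x00\x01", ENGINES))
```

With the online engine failing, the call falls through to the first offline engine, which is the behaviour you would want when there is no internet.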


What Can It Do?

  • 🎤 Voice Control: Wake word detection, natural conversation flow
  • 💬 Smart Conversations: Context-aware responses using AI
  • 🔍 Web Search: Real-time information from DuckDuckGo
  • 🎶 Music Playback: YouTube streaming with voice controls
  • 🌍 Multi-Language: Switch between 32 languages seamlessly
  • 🏠 Runs Locally: No cloud dependency for core features

🚀 Quick Setup

Requirements

  • Python 3.10+
  • macOS, Linux, or Windows 10/11
  • Microphone and speakers
  • Internet (optional - core features work offline; Gemini, web search, and music streaming need a connection)

Installation

macOS / Linux:

git clone https://github.com/paman7647/vaani.git
cd vaani
chmod +x setup.sh && ./setup.sh

Windows (PowerShell as Admin):

git clone https://github.com/paman7647/vaani.git
cd vaani
.\setup.ps1

Configuration (Optional)

Create a .env file for AI features:

# Get free key from makersuite.google.com
GEMINI_API_KEY=your_api_key_here
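
For reference, here is one way a Python program could pick up that key. This is a hedged sketch, not how Vaani necessarily does it: `load_env_file` and `get_gemini_key` are hypothetical helpers, and the parser only handles simple `KEY=value` lines with `#` comments.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines from a .env file into a dict.
    Blank lines and lines starting with '#' are ignored."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_key(path: str = ".env") -> Optional[str]:
    """Prefer a real environment variable, then fall back to the .env file."""
    return os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")


if __name__ == "__main__":
    key = get_gemini_key()
    print("AI features enabled" if key else "No GEMINI_API_KEY found; AI features disabled")
```

Since the key is optional, returning `None` instead of raising lets the rest of the assistant keep running without AI features.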

Run It

python3 main.py
# Say "Hey Vaani" and start talking!

📚 Documentation

Complete documentation is available at: vaani.readthedocs.io


💡 Usage Examples

# Start Vaani
python3 main.py

# Then try:
"Hey Vaani, play some jazz music"
"What's the weather today?"
"Tell me about quantum computing"
"Pause the music"
"What time is it?"
"Tell me a joke"

Tip: It understands context! You can have natural conversations without repeating "Hey Vaani" every time.
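
One simple way to get this "no wake word needed for follow-ups" behaviour is a timeout window. The sketch below is illustrative only: the `FollowUpWindow` class and its 8-second default are assumptions, not Vaani's real implementation.

```python
import time
from typing import Callable, Optional


class FollowUpWindow:
    """Tracks whether the user must say the wake word again.
    After each interaction, follow-ups are accepted without the wake word
    for `timeout_s` seconds (a hypothetical default of 8 seconds)."""

    def __init__(self, timeout_s: float = 8.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_active: Optional[float] = None

    def mark_interaction(self) -> None:
        """Call this whenever the assistant finishes handling a request."""
        self._last_active = self._clock()

    def needs_wake_word(self) -> bool:
        """True if no interaction has happened yet or the window has expired."""
        if self._last_active is None:
            return True
        return (self._clock() - self._last_active) > self.timeout_s


if __name__ == "__main__":
    window = FollowUpWindow(timeout_s=8.0)
    print(window.needs_wake_word())  # no interaction yet
    window.mark_interaction()
    print(window.needs_wake_word())  # still inside the follow-up window
```

Passing the clock in as a parameter keeps the class easy to test without real waiting.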


🛠️ Tech Stack

| Component | Technology |
|---|---|
| AI Brain | Google Gemini 2.5 Flash |
| Speech Recognition | Vosk + Google Speech API + Sphinx |
| Text-to-Speech | pyttsx3 + Edge TTS |
| Music Streaming | VLC + yt-dlp |
| Web Search | DuckDuckGo API |
| Audio Processing | PyAudio + sounddevice |

🗺️ Roadmap

  • Multi-engine speech recognition (done)
  • YouTube music playback (done)
  • Web search integration (done)
  • GUI interface (planned)
  • Smart home integration (HomeAssistant) (planned)
  • Voice profiles for multiple users (planned)
  • Long-term memory persistence (planned)
  • Mobile app (planned)

🐛 Troubleshooting

Quick Fixes:

  • Microphone not working? → Check system permissions and ENERGY_THRESHOLD in config.json
  • Vosk model missing? → Run ./setup.sh again
  • Music won't play? → Install VLC (brew install --cask vlc on macOS or sudo apt install vlc on Debian/Ubuntu)
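
If you want to check what value is actually being picked up for the microphone threshold, a small script like this can help. It is a sketch under assumptions: it treats ENERGY_THRESHOLD as a top-level numeric key in config.json (the real layout may differ), and the fallback of 300 is an illustrative default, not a documented Vaani setting.

```python
import json
from pathlib import Path

# Assumed fallback for illustration; not a documented Vaani default.
DEFAULT_ENERGY_THRESHOLD = 300


def read_energy_threshold(path: str = "config.json",
                          default: float = DEFAULT_ENERGY_THRESHOLD) -> float:
    """Return ENERGY_THRESHOLD from a JSON config file, or `default` if the
    file is missing, is not valid JSON, or holds a non-positive or
    non-numeric value."""
    try:
        config = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return default
    if not isinstance(config, dict):
        return default
    value = config.get("ENERGY_THRESHOLD", default)
    if isinstance(value, (int, float)) and not isinstance(value, bool) and value > 0:
        return value
    return default


if __name__ == "__main__":
    print("ENERGY_THRESHOLD in use:", read_energy_threshold())
```

As a rule of thumb, a lower threshold makes the microphone more sensitive to quiet speech, while a higher one helps in noisy rooms.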

Need more help? Check the full troubleshooting guide.


🤝 Contributing

Contributions are welcome! Whether you want to:

  • 🐛 Fix bugs
  • ✨ Add features
  • 📝 Improve docs
  • 🧪 Test on different platforms

Check out the Contributing Guide to get started.


📜 License

MIT License - See LICENSE for details.

Free to use, modify, and distribute, as long as the copyright and license notice stay with the code.


👤 Author

Aman Kumar Pandey

Built this as a personal project to learn about AI and voice technology. Feel free to reach out!


🌟 Show Your Support

If you find Vaani useful:

  • ⭐ Star this repo
  • 🐛 Report bugs
  • 💡 Share ideas
  • 🔀 Submit pull requests

Made with ❤️ by Aman Kumar Pandey

About

Vaani is a context-aware voice assistant that bridges the gap between human interaction and machine intelligence. Features hybrid offline/online speech recognition and native media control.