SONU

The Open-Source Voice Typing Platform

Type at the speed of thought. Fully offline. Fully private.

Download · Features · Showcase · Compare · Docs · Contribute


Built with Tauri v2 (Rust) + React. Dictate anywhere, and your words appear instantly in any application.


✨ Features

🔒 100% Offline & Private

By default, all transcription runs locally on your device and no audio leaves your machine. No accounts, no cloud, no subscriptions. Your voice stays yours.

⚡ Real-Time Transcription

Powered by optimized whisper.cpp and Parakeet engines for fast, real-time voice-to-text. Start speaking and see words appear instantly.

🤖 AI Text Enhancement

Optional LLM post-processing cleans up filler words, fixes grammar, and formats your text, either locally with offline models or via cloud providers.

🌍 50+ Languages

Transcribe in over 50 languages with automatic language detection. Switch languages on the fly or lock to a specific one.

⌨️ Universal Auto-Type

SONU types directly into any application where you can type: your browser, IDE, email client, Slack, Discord, Word, and so on.

☁️ Cloud Transcription (Optional)

Connect to Groq, Deepgram, or your own self-hosted server for cloud-powered transcription when you want maximum accuracy.

📚 Smart Dictionary

Custom word corrections automatically fix domain-specific terms, names, and jargon that the model might mishear.

📝 Snippets & Text Expansion

Define shorthand codes that expand into full text blocks, useful for emails, code comments, addresses, and common phrases.


📸 Showcase: Tauri v2 App

SONU v2.2.0, built with Tauri v2 (Rust + React). Lightweight, native, and fast.

🏠 Home Dashboard

The home screen shows your dictation stats (time, word count, WPM, time saved), a voice activation shortcut recorder, privacy status, and recent transcription history, all in a clean dashboard layout with a local/cloud mode indicator.
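
As an illustration of how the dashboard figures relate to each other, here is a minimal sketch of the arithmetic. The `DictationSession` shape, the function names, and the 40 WPM typing baseline are assumptions made for this example; the README does not say how SONU actually computes "time saved".

```typescript
// Hypothetical shape of one dictation session (not taken from the SONU codebase).
interface DictationSession {
  text: string;            // final transcribed text
  durationSeconds: number; // time spent speaking
}

interface DictationStats {
  totalSeconds: number;
  wordCount: number;
  wordsPerMinute: number;
  secondsSaved: number;
}

// Assumed typing speed used as the baseline for "time saved".
const ASSUMED_TYPING_WPM = 40;

// Count whitespace-separated words; an empty or blank string has zero words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function computeStats(
  sessions: DictationSession[],
  typingWpm: number = ASSUMED_TYPING_WPM,
): DictationStats {
  const totalSeconds = sessions.reduce((sum, s) => sum + s.durationSeconds, 0);
  const wordCount = sessions.reduce((sum, s) => sum + countWords(s.text), 0);
  // Words per minute of dictation; zero when nothing was recorded.
  const wordsPerMinute = totalSeconds > 0 ? (wordCount * 60) / totalSeconds : 0;
  // Time it would take to type the same words, minus the time spent dictating.
  const typingSeconds = (wordCount * 60) / typingWpm;
  const secondsSaved = Math.max(0, typingSeconds - totalSeconds);
  return { totalSeconds, wordCount, wordsPerMinute, secondsSaved };
}
```

For example, six words dictated in three seconds works out to 120 WPM and, against the assumed 40 WPM typing baseline, six seconds saved.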

📚 Dictionary & ✂️ Snippets

Dictionary lets you add custom word corrections for domain-specific terms the model might mishear. Snippets are reusable text blocks you can expand with shorthand codes, useful for emails, addresses, and common phrases.
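
The two ideas can be sketched as simple text transforms applied to a transcription. This is only an illustration: the `Dictionary` and `Snippets` types, the function names, and the matching rules (whole-word, case-insensitive corrections; exact-token snippet codes) are assumptions, not SONU's actual implementation or storage format.

```typescript
// Hypothetical data shapes; the real storage format is not described here.
type Dictionary = Record<string, string>; // misheard form -> corrected form
type Snippets = Record<string, string>;   // shorthand code -> expansion

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace whole-word, case-insensitive matches of each dictionary key.
// Keys are assumed to start and end with a letter or digit so \b applies.
function applyDictionary(text: string, dict: Dictionary): string {
  let result = text;
  for (const [wrong, right] of Object.entries(dict)) {
    const pattern = new RegExp(`\\b${escapeRegExp(wrong)}\\b`, "gi");
    // A replacer function avoids "$" in the correction being interpreted.
    result = result.replace(pattern, () => right);
  }
  return result;
}

// Expand any whitespace-delimited token that exactly equals a snippet code.
function expandSnippets(text: string, snippets: Snippets): string {
  return text
    .split(/(\s+)/) // the capture group keeps the whitespace tokens
    .map((token) =>
      Object.prototype.hasOwnProperty.call(snippets, token) ? snippets[token] : token,
    )
    .join("");
}
```

With a dictionary of `{ "tori": "Tauri" }`, the text "built with Tori" becomes "built with Tauri"; with a snippet `{ ";addr": "123 Example Street" }`, the token `;addr` is replaced by the full address.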

📝 Notes

Voice-powered sticky notes with color-coded cards (6 colors), search, grid/list view toggle, and per-note audio playback. Saved or starred transcriptions become visual notes.

🎨 Style

Choose AI dictation style presets organized by category: Personal, Work, Email, Other. Each style (Casual, Professional, Technical, Creative, etc.) transforms your raw transcription with LLM post-processing.

⚙️ Settings

  • General: Shortcut binding, language, microphone, audio feedback, push-to-talk
  • Advanced: Autostart, overlay, clipboard handling, model unload timeout, AI post-processing toggle
  • Cloud: Provider cards for Groq, Deepgram, and custom self-hosted servers with status indicators
  • Post-Processing: LLM provider config, model selection, API keys, custom prompts
  • History: Full transcription log with audio playback, copy, star/save, and delete
  • Debug: Log level, sound themes, thresholds, recording retention, advanced toggles
  • About: App version, language, data directory, credits, and links

📷 Screenshots coming soon. The Tauri v2 app is built and running; to capture screenshots, run bun run tauri dev in apps/tauri-v2/.


🏆 How SONU Compares

| Feature | SONU | Wispr Flow | Superwhisper | macOS Dictation |
|---|---|---|---|---|
| Fully offline | ✅ | ❌ | ✅ | Partial |
| Open source | ✅ | ❌ | ❌ | ❌ |
| Free forever | ✅ | ❌ ($10/mo) | ❌ ($8/mo) | ✅ |
| Windows + macOS + Linux | ✅ | macOS only | macOS only | macOS only |
| 50+ languages | ✅ | ✅ | ✅ | ✅ |
| Custom dictionary | ✅ | ❌ | ❌ | ❌ |
| Text snippets | ✅ | ❌ | ❌ | ❌ |
| AI text enhancement | ✅ | ✅ | ✅ | ❌ |
| Offline LLM support | ✅ | ❌ | ❌ | ❌ |
| Cloud transcription option | ✅ | ✅ | ❌ | ✅ |
| Self-hosted server | ✅ | ❌ | ❌ | ❌ |
| Voice notes | ✅ | ❌ | ✅ | ❌ |
| Push-to-talk + toggle | ✅ | ✅ | ✅ | ✅ |
| Auto-type into any app | ✅ | ✅ | ✅ | ✅ |
| Multiple Whisper models | ✅ (tiny → large-v3) | ❌ | ✅ | ❌ |
| Themes & customization | ✅ | Limited | Limited | ❌ |

⬇️ Download

Get SONU for your platform

| Platform | Download | Architecture |
|---|---|---|
| Windows | Installer (.exe) | x64, ARM64 |
| macOS | DMG | Intel (x64) + Apple Silicon (ARM64) |
| Linux | AppImage / .deb / .rpm | x64 |

Quick Install

Windows
  1. Download the .exe installer from Releases
  2. Run the installer and follow the prompts
  3. Launch SONU from the Start Menu or system tray
  4. Press your hotkey (default: Ctrl+Shift+Space) and start speaking
macOS
  1. Download the .dmg from Releases
  2. Open the DMG and drag SONU to Applications
  3. Grant Accessibility permissions when prompted
  4. Press your hotkey and start speaking
Linux
  1. Download .AppImage (portable) or .deb (Debian/Ubuntu) from Releases
  2. For AppImage: chmod +x SONU-*.AppImage && ./SONU-*.AppImage
  3. For .deb: sudo dpkg -i sonu_*.deb
  4. Press your hotkey and start speaking

🧠 Supported Models

SONU supports multiple speech recognition engines and models:

| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | 75 MB | ⚡⚡⚡⚡⚡ | ★★☆☆☆ | Quick notes, low-resource machines |
| base | 142 MB | ⚡⚡⚡⚡ | ★★★☆☆ | Everyday dictation |
| small | 466 MB | ⚡⚡⚡ | ★★★★☆ | Professional use |
| medium | 1.5 GB | ⚡⚡ | ★★★★☆ | High-accuracy work |
| large-v3 | 3.1 GB | ⚡ | ★★★★★ | Maximum accuracy |
| Parakeet 0.6B | 600 MB | ⚡⚡⚡⚡ | ★★★★★ | English: best speed/accuracy ratio |

Models download automatically on first use. All processing stays local.
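
To make the trade-off in the table concrete, here is a small sketch that picks a model for a given size budget and language. The `ModelInfo` type and `pickModel` function are hypothetical and not part of SONU. The sizes and ratings are copied from the table, with 1.5 GB and 3.1 GB taken as 1500 MB and 3100 MB. Treating Parakeet as English-only is an assumption based on its "Best For" column.

```typescript
// Hypothetical model descriptor; values mirror the Supported Models table.
interface ModelInfo {
  name: string;
  sizeMb: number;       // download size
  speed: number;        // 1-5, number of lightning bolts in the table
  accuracy: number;     // 1-5, number of filled stars in the table
  englishOnly: boolean; // assumption: only Parakeet is restricted to English
}

const MODELS: ModelInfo[] = [
  { name: "tiny", sizeMb: 75, speed: 5, accuracy: 2, englishOnly: false },
  { name: "base", sizeMb: 142, speed: 4, accuracy: 3, englishOnly: false },
  { name: "small", sizeMb: 466, speed: 3, accuracy: 4, englishOnly: false },
  { name: "medium", sizeMb: 1500, speed: 2, accuracy: 4, englishOnly: false },
  { name: "large-v3", sizeMb: 3100, speed: 1, accuracy: 5, englishOnly: false },
  { name: "Parakeet 0.6B", sizeMb: 600, speed: 4, accuracy: 5, englishOnly: true },
];

// Pick the most accurate model that fits the size budget and supports the
// language; ties on accuracy go to the faster model.
function pickModel(maxSizeMb: number, language: string): ModelInfo | undefined {
  const candidates = MODELS.filter(
    (m) => m.sizeMb <= maxSizeMb && (!m.englishOnly || language === "en"),
  );
  candidates.sort((a, b) => b.accuracy - a.accuracy || b.speed - a.speed);
  return candidates[0];
}
```

Under these assumptions, a 1000 MB budget selects Parakeet 0.6B for English and small for German, while a 50 MB budget fits no model and returns `undefined`.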


🏗️ Architecture

SONU/
├── apps/
│   ├── tauri-v2/          🦀 Tauri v2 desktop app (Rust + React)
│   │   ├── src/           React/TypeScript frontend
│   │   └── src-tauri/     Rust backend (whisper.cpp, audio, models)
│   │
│   └── desktop/           🖥️ Electron desktop app (Node.js + Python)
│       └── src/           Main process, services, IPC
│
├── server/                🌐 Self-hosted transcription server (FastAPI + Docker)
├── docs/                  📚 Documentation & guides
└── plans/                 📋 Roadmap & improvement plans

Tech Stack

| Layer | Technology |
|---|---|
| Desktop Framework | Tauri v2 (Rust) |
| Frontend | React 18, TypeScript, TailwindCSS |
| Speech Engine | whisper.cpp, Parakeet TDT |
| AI Enhancement | Local LLM (GGUF) + Cloud providers (OpenAI, Groq, etc.) |
| Cloud Transcription | Groq, Deepgram, Custom server (FastAPI) |
| Security | OS Keychain, Context Isolation, CSP, Input Validation |
| Testing | Vitest, Playwright, Rust tests, GitHub Actions CI |

🚀 Development

Prerequisites

Quick Start

# Clone the repository
git clone https://github.com/ai-dev-2024/sonu.git
cd sonu/apps/tauri-v2

# Install dependencies
bun install

# Run in development
bun run tauri dev

# Build for production
bun run tauri build

Commands

bun run dev           # Start Vite dev server
bun run tauri dev     # Start full Tauri dev environment
bun run build         # Build frontend
bun run tauri build   # Build production binary
bun run test          # Run Vitest unit tests
bun run test:e2e      # Run Playwright E2E tests
bun run lint          # ESLint check
bun run format        # Prettier format
bun run typecheck     # TypeScript check

Self-Hosted Server

Run your own transcription server with Docker:

cd server
docker compose up -d

See server/README.md for full setup instructions.


🛡️ Security

SONU is designed with security-first principles:

  • 🔒 No telemetry: Zero data collection, no analytics, no phone-home
  • 🔐 OS Keychain: API keys stored in your OS's secure credential store
  • 🧱 Context Isolation: Renderer process fully sandboxed
  • 🛡️ CSP Headers: Content Security Policy prevents injection attacks
  • ✅ Input Validation: All IPC parameters validated against schemas
  • 📁 Path Sanitization: Prevents path traversal attacks
  • 🚫 No eval(): ESLint enforces no dynamic code execution
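
As an illustration of the path sanitization item, the sketch below resolves a user-supplied relative path against a base directory and rejects anything that would escape it. The function name `resolveInside` and its rules are assumptions for this example only; SONU's actual checks may be implemented differently, for instance in the Rust backend.

```typescript
// Resolve a user-supplied relative path against a base directory, rejecting
// anything that would escape it. Returns null when the path is rejected.
// Hypothetical helper for illustration; not taken from the SONU codebase.
function resolveInside(baseDir: string, userPath: string): string | null {
  // Reject NUL bytes and absolute paths (POSIX "/", Windows "C:" or "\\server").
  if (userPath.includes("\0") || /^([\/\\]|[A-Za-z]:)/.test(userPath)) {
    return null;
  }
  const resolved: string[] = [];
  for (const segment of userPath.split(/[\/\\]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Going above the base directory is a traversal attempt.
      if (resolved.length === 0) return null;
      resolved.pop();
    } else {
      resolved.push(segment);
    }
  }
  if (resolved.length === 0) return null; // nothing left to address
  // Strip trailing separators from the base before joining.
  return baseDir.replace(/[\/\\]+$/, "") + "/" + resolved.join("/");
}
```

A path such as `recordings/a.wav` resolves inside the base directory, while `../etc/passwd`, `/etc/passwd`, and `a/../../b` are all rejected. This string-level check does not follow symbolic links, so a real implementation would also need to canonicalize paths on disk.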

🛣️ Roadmap

✅ Shipped

  • Offline voice-to-text with Whisper & Parakeet
  • AI text enhancement (local + cloud LLMs)
  • Cloud transcription (Groq, Deepgram, custom server)
  • Custom dictionary & text snippets
  • Voice notes with search & playback
  • Multi-theme support (dark, light, custom)
  • 50+ language support with auto-detection
  • Cross-platform support (Windows, macOS, Linux)

🚧 In Progress

  • Real-time streaming transcription
  • Custom model fine-tuning
  • Plugin / extension system
  • Voice commands & macros

🔮 Future

  • Team collaboration features
  • Mobile companion app
  • Browser extension
  • Cloud sync (optional, encrypted)

🤝 Contributing

We welcome contributions! Whether it's bug fixes, features, translations, or docs:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run checks: bun run lint && bun run test && bun run typecheck
  5. Commit: git commit -m "feat: add amazing feature"
  6. Push and open a Pull Request

See AGENTS.md for development guidelines and coding conventions.


📚 Documentation

| Document | Description |
|---|---|
| AGENTS.md | AI assistant guidelines & build commands |
| CHANGELOG.md | Version history & release notes |
| docs/DEVELOPMENT.md | Development setup guide |
| docs/CONTRIBUTING.md | Contribution guidelines |
| docs/TAURI_V2_MIGRATION_GUIDE.md | Tauri v2 migration guide |
| server/README.md | Self-hosted server setup |
| plans/CODEBASE_IMPROVEMENT_PLAN.md | Future improvement roadmap |

📝 License

MIT License. Free for personal and commercial use.


🙏 Acknowledgments

  • whisper.cpp: Fast C++ Whisper inference
  • Tauri: Secure, lightweight desktop framework
  • NVIDIA Parakeet: High-accuracy English ASR
  • Electron: Cross-platform desktop apps

Made with ❤️ for people who think faster than they type.

⭐ Star on GitHub · Download

SONU is not affiliated with OpenAI. Whisper is a trademark of OpenAI.
