sound2transcript

Capture macOS system audio from video lectures and transcribe it locally using whisper.cpp. No cloud. No subscription. Supports English and Brazilian Portuguese via auto-detection.

What it does

Routes system audio through the BlackHole 2ch virtual audio driver
Records sessions as 16 kHz mono WAV via ffmpeg
Transcribes with whisper-cli (default: large-v3-turbo with Metal GPU acceleration on Apple Silicon)
Outputs .txt (required), optionally .srt and .vtt
Garbage-collects old recordings on a schedule via launchd
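
As a rough illustration of the record-then-transcribe flow listed above, the sketch below prints one ffmpeg command and one whisper-cli command without running them. It is not the stream-transcribe script itself: the function names and example file names are hypothetical, and the exact flags the project passes may differ.

```shell
#!/bin/sh
# Illustrative sketch only: print (do not run) the two commands that the
# pipeline described above roughly corresponds to. Function names and the
# example file names are hypothetical, not taken from stream-transcribe.

# Print an ffmpeg command that records from an avfoundation audio device
# into a 16 kHz, mono, 16-bit PCM WAV file.
record_cmd() {
  device=$1
  wav=$2
  printf 'ffmpeg -f avfoundation -i ":%s" -ac 1 -ar 16000 -c:a pcm_s16le "%s"\n' "$device" "$wav"
}

# Print a whisper-cli command that transcribes the WAV file with automatic
# language detection and writes .txt and .srt files using the given prefix.
transcribe_cmd() {
  model=$1
  wav=$2
  out_prefix=$3
  printf 'whisper-cli -m "%s" -f "%s" -l auto -otxt -osrt -of "%s"\n' "$model" "$wav" "$out_prefix"
}

record_cmd "BlackHole 2ch" "session.wav"
transcribe_cmd "ggml-large-v3-turbo-q5_0.bin" "session.wav" "session"
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without BlackHole or a model installed.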

Install

Option A: Homebrew (recommended)

brew tap jmcoimbra/tap
brew install sound2transcript

This installs stream-transcribe and sound2transcript-gc into your PATH, and pulls in ffmpeg and whisper-cpp as dependencies automatically.

After installing, complete the one-time setup:

# 1. Install the virtual audio driver
brew install --cask blackhole-2ch

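# 2. Download the Whisper model (ggml-medium, about 1.5 GB)
#    MODEL_PATH in config.env must point at the file you download (see Configuration)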
curl -L --progress-bar \
  -o "$(brew --prefix)/var/sound2transcript/models/ggml-medium.bin" \
  "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin"

# 3. Configure audio routing (required)
open docs/SETUP.md  # or see Audio Routing below
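
After the setup steps above, a quick preflight check can confirm that the tools are on your PATH and that the model file exists. This is a minimal sketch, not something shipped with sound2transcript; `have_tool` and `preflight` are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative preflight check; not part of sound2transcript itself.

# Return 0 if the named command is on PATH, 1 otherwise.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report each expected tool and the model file.
# Returns non-zero if anything is missing.
preflight() {
  model_file=$1
  missing=0
  for tool in ffmpeg whisper-cli stream-transcribe sound2transcript-gc; do
    if have_tool "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # -s is true only for a file that exists and is not empty.
  if [ -s "$model_file" ]; then
    echo "ok: model $model_file"
  else
    echo "missing: model $model_file"
    missing=1
  fi
  return "$missing"
}

# Example call (adjust the path to wherever you saved the model):
#   preflight "$(brew --prefix)/var/sound2transcript/models/ggml-medium.bin"
```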

Option B: From source

git clone https://github.com/jmcoimbra/sound2transcript.git
cd sound2transcript
make install
make download-model

Requires Homebrew and macOS 13+.

Update

Homebrew

brew update
brew upgrade sound2transcript

That's it. Homebrew handles fetching the new version and replacing the binaries.

Your transcripts, recordings, model, and config are untouched - they live in the data directory, not in the Homebrew prefix.

From source

cd sound2transcript
git pull
make install

This re-runs the install and copies the updated scripts over the existing ones. Your data and config are preserved.

Uninstall

Homebrew

brew uninstall sound2transcript
brew untap jmcoimbra/tap  # optional: remove the tap

From source

make uninstall

Both methods leave your data at ~/sound2transcript/ intact. Remove it manually if you no longer need it:

rm -rf ~/sound2transcript

Audio routing

See docs/SETUP.md - required one-time setup to route system audio through BlackHole before first use.

Choosing a model

See docs/MODELS.md for model comparison, architecture-specific install instructions (Apple Silicon vs Intel), and thread tuning.

Apple Silicon users: Make sure you're using ARM Homebrew (/opt/homebrew/bin/brew), not Intel Homebrew (/usr/local/bin/brew). Intel Homebrew runs under Rosetta 2 and produces binaries with no Metal GPU access - transcription will be 10-20x slower.
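
One way to check which Homebrew you are running is to look at its prefix, since the two default install locations differ as noted above. The helper below is a hypothetical sketch; it only classifies the default prefixes and reports "unknown" for custom ones.

```shell
#!/bin/sh
# Illustrative check; brew_flavor is a hypothetical helper name.

# Classify a Homebrew prefix by its default install location.
brew_flavor() {
  case $1 in
    /opt/homebrew*) echo "arm" ;;
    /usr/local*)    echo "intel" ;;
    *)              echo "unknown" ;;
  esac
}

# Only query brew when it is actually installed.
if command -v brew >/dev/null 2>&1; then
  echo "machine: $(uname -m), brew: $(brew_flavor "$(brew --prefix)")"
fi
```

On an Apple Silicon Mac you would want to see `arm64` for the machine and `arm` for brew.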

Use

Start recording:

stream-transcribe

Press Ctrl+C to stop. Transcription runs automatically. Output goes to ~/sound2transcript/transcripts/.

Keep the WAV file after transcription:

stream-transcribe --keep

To always keep recordings, set KEEP_RECORDINGS="1" in your config.
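
The interaction between the config value and the flag can be pictured as a simple precedence rule: `--keep` wins, otherwise `KEEP_RECORDINGS` decides. The sketch below assumes that rule; `should_keep` is a hypothetical helper, and the real script may implement it differently.

```shell
#!/bin/sh
# Illustrative sketch of the precedence described above; should_keep is a
# hypothetical helper, not a function from stream-transcribe.

# Usage: should_keep CONFIG_VALUE [ARGS...]
# Succeeds (exit 0) when the WAV should be kept.
should_keep() {
  config_value=${1:-0}
  if [ "$#" -gt 0 ]; then
    shift
  fi
  # A --keep flag on the command line wins regardless of the config value.
  for arg in "$@"; do
    if [ "$arg" = "--keep" ]; then
      return 0
    fi
  done
  # Otherwise fall back to KEEP_RECORDINGS from the config.
  [ "$config_value" = "1" ]
}

if should_keep "0" --keep; then echo "keep"; else echo "delete"; fi  # prints "keep"
if should_keep "0"; then echo "keep"; else echo "delete"; fi         # prints "delete"
```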

Check version:

stream-transcribe --version

Schedule garbage collection (optional)

make install-launchd

Runs daily at 03:30, removing old WAV files and enforcing disk caps.
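
For readers curious how a daily 03:30 schedule is expressed in launchd, the sketch below writes a minimal job definition using the `StartCalendarInterval` key. The label and program path are assumptions for illustration only; the plist that `make install-launchd` actually installs may look different.

```shell
#!/bin/sh
# Illustrative sketch: write a launchd job definition that runs a program
# daily at 03:30. The label is hypothetical, and the plist installed by
# `make install-launchd` may differ.

# Usage: write_gc_plist OUTPUT_PLIST PROGRAM_PATH
write_gc_plist() {
  plist_path=$1
  program=$2
  cat >"$plist_path" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.example.sound2transcript-gc</string>
  <key>ProgramArguments</key>
  <array>
    <string>$program</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key>
    <integer>3</integer>
    <key>Minute</key>
    <integer>30</integer>
  </dict>
</dict>
</plist>
EOF
}

# Example (not run here):
#   write_gc_plist ./sound2transcript-gc.plist "$(command -v sound2transcript-gc)"
```

Per-user jobs normally live in ~/Library/LaunchAgents; prefer the Makefile target over hand-written files unless you need a custom schedule.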

Directory layout

~/sound2transcript/
├── models/         # whisper model files
├── recordings/     # intermediate WAV files (deleted after transcription unless --keep)
├── transcripts/    # output .txt / .srt / .vtt
├── logs/           # session and gc logs
└── config/         # config.env

Configuration

All settings are in ~/sound2transcript/config/config.env:

| Variable | Default | Description |
|---|---|---|
| `BLACKHOLE_DEVICE_NAME` | `BlackHole 2ch` | Audio loopback device name |
| `MODEL_PATH` | `~/sound2transcript/models/ggml-large-v3-turbo-q5_0.bin` | Whisper model path (see Choosing a model) |
| `LANG` | `auto` | Language: `auto`, `en`, or `pt` |
| `OUTPUT_TXT` | `1` | Generate .txt output |
| `OUTPUT_SRT` | `1` | Generate .srt subtitles |
| `OUTPUT_VTT` | `0` | Generate .vtt subtitles |
| `RECORDINGS_RETENTION_DAYS` | `3` | Days to keep WAV files |
| `TRANSCRIPTS_RETENTION_DAYS` | `90` | Days to keep transcripts (0 = forever) |
| `RECORDINGS_MAX_GB` | `10` | Max disk space for recordings, in GB |
| `WHISPER_THREADS` | `4` | CPU threads for transcription |
| `SILENCE_THRESHOLD_DB` | `-50` | Volume threshold (dB) below which a recording is flagged as silent |
| `KEEP_RECORDINGS` | `0` | Keep WAV after transcription (`1`=keep, `0`=delete). Override with `--keep` |
| `LOG_LEVEL` | `info` | Log verbosity: `info`, `warn`, or `error` |
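
To make the settings above concrete, here is a minimal sketch of how defaults and overrides could combine, and how the retention window might be applied. It assumes config.env is a file of plain shell assignments (such as `KEEP_RECORDINGS="1"`) that can be sourced; `load_config` and `expired_recordings` are hypothetical names, and the real scripts may work differently.

```shell
#!/bin/sh
# Illustrative sketch: apply defaults, let a config file override them, then
# list recordings older than the retention window. Function names are
# hypothetical; the real scripts may load config differently.

# Usage: load_config /absolute/path/to/config.env
load_config() {
  # Defaults copied from the table above (only a few shown).
  RECORDINGS_RETENTION_DAYS=3
  OUTPUT_VTT=0
  WHISPER_THREADS=4
  # Assumption: the config file is plain shell assignments, so sourcing it
  # overrides any default it mentions and leaves the rest alone.
  if [ -f "$1" ]; then
    . "$1"
  fi
}

# Print WAV files under a directory that were last modified more than
# RECORDINGS_RETENTION_DAYS days ago (dry run: nothing is deleted).
expired_recordings() {
  find "$1" -type f -name '*.wav' -mtime +"$RECORDINGS_RETENTION_DAYS"
}

# Example (not run here):
#   load_config "$HOME/sound2transcript/config/config.env"
#   expired_recordings "$HOME/sound2transcript/recordings"
```

Listing candidates before deleting anything is a deliberate choice here; it lets you confirm the retention setting does what you expect.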

Development

make lint       # shellcheck + shfmt
make test       # bats-core tests
make check      # lint + test

Releasing a new version

1. Bump the version in VERSION
2. Commit: git commit -am "Bump version to X.Y.Z"
3. Tag and push: make release
4. Create the GitHub release: gh release create vX.Y.Z
5. Update the SHA in the homebrew-tap formula

License

MIT
