transcriber

A Discord bot that transcribes voice messages.

Structure

The bot is split into two parts:

bot/ handles the connection to Discord's gateway and API, and communicates with the workers over Redis.
worker/ runs the transcription jobs and sends the results back to the bot (the "front-end"). This component can be scaled independently across as many machines as needed, and jobs are split evenly between them.

To-do

Properly handle messages longer than 2000 characters (right now it just crashes...).
Use message flags to determine voice messages, rather than the name of the file.
Add a context menu action to transcribe voice messages.
(long term) Migrate off Celery to a more robust task management system, probably something custom-built. This would involve a rewrite of the bot.
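
As a rough sketch of how the first two items might be approached (not the bot's actual code): the helpers below split a long transcript into chunks that fit Discord's 2000-character message limit, and check the voice-message bit (`1 << 13`) in a message's flags bitfield. The names `split_message` and `is_voice_message` are hypothetical, and the sketch assumes the limit can be measured with Python's `len()`.

```python
DISCORD_MESSAGE_LIMIT = 2000  # per-message character limit mentioned in the to-do list
IS_VOICE_MESSAGE = 1 << 13    # voice-message bit in Discord's message flags bitfield


def split_message(text: str, limit: int = DISCORD_MESSAGE_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at a space so words stay intact; falls back to a
    hard cut when a single run of text is longer than the limit.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Last space at or before the limit; -1 means there is none.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard cut at the limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def is_voice_message(flags: int) -> bool:
    """Return True if the raw message flags integer has the voice-message bit set."""
    return bool(flags & IS_VOICE_MESSAGE)
```

Each chunk returned by `split_message` could then be sent as its own message, and `is_voice_message` would take the raw `flags` integer from a message payload in place of a filename check.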

History

The original version of this bot used whisper.cpp and ran on the CPU. This worked, but was pretty slow, as CPU inference typically is. My solution was a two-pass system, where the bot processed messages with the base model first, and then with the medium model for higher quality. Eventually, I was able to upgrade the host machine with a GPU, and configured the bot to use that instead. However, due to bugs in whisper.cpp's CUDA implementation, it hallucinated a lot, to the point that the outputs were nearly unusable. I eventually just switched to the official implementation, which was fast enough to get rid of the two-pass system. I tried to clean up the code to remove a lot of the two-pass weirdness, but things are still a bit messy.
