Simple local transcribe (slt) is a CLI tool for transcribing Russian-language videos and audio.
The tool requires Python 3.11; you can install it with pyenv, for example.
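A minimal pyenv setup might look like this (3.11.9 is just an example patch release; any 3.11.x should do):
```
# Install a Python 3.11 interpreter and select it for this project directory
pyenv install 3.11.9
pyenv local 3.11.9
```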
The tool requires an additional model from Hugging Face: https://huggingface.co/pyannote/segmentation-3.0 . You should request access to this model, create a Hugging Face token, and export it as an environment variable:
```
export HF_TOKEN=<hf-token-here>
```
Download the pyannote/segmentation-3.0 model manually and place it under $HOME/.cache/huggingface/hub/models--pyannote--segmentation-3.0/snapshots/e66f3d3b9eb0873085418a7b813d3b369bf160bb/
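If you prefer not to copy the files by hand, one way to populate that cache path is the huggingface_hub CLI (a sketch, assuming huggingface_hub is installed and your account has been granted access to the gated model; it downloads into the standard $HOME/.cache/huggingface/hub layout):
```
pip install -U huggingface_hub
huggingface-cli download pyannote/segmentation-3.0 --token "$HF_TOKEN"
```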
To install the tool:
- Clone the repository from GitHub:
```
git clone git@github.com:qaohv/simple-local-transcriber.git
```
- Install it with pip:
```
cd simple-local-transcriber && pip install -e .
```
slt provides four commands:
```
Usage: slt [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  download
  extract-audio
  ru-transcribe
  ru-video-transcribe
```
You can find details about each of them below.
slt ru-video-transcribe "url-to-video"
Command to transcribe a YouTube video. It performs the following steps:
1. Download the video from YouTube and save it to disk.
2. Extract the audio from the saved video.
3. Transcribe the audio and print the result to standard output.
Example:
```
slt ru-video-transcribe https://www.youtube.com/watch?v=xuYMkaNo_gI
```
As a result, a transcription in the following format is printed to the console:
```
[timestamp_start1 - timestamp_finish1]: some text 1
[timestamp_start2 - timestamp_finish2]: some text 2
```
slt download "url-to-video"
Command to download a video from YouTube and save it to disk.
Example:
```
slt download https://www.youtube.com/watch?v=xuYMkaNo_gI
```
As a result, a video named `downloaded_video.webm` is saved to disk.
slt extract-audio "path-to-video"
Command to extract the audio from a video file (e.g. a previously downloaded YouTube video) and save it to disk.
Example:
```
slt extract-audio downloaded_video.webm
```
As a result, an audio file named `downloaded_video.wav` is saved to disk.
slt ru-transcribe "path-to-audio"
Command to transcribe an audio file and print the text to standard output.
Example:
```
slt ru-transcribe downloaded_video.wav
```
As a result, a transcription in the following format is printed to the console:
```
[timestamp_start1 - timestamp_finish1]: some text 1
[timestamp_start2 - timestamp_finish2]: some text 2
```
As you may have noticed, the first command is simply a sequential call of the next three; see the equivalent sequence below. The standalone audio-extraction and transcription commands are provided for the case where you already have a video or audio file on the system.
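For example, the following sequence is roughly equivalent to a single ru-video-transcribe call (using the default file names mentioned above):
```
# 1. Download the video (saved as downloaded_video.webm)
slt download https://www.youtube.com/watch?v=xuYMkaNo_gI
# 2. Extract the audio track (saved as downloaded_video.wav)
slt extract-audio downloaded_video.webm
# 3. Transcribe the audio and print the result to stdout
slt ru-transcribe downloaded_video.wav
```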
Useful tip:
You can save the transcription to a text file while still printing it to stdout by piping through the tee command, for example:
```
slt ru-transcribe <audio-file> | tee <text-file>
```
For speech-to-text, the GigaAM-E2E-RNNT model is used.
You can change the model via the SLT_STT_MODEL environment variable; possible options: v3_e2e_rnnt, v3_e2e_ctc, v3_rnnt, v3_ctc.
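For example, to use the CTC variant for a single run (v3_e2e_ctc is just one of the options listed above):
```
# Override the default speech-to-text model for this invocation only
SLT_STT_MODEL=v3_e2e_ctc slt ru-transcribe downloaded_video.wav
```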
Also, as mentioned above, the https://huggingface.co/pyannote/segmentation-3.0 model is used for long-audio recognition.
All models run locally, so if you trust the hf/salute-developers models, you can transcribe sensitive information.