
Flask-based media transcription app that lets you browse local audio/video files, process them asynchronously using OpenAI’s Whisper model, and download accurate text transcriptions in seconds.


achrafness/Transcripta


Media to Text Converter

A Flask web application that converts audio and video files to text using OpenAI's Whisper model.

Features

  • Upload multiple media files at once
  • Drag and drop file upload interface
  • Real-time processing progress tracking
  • Support for various audio/video formats (MP3, MP4, WAV, M4A, etc.)
  • Download transcribed text as .txt files
  • Copy text to clipboard functionality
  • Responsive web interface

Installation

  1. Install Python dependencies:
pip install -r requirements.txt
  2. Install FFmpeg (required by Whisper):
# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# CentOS/RHEL
sudo yum install ffmpeg

# macOS
brew install ffmpeg
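
Because Whisper calls FFmpeg as an external program, it can help to confirm that the binary is on your PATH before starting the app. The following is a minimal sketch using only the Python standard library; the helper name tool_available is hypothetical and is not part of app.py.

```python
import shutil


def tool_available(name="ffmpeg"):
    """Return True if an executable called `name` is found on PATH."""
    # shutil.which returns the full path to the executable, or None.
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available():
        print("ffmpeg found on PATH")
    else:
        print("ffmpeg not found; install it with one of the commands above")
```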

Usage

  1. Run the Flask application:
python app.py
  2. Open your web browser and go to http://localhost:5000

  3. Upload one or more media files using the drag-and-drop interface

  4. Wait for the conversion to complete and view/download your transcribed text

Supported File Formats

  • Audio: MP3, M4A, WAV, MPGA
  • Video: MP4, MPEG, WebM, MOV, AVI, FLV, MKV
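
An upload handler typically checks the file extension against a list like the one above before accepting a file. The sketch below shows one way to do that; the set names and the allowed_file helper are illustrative assumptions, and the actual check in app.py may differ.

```python
from pathlib import Path

# Extensions taken from the lists above, lowercased.
AUDIO_EXTENSIONS = {"mp3", "m4a", "wav", "mpga"}
VIDEO_EXTENSIONS = {"mp4", "mpeg", "webm", "mov", "avi", "flv", "mkv"}
ALLOWED_EXTENSIONS = AUDIO_EXTENSIONS | VIDEO_EXTENSIONS


def allowed_file(filename):
    """Return True if the filename ends in a supported extension."""
    # Path.suffix includes the leading dot, e.g. ".MP3"; normalize it.
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in ALLOWED_EXTENSIONS
```

The comparison is case-insensitive, so a file named talk.MP3 is accepted, while notes.txt or a name with no extension is rejected.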

Configuration

You can modify the Whisper model in app.py by changing the model name in the load_whisper_model() function:

  • tiny.en - Fastest, English only
  • base.en - Better accuracy, English only
  • small.en - Good balance of speed and accuracy, English only
  • medium.en - Higher accuracy, English only
  • large - Best accuracy, supports multiple languages
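
As an illustration of how the model name could be made configurable, here is a hedged sketch of a load_whisper_model() function. It is not the code in app.py: the WHISPER_MODEL environment variable, the resolve_model_name helper, and the base.en default are assumptions made for this example. The only Whisper call used is whisper.load_model(name) from the openai-whisper package.

```python
import os

# Model names listed in this README.
KNOWN_MODELS = ("tiny.en", "base.en", "small.en", "medium.en", "large")


def resolve_model_name(default="base.en"):
    """Pick the model name from WHISPER_MODEL (hypothetical) or a default."""
    name = os.environ.get("WHISPER_MODEL", default)
    if name not in KNOWN_MODELS:
        raise ValueError(f"Unknown Whisper model: {name!r}")
    return name


def load_whisper_model(name=None):
    """Load the chosen model; the first call downloads the weights."""
    # Imported lazily so the name check above works without Whisper installed.
    import whisper

    return whisper.load_model(name or resolve_model_name())
```

With this layout, running the app with WHISPER_MODEL=small.en set in the environment would select the small English model without editing the source.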

API Endpoints

  • GET / - Main upload page
  • POST /upload - Upload files for processing
  • GET /status/<job_id> - Check processing status
  • GET /result/<job_id> - View transcription result
  • GET /download/<job_id> - Download transcription as text file
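
The status endpoint can be polled from a script until a job finishes. The sketch below uses only the standard library and makes several assumptions that this README does not confirm: that /status/<job_id> returns JSON, that the JSON has a "status" field, and that "completed" and "error" are its terminal values. Adjust these to match the actual responses from app.py.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:5000"


def endpoint(kind, job_id, base_url=BASE_URL):
    """Build the URL for one of the per-job endpoints listed above."""
    if kind not in ("status", "result", "download"):
        raise ValueError(f"Unknown endpoint: {kind!r}")
    return f"{base_url}/{kind}/{job_id}"


def fetch_json(url):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_json, interval=2.0, timeout=600.0):
    """Poll the status endpoint until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(endpoint("status", job_id))
        # "completed" and "error" are assumed status values.
        if payload.get("status") in ("completed", "error"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {timeout}s")
        time.sleep(interval)
```

Once wait_for_job returns, the transcription can be fetched from endpoint("download", job_id). The fetch parameter lets you substitute a different HTTP client or a stub for testing.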

Notes

  • Maximum file size: 500 MB
  • Files are processed asynchronously in the background
  • Uploaded files are automatically deleted after processing
  • Transcribed text files are saved temporarily for download
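
The background processing and cleanup behaviour described above can be sketched with a worker thread per job. This is an illustration under stated assumptions, not the implementation in app.py: the jobs dictionary, the start_job and _run_job names, and the status strings are all hypothetical, and transcribe stands in for whatever function calls Whisper.

```python
import os
import threading
import uuid

# In-memory job table; a hypothetical stand-in for the app's job tracking.
jobs = {}
jobs_lock = threading.Lock()


def _run_job(job_id, media_path, transcribe):
    """Transcribe one file, record the outcome, and delete the upload."""
    try:
        text = transcribe(media_path)
        update = {"status": "completed", "text": text}
    except Exception as exc:
        update = {"status": "error", "error": str(exc)}
    finally:
        # Remove the uploaded file whether or not transcription succeeded.
        if os.path.exists(media_path):
            os.remove(media_path)
    with jobs_lock:
        jobs[job_id].update(update)


def start_job(media_path, transcribe):
    """Register a job and process it on a background thread."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "processing"}
    worker = threading.Thread(
        target=_run_job, args=(job_id, media_path, transcribe), daemon=True
    )
    worker.start()
    return job_id, worker
```

In a Flask app, the 500 MB limit is commonly enforced by setting app.config["MAX_CONTENT_LENGTH"] = 500 * 1024 * 1024, which makes Flask reject larger request bodies; whether app.py uses this setting is an assumption.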
