A professional web application for audio transcription and summarization, built with a Go backend and Red Hat PatternFly frontend design system.
- 🎤 Audio Recording: Record audio directly from your microphone
- 📁 File Upload: Upload existing WAV audio files
- 🌍 Language Support: Optional language hints for improved transcription accuracy (15 languages)
- 📝 Transcription: Convert audio to text using OpenAI-compatible Whisper API
- 📊 Summarization: Generate concise summaries using OpenAI-compatible LLM API
- 🎨 Professional UI: Enterprise-grade design with Red Hat PatternFly
- 📋 Copy to Clipboard: Easy copying of transcriptions and summaries
- 📱 Responsive Design: Works on desktop, tablet, and mobile devices
- Pure Go standard library (no external dependencies)
- Serves static files (HTML, CSS, JavaScript)
- Proxies requests to Whisper API for transcription
- Proxies requests to LLM API for summarization
- Handles file uploads up to 500MB
- Simple and maintainable (~300 lines of code)
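As an illustration of the upload limit above, here is a minimal sketch of how a Go handler could cap request bodies using only the standard library. It is not taken from `server.go`: the names `limitUpload`, `consumeBody`, `statusFor`, and the `/api/transcribe` route are hypothetical, and it assumes "500MB" means 500 MiB applied to the whole request body.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxUploadBytes mirrors the documented 500MB limit (assumed here to be 500 MiB).
const maxUploadBytes = 500 << 20

// limitUpload wraps a handler so that reading more than limit bytes
// from the request body fails with an error.
func limitUpload(limit int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// consumeBody is a stand-in handler (hypothetical) that only drains the body.
func consumeBody(w http.ResponseWriter, r *http.Request) {
	if _, err := io.Copy(io.Discard, r.Body); err != nil {
		http.Error(w, "upload too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends n bytes through the limited handler and returns the status code.
func statusFor(limit int64, n int) int {
	body := strings.NewReader(strings.Repeat("a", n))
	req := httptest.NewRequest(http.MethodPost, "/api/transcribe", body) // hypothetical route
	rec := httptest.NewRecorder()
	limitUpload(limit, http.HandlerFunc(consumeBody)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// A tiny limit keeps the demonstration cheap; a real server would pass maxUploadBytes.
	fmt.Println(statusFor(16, 8), statusFor(16, 32))
	fmt.Println("configured limit in bytes:", maxUploadBytes)
}
```

Wrapping the body with `http.MaxBytesReader` keeps the check in one place, so every upload handler behind the wrapper gets the same limit.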
- Red Hat PatternFly 5 design system
- Vanilla JavaScript (no frameworks)
- Audio recording with MediaRecorder API
- WAV format conversion
- Markdown rendering for summaries (marked.js)
- Red Hat typography (Red Hat Display & Red Hat Text fonts)
- Docker: For building and running the containerized application
- Whisper API Server: OpenAI-compatible transcription service
- LLM API Server: OpenAI-compatible chat completion service
cd transcription-webapp
Set the required environment variables for your API endpoints:
export AUDIO_INFERENCE_URL=http://localhost:8000
export LLM_INFERENCE_URL=http://localhost:8001
Optional environment variables:
export AUDIO_MODEL_NAME=whisper-1 # Default: whisper-1
export LLM_MODEL_NAME=gpt-3.5-turbo # Default: gpt-3.5-turbo
export PORT=8080 # Default: 8080
make build
This will:
- Build a Docker image using Red Hat UBI9
- Compile the Go server inside the container
- Package static files (HTML, CSS, JavaScript)
make run
The application will start and be available at: http://localhost:8080
Open your web browser and navigate to:
http://localhost:8080
| Command | Description |
|---|---|
| `make help` | Display available commands and usage |
| `make build` | Build the Docker container image |
| `make run` | Run the application container |
| `make stop` | Stop and remove the container |
| `make clean` | Stop container and remove image |
| `make logs` | Show application logs (follow mode) |
| `make restart` | Restart the container |
- Click "Start Recording" to begin recording from your microphone
- Speak clearly into your microphone
- Click "Stop Recording" when finished
- The recording will be automatically converted to WAV format
- An audio player will appear for playback review
- Click "Choose WAV File" under the Upload section
- Select a WAV audio file from your computer
- The filename will be displayed once selected
- Use the "Audio Language" dropdown to optionally specify the spoken language
- Default is "Auto-detect" (Whisper automatically detects the language)
- Supported languages:
  - English, French, Spanish, German, Italian
  - Portuguese, Dutch, Polish, Russian
  - Chinese, Japanese, Korean, Arabic, Hindi, Turkish
- Note: This is a hint for accuracy, NOT for translation
  - The audio is transcribed in its original language
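The language hint described above could be turned into the ISO 639-1 code sent to the Whisper API along these lines. This is a sketch only: `languageCodes` and `languageHint` are hypothetical names, and the actual dropdown values used by the frontend are not shown in this README.

```go
package main

import (
	"fmt"
	"strings"
)

// languageCodes maps the supported languages listed above to ISO 639-1 codes.
var languageCodes = map[string]string{
	"english": "en", "french": "fr", "spanish": "es", "german": "de", "italian": "it",
	"portuguese": "pt", "dutch": "nl", "polish": "pl", "russian": "ru",
	"chinese": "zh", "japanese": "ja", "korean": "ko", "arabic": "ar", "hindi": "hi", "turkish": "tr",
}

// languageHint returns the code to send as the optional language field.
// An empty code with ok == true means "auto-detect" (send no hint).
// ok == false means the language is not in the supported list.
func languageHint(name string) (code string, ok bool) {
	key := strings.ToLower(strings.TrimSpace(name))
	if key == "" || key == "auto-detect" {
		return "", true
	}
	code, ok = languageCodes[key]
	return code, ok
}

func main() {
	for _, name := range []string{"French", "Auto-detect", "Klingon"} {
		code, ok := languageHint(name)
		fmt.Printf("%-12s code=%q supported=%v\n", name, code, ok)
	}
}
```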
- After recording or uploading a WAV file, click "Transcribe Audio"
- A loading spinner will appear during transcription
- The transcription result will be displayed in a text block
- Options:
  - Summarize: Generate a summary of the transcription
  - Copy: Copy transcription to clipboard
  - New Transcription: Reset and start over
- After transcription is complete, click "Summarize"
- A loading spinner will appear during summary generation
- The summary will be displayed with Markdown formatting
- Click "Copy Summary" to copy the plain text to clipboard
Endpoint: POST /v1/audio/transcriptions
Format: OpenAI-compatible API
Request:
- Content-Type: `multipart/form-data`
- Fields:
  - `file`: WAV audio file
  - `model`: Model name (from `AUDIO_MODEL_NAME`)
  - `language`: Optional ISO 639-1 language code
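To make the request fields above concrete, the following sketch builds such a multipart request with the Go standard library. It is an illustration rather than the code in `server.go`: `newTranscriptionRequest` is a hypothetical helper, and it takes the audio as a byte slice for brevity (a real proxy handling large uploads would more likely stream from an `io.Reader`).

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
	"strings"
)

// newTranscriptionRequest builds a multipart/form-data POST for
// /v1/audio/transcriptions with the file, model, and optional language fields.
func newTranscriptionRequest(baseURL, model, language, filename string, audio []byte) (*http.Request, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)

	part, err := mw.CreateFormFile("file", filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(audio); err != nil {
		return nil, err
	}
	if err := mw.WriteField("model", model); err != nil {
		return nil, err
	}
	// The language hint is optional; omit the field to let Whisper auto-detect.
	if language != "" {
		if err := mw.WriteField("language", language); err != nil {
			return nil, err
		}
	}
	// Close writes the trailing boundary; it must happen before the body is sent.
	if err := mw.Close(); err != nil {
		return nil, err
	}

	url := strings.TrimRight(baseURL, "/") + "/v1/audio/transcriptions"
	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newTranscriptionRequest("http://localhost:8000", "whisper-1", "fr", "clip.wav", []byte("RIFF"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The request can then be sent with `http.DefaultClient.Do(req)` or a client with an explicit timeout.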
Response:
{
"text": "Transcribed text here..."
}
Endpoint: POST /v1/chat/completions
Format: OpenAI-compatible API
Request:
{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant that summarizes transcribed audio..."
},
{
"role": "user",
"content": "Please summarize the following transcription:\n\n..."
}
],
"temperature": 0.7
}
Response Formats Supported:
- OpenAI format: `choices[0].message.content`
- Harmony format: `response`
- Fallback: `text`
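One way to handle the three response shapes listed above is to decode them into a single struct and check the fields in order. The sketch below assumes each shape is a JSON object with the field names shown; `summaryResponse` and `extractSummary` are hypothetical names, not identifiers from `server.go`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// summaryResponse covers the OpenAI, Harmony, and fallback shapes in one struct.
type summaryResponse struct {
	Choices []struct {
		Message struct {
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
	Response string `json:"response"`
	Text     string `json:"text"`
}

// extractSummary returns the first non-empty summary, checking
// choices[0].message.content, then response, then text.
func extractSummary(body []byte) (string, error) {
	var r summaryResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decode summary response: %w", err)
	}
	if len(r.Choices) > 0 && r.Choices[0].Message.Content != "" {
		return r.Choices[0].Message.Content, nil
	}
	if r.Response != "" {
		return r.Response, nil
	}
	if r.Text != "" {
		return r.Text, nil
	}
	return "", errors.New("no summary text found in response")
}

func main() {
	samples := []string{
		`{"choices":[{"message":{"content":"from OpenAI format"}}]}`,
		`{"response":"from Harmony format"}`,
		`{"text":"from fallback"}`,
	}
	for _, s := range samples {
		summary, err := extractSummary([]byte(s))
		fmt.Println(summary, err)
	}
}
```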
| Variable | Required | Default | Description |
|---|---|---|---|
| `AUDIO_INFERENCE_URL` | Yes | - | Whisper API endpoint (e.g., http://localhost:8000) |
| `LLM_INFERENCE_URL` | Yes | - | LLM API endpoint (e.g., http://localhost:8001) |
| `AUDIO_MODEL_NAME` | No | `whisper-1` | Whisper model name |
| `LLM_MODEL_NAME` | No | `gpt-3.5-turbo` | LLM model name |
| `PORT` | No | `8080` | Server port |
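The table above can be read as a small configuration loader: two required values and three with defaults. The following is a sketch under that reading, not the actual startup code; `config`, `loadConfig`, and the injected `getenv` parameter are hypothetical and exist mainly to make the logic easy to exercise.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the five documented settings.
type config struct {
	AudioURL   string
	LLMURL     string
	AudioModel string
	LLMModel   string
	Port       string
}

// loadConfig reads settings through getenv (os.Getenv in production),
// applies the documented defaults, and rejects missing required values.
func loadConfig(getenv func(string) string) (config, error) {
	orDefault := func(key, fallback string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return fallback
	}
	cfg := config{
		AudioURL:   getenv("AUDIO_INFERENCE_URL"),
		LLMURL:     getenv("LLM_INFERENCE_URL"),
		AudioModel: orDefault("AUDIO_MODEL_NAME", "whisper-1"),
		LLMModel:   orDefault("LLM_MODEL_NAME", "gpt-3.5-turbo"),
		Port:       orDefault("PORT", "8080"),
	}
	if cfg.AudioURL == "" || cfg.LLMURL == "" {
		return config{}, errors.New("AUDIO_INFERENCE_URL and LLM_INFERENCE_URL must be set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("listening on :%s, audio=%s, llm=%s\n", cfg.Port, cfg.AudioURL, cfg.LLMURL)
}
```

Failing fast on missing required variables matches the "Container exits immediately" symptom described under troubleshooting, though whether the real server behaves exactly this way is an assumption.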
transcription-webapp/
├── Dockerfile # Multi-stage build with UBI9
├── Makefile # Build and run commands
├── server.go # Go backend (~300 lines)
├── static/
│ ├── index.html # PatternFly UI
│ ├── style.css # Custom Red Hat styles
│ └── app.js # Frontend logic
├── README.md # This file
└── prompt.md # Development specification
- Format: WAV (Waveform Audio File Format)
- Encoding: PCM (Pulse Code Modulation)
- Sample Rates: 16kHz, 44.1kHz, or 48kHz
- Browser Recording: Automatic conversion from MediaRecorder output
- File size limit: 500MB
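For readers who want to check an upload against the format requirements above, here is a minimal sketch that reads the RIFF/WAVE header and the `fmt ` chunk. It is not the validation used by the application (which this README does not show); `wavInfo`, `parseWAVHeader`, and `minimalWAVHeader` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// wavInfo holds the fields of the fmt chunk relevant to the requirements above.
type wavInfo struct {
	AudioFormat uint16 // 1 means uncompressed PCM
	Channels    uint16
	SampleRate  uint32
}

// parseWAVHeader checks the RIFF/WAVE magic and walks chunks until it finds "fmt ".
func parseWAVHeader(b []byte) (wavInfo, error) {
	if len(b) < 12 || !bytes.Equal(b[0:4], []byte("RIFF")) || !bytes.Equal(b[8:12], []byte("WAVE")) {
		return wavInfo{}, errors.New("not a RIFF/WAVE file")
	}
	for off := 12; off+8 <= len(b); {
		id := string(b[off : off+4])
		size := binary.LittleEndian.Uint32(b[off+4 : off+8])
		data := off + 8
		if uint64(size) > uint64(len(b)-data) {
			return wavInfo{}, errors.New("chunk extends past end of data")
		}
		if id == "fmt " {
			if size < 16 {
				return wavInfo{}, errors.New("fmt chunk too short")
			}
			return wavInfo{
				AudioFormat: binary.LittleEndian.Uint16(b[data : data+2]),
				Channels:    binary.LittleEndian.Uint16(b[data+2 : data+4]),
				SampleRate:  binary.LittleEndian.Uint32(b[data+4 : data+8]),
			}, nil
		}
		off = data + int(size) + int(size%2) // chunks are padded to even length
	}
	return wavInfo{}, errors.New("fmt chunk not found")
}

// minimalWAVHeader builds a 44-byte mono 16-bit PCM header with no audio data.
func minimalWAVHeader(sampleRate uint32) []byte {
	var buf bytes.Buffer
	le := binary.LittleEndian
	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36)) // RIFF size for an empty data chunk
	buf.WriteString("WAVEfmt ")
	binary.Write(&buf, le, uint32(16))    // fmt chunk size
	binary.Write(&buf, le, uint16(1))     // PCM
	binary.Write(&buf, le, uint16(1))     // mono
	binary.Write(&buf, le, sampleRate)    // sample rate
	binary.Write(&buf, le, sampleRate*2)  // byte rate
	binary.Write(&buf, le, uint16(2))     // block align
	binary.Write(&buf, le, uint16(16))    // bits per sample
	buf.WriteString("data")
	binary.Write(&buf, le, uint32(0))
	return buf.Bytes()
}

func main() {
	info, err := parseWAVHeader(minimalWAVHeader(16000))
	fmt.Println(info.AudioFormat == 1, info.Channels, info.SampleRate, err)
}
```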
- HTML escaping to prevent XSS attacks
- Directory traversal prevention
- Input validation for file types
- Non-root container execution (user 1001)
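Two of the measures above, HTML escaping and directory traversal prevention, can be sketched with the Go standard library as follows. This is an assumption about how such checks might look, not an excerpt from `server.go`; `resolveStatic` and `escapeForHTML` are hypothetical helpers, and the README does not say whether escaping happens in the backend or in `app.js`.

```go
package main

import (
	"errors"
	"fmt"
	"html"
	"path"
	"path/filepath"
	"strings"
)

var errTraversal = errors.New("path contains a parent-directory segment")

// resolveStatic maps a URL path onto a file under root.
// Any ".." segment is rejected outright, and the remainder is cleaned
// as a rooted path so the result cannot escape root.
func resolveStatic(root, urlPath string) (string, error) {
	for _, seg := range strings.Split(urlPath, "/") {
		if seg == ".." {
			return "", errTraversal
		}
	}
	clean := path.Clean("/" + urlPath)
	return filepath.Join(root, filepath.FromSlash(clean)), nil
}

// escapeForHTML makes untrusted text (such as a transcription) safe to embed in HTML.
func escapeForHTML(s string) string {
	return html.EscapeString(s)
}

func main() {
	for _, p := range []string{"/app.js", "/../server.go", "/css/../../etc/passwd"} {
		resolved, err := resolveStatic("static", p)
		fmt.Printf("%-24s -> %q %v\n", p, resolved, err)
	}
	fmt.Println(escapeForHTML("<script>alert(1)</script>"))
}
```

Rejecting `..` segments before cleaning follows the same conservative approach as `http.ServeFile`; the standard `http.FileServer` also guards against traversal on its own.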
- Modern browsers with MediaRecorder API support
- Chrome, Firefox, Safari, Edge (latest versions)
- Requires HTTPS for microphone access (except localhost)
Issue: "Microphone permission denied" error
Solution:
- Check browser permissions (click the lock icon in address bar)
- Grant microphone access when prompted
- For HTTPS: Ensure valid SSL certificate
- For HTTP: Only works on localhost
Issue: "Transcription service error"
Solution:
- Verify `AUDIO_INFERENCE_URL` is correct and accessible
- Check that Whisper API server is running
- Ensure WAV file format is valid
- Check application logs: `make logs`
Issue: "Summarization service error"
Solution:
- Verify `LLM_INFERENCE_URL` is correct and accessible
- Check that LLM API server is running
- Ensure transcription text is not empty
- Check application logs: `make logs`
Issue: Container exits immediately
Solution:
- Check required environment variables are set
- View logs: `make logs`
- Verify API endpoints are reachable from container
- Use Docker network if APIs are in containers
If your Whisper and LLM services are also running in Docker containers, create a Docker network:
# Create network
docker network create transcription-net
# Run your API services on the network
docker run -d --name whisper-api --network transcription-net whisper-image
docker run -d --name llm-api --network transcription-net llm-image
# Update Makefile or use environment variables
export AUDIO_INFERENCE_URL=http://whisper-api:8000
export LLM_INFERENCE_URL=http://llm-api:8001
# Run the transcription app on the same network
docker run -d \
--name transcription-app \
--network transcription-net \
-p 8080:8080 \
-e AUDIO_INFERENCE_URL=http://whisper-api:8000 \
-e LLM_INFERENCE_URL=http://llm-api:8001 \
transcription-app
If you prefer to develop without Docker:
# Build the Go server
go build -o transcription-server server.go
# Set environment variables
export AUDIO_INFERENCE_URL=http://localhost:8000
export LLM_INFERENCE_URL=http://localhost:8001
# Run the server
./transcription-server
Access the application at: http://localhost:8080
For frontend changes (HTML, CSS, JavaScript):
- Edit files in the `static/` directory
- Refresh your browser (no rebuild needed)
For backend changes (server.go):
- Rebuild: `make build`
- Restart: `make restart`
This project is provided as-is for demonstration and educational purposes.
- Go: Backend server (https://golang.org/)
- PatternFly: Red Hat design system (https://www.patternfly.org/)
- marked.js: Markdown parser (https://marked.js.org/)
- Red Hat Fonts: Red Hat Display & Red Hat Text (https://www.redhat.com/en/about/brand/standards/typography)
- Font Awesome: Icons (https://fontawesome.com/)
For issues or questions:
- Check the Troubleshooting section
- Review application logs: `make logs`
- Verify environment variables and API endpoints
- Ensure all prerequisites are met
Built with ❤️ using Red Hat PatternFly and Go