Turn your Discord TTRPG sessions into organized campaign notes automatically.
Note
This project was inspired by Automating D&D Notetaking with AI. The original code in the transcript_cleanup folder comes from dnd-transcript-cleanup.
Transform hours of audio recordings into comprehensive campaign documentation:
- One command: audio files → clean transcripts → campaign documents
- AI-powered: Generate NPC profiles, location notes, story summaries
- Flexible: Works with any audio format, customizable processing
- Complete: From recording to campaign documentation in minutes
You'll need Python 3.10 through 3.13 (Python 3.14+ support coming soon). Check your version:
python3 --version
Python 3.12 is recommended for the best experience with AI features. If you have Python 3.9 or older, upgrade to Python 3.12.
Note
Python 3.14 was just released and the numba dependency (required by Whisper) doesn't support it yet. If you have Python 3.14, use Python 3.12 instead (see instructions below).
FFmpeg is required to transcribe audio files. Install it before running transcription:
macOS (using Homebrew):
brew install ffmpeg
Linux (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install ffmpeg
Windows: Download from ffmpeg.org and add to your PATH.
Verify installation:
ffmpeg -version
If you get "command not found", ffmpeg is not installed or not in your PATH.
On macOS, install Homebrew, then install Python with brew install python.
Add Homebrew's bin directory to your PATH in ~/.zshrc:
# For Apple Silicon
export PATH="/opt/homebrew/bin:$PATH"
# For Intel Mac
export PATH="/usr/local/bin:$PATH"
Then run source ~/.zshrc to reload.
For zsh
echo 'alias python=python3' >> ~/.zshrc
echo 'alias pip=pip3' >> ~/.zshrc
source ~/.zshrc
Using a virtual environment keeps this project's dependencies separate from your system Python:
# 1. Create a virtual environment in the project directory
# If you have multiple Python versions, use python3.12 explicitly:
python3.12 -m venv venv
# 2. Activate the virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
# You should see (venv) in your terminal prompt now
# 3. Verify Python version inside venv
python --version # Should show Python 3.12.x
# 4. Install dependencies
pip install -r requirements.txt
If you prefer to install system-wide (not recommended):
pip install -r requirements.txt
Note: Remember to activate your virtual environment (source venv/bin/activate) each time you use the project.
For AI-powered NPC and location generation, set up API keys:
# 1. Copy the example environment file
cp .env.example .env
# 2. Edit .env with your API keys (choose one or more providers):
# - OpenAI: Get key from https://platform.openai.com/api-keys
# - Anthropic: Get key from https://console.anthropic.com/
# - Google: Get key from https://console.cloud.google.com/
# Example .env content:
# ANTHROPIC_API_KEY=sk-ant-your-key-here
# OPENAI_API_KEY=sk-your-openai-key-here
Cost: AI generation typically costs $0.01-0.10 per session depending on transcript length and provider.
# Basic workflow: audio → clean transcripts
python main.py process your_session_audio/ --output-dir session_01 --all-steps
# 🚀 NEW: Add AI campaign generation
python main.py process your_session_audio/ --output-dir session_01 --all-steps --generate-campaign
# Or with existing transcript files
python main.py process your_transcripts/ --output-dir session_01 --cleanup-only
# Standalone AI generation from existing transcripts (recommended for testing)
python main.py generate session_01/Session_*_Final_COMPLETE.txt --output-dir campaign_docs --prompts dm_base_helper LOCATIONS_template NPC_template PC_metadata PC_tracker
# View your results
ls session_01/
# Session_complete_Final_COMPLETE.txt ← Your clean transcript
# *.csv files ← Processed data files
ls campaign_docs/ # AI-generated campaign documents
# NPC_your-session.md ← All NPCs found in session
# LOCATIONS_your-session.md ← All locations found in session

- Audio → Text: Whisper AI transcribes each player's audio separately
- Clean & Organize: Remove duplicates, fix timing, merge speakers chronologically (see the sketch after this list)
- Smart Corrections: Automatic text corrections for 40+ common RPG terms (e.g., fixing "stilmiss choir" → "Stillness Choir")
- Campaign Notes: Use AI prompts to generate NPC profiles, location docs, story summaries
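To make the merge step concrete, here is a minimal sketch of interleaving per-speaker transcripts chronologically, assuming one Whisper TSV per speaker with `start`, `end`, and `text` columns. The file layout and helper name are illustrative, not the project's actual code:

```python
# Sketch: interleave per-speaker transcripts by segment start time.
# Assumes one Whisper .tsv per speaker; names here are illustrative.
import glob
import os
import pandas as pd

def merge_speakers(transcript_dir: str) -> pd.DataFrame:
    frames = []
    for path in glob.glob(os.path.join(transcript_dir, "*.tsv")):
        df = pd.read_csv(path, sep="\t")  # columns: start, end, text
        # Use the file name (e.g. the Discord username) as the speaker label
        df["speaker"] = os.path.splitext(os.path.basename(path))[0]
        frames.append(df)
    # Sort all segments from all speakers into one chronological stream
    return pd.concat(frames).sort_values("start").reset_index(drop=True)
```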
Input: 45 minutes of Discord audio with 6 players
Output: Clean 8-page transcript + AI-ready prompts for campaign management
# Everything: transcribe → clean → organize
python main.py process session_audio/ --output-dir campaign_session_01 --all-steps
# Customize session info
python main.py process audio/ --output-dir session --session-name "Curse of Strahd" --session-part "Episode_3"

# Just transcribe audio files
python main.py transcribe audio.flac --output-dir transcripts
# Just clean existing transcripts
python main.py cleanup --base-path transcripts --session-name "My Campaign"
# Text corrections are now automatic (use --config for custom terms)
# Just generate AI campaign documents
python main.py generate transcript.txt --output-dir campaign --prompts NPC_template LOCATIONS_template
# Get help for any command
python main.py --help
python main.py generate --help

# 1. Record with Craig Discord bot → download audio files
# 2. One command processing (includes automatic text corrections)
python main.py process session_audio/ --output-dir "session_12" --all-steps --session-name "Waterdeep" --session-part "episode_12"
# 3. Done! Text corrections are automatic (40+ built-in RPG term corrections)
# 4. Use transcript with AI prompts from AI_Prompts/ folder

Want to add your own campaign-specific corrections?
# 1. Copy the example config
cp shared_utils/example_config.py my_config.py
# 2. Edit my_config.py and add your terms to DEFAULT_TEXT_REPLACEMENTS:
# "Your BBEG Name": ["bbeg mishear1", "bbeg mishear2"],
# "Your Important NPC": ["npc mishear1", "npc mishear2"],
# 3. Use your custom config
python main.py process audio/ --output-dir session_01 --all-steps --config my_config.py

Built-in corrections include: Stillness Choir, Ember Thieves, wizard, rogue, perception, armor, crossbow, and 30+ more RPG terms.
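For reference, here is a minimal sketch of what my_config.py from step 2 might contain, assuming DEFAULT_TEXT_REPLACEMENTS maps each correct term to a list of common mishearings. The entries below are hypothetical examples, not built-in terms:

```python
# my_config.py — hypothetical campaign-specific corrections.
# Keys are the correct terms; values are lists of common mishearings.
DEFAULT_TEXT_REPLACEMENTS = {
    "Strahd von Zarovich": ["strod von zarovich", "straud van zarovich"],
    "Barovia": ["borovia", "bar ovia"],
    "Ireena": ["irina", "arena"],
}
```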
# Quick processing for single session
python main.py process oneshot_audio/ --output-dir "halloween_oneshot" --all-steps
# Generate story summary using AI_Prompts/dm_simple_story_summarizer.txt

# Process multiple old sessions
for session in session_*_audio/; do
python main.py process "$session" --output-dir "${session%_audio}" --all-steps
done

The system now includes automated AI-powered campaign document generation that transforms your session transcripts into comprehensive campaign materials.
The AI system analyzes your complete session transcript and generates comprehensive campaign documents in one step:
What it generates:
- NPC Documentation: Complete profiles with motivations, relationships, dialogue patterns, and roleplaying notes
- Location Documentation: Detailed location descriptions with history, secrets, and plot hooks
- Story Summaries: Narrative summaries and session recaps
- Character Tracking: Player character development and session highlights
# Complete workflow: audio → transcripts → AI campaign docs
python main.py process your_session_audio/ --output-dir session_01 --all-steps --generate-campaign
# Generate from existing transcript (recommended for testing)
python main.py generate session_01/Session_*_Final_COMPLETE.txt --output-dir campaign_docs --prompts NPC_template LOCATIONS_template

campaign_docs/
├── NPC_your-session.md # All NPCs with full profiles and quotes
└── LOCATIONS_your-session.md # All locations with detailed descriptions
- Cost: $0.01-0.10 per session (varies by transcript length and provider)
- Speed: 30-60 seconds per document
- Accuracy: Direct analysis provides 100% relevant content (no entity extraction errors)
- NPC_template: Comprehensive NPC analysis with physical descriptions, motivations, relationships, and roleplaying notes
- LOCATIONS_template: Detailed location documentation including events, history, secrets, and plot hooks
- dm_simple_story_summarizer: NY Times-style short stories (10+ pages, Stephen King/Neil Gaiman style)
- PC_tracker: Individual player character session analysis with relationships, quotes, and development tracking
- dm_encounter_template: Structured encounter documentation
- PC_metadata: Player character metadata and background information
# Copy example environment file
cp .env.example .env
# Add your API keys (choose one or more providers):
# ANTHROPIC_API_KEY=sk-ant-your-key-here # Claude (recommended for long transcripts)
# OPENAI_API_KEY=sk-your-openai-key-here # ChatGPT (fast and cost-effective)
# GOOGLE_API_KEY=your-google-key-here     # Gemini (alternative option)

Current Status: Step 1 (Direct Generation) is complete and working. Step 2 (Intelligent Merging) is on hold for testing and template refinement.
Test the system:
# Generate documents from your transcript
python main.py generate your_transcript.txt --output-dir test_campaign --prompts NPC_template
# Review the generated markdown files
ls test_campaign/
cat test_campaign/NPC_*.md

Template Enhancement: The AI prompts in AI_Prompts/ can be customized for your campaign style. Modify the templates to match your preferred output format.
Here's what to expect when testing the AI generation with a typical D&D session transcript:
Input: Session transcript (8 pages) containing:
- 3 NPCs: Tavern keeper "Grenda", Guard captain "Marcus", Mysterious stranger "Vex"
- 2 Locations: "The Prancing Pony" tavern, "Westgate Guard Tower"
- 1 Combat encounter with bandits
- Character dialogue and roleplay
Generated Output:
NPC_test-session.md (example excerpt):
---
prompt_type: NPC_template
session_name: test-session
generated_date: 2024-01-15T14:30:25
provider: anthropic
auto_generated: true
---
# NPCs from Test Session
## Grenda - The Prancing Pony Tavern Keeper
**Physical Description**: A stout halfling woman with graying brown hair tied back in a practical bun...
**Personality**: Warm but no-nonsense, protective of her establishment and regular customers...
**Key Quotes**:
- "You look like trouble, but your coin's good here."
- "Haven't seen Marcus this worried since the goblin raids."
**Relationships**:
- **Marcus**: Old friend, provides information about local threats
- **Party**: Cautiously helpful, appreciates their gold
## Marcus - Westgate Guard Captain
**Physical Description**: Human male, mid-40s, weathered face with a distinctive scar across his left cheek...

LOCATIONS_test-session.md (example excerpt):
---
prompt_type: LOCATIONS_template
session_name: test-session
generated_date: 2024-01-15T14:30:45
provider: anthropic
auto_generated: true
---
# Locations from Test Session
## The Prancing Pony - Tavern & Inn
**Description**: A two-story stone and timber building with a thatched roof...
**Atmosphere**: Warm and welcoming, with the smell of hearty stew and fresh bread...
**Important Events**:
- Party gathered information about bandit attacks
- Met mysterious stranger "Vex" who offered a job
- Marcus arrived with urgent news about missing patrols
**Secrets & Hooks**:
- Grenda knows more about the local bandits than she lets on
- Regular meeting place for information brokers

Performance: Generated in ~45 seconds, cost ~$0.05, 2 comprehensive documents created.
The AI prompt templates in AI_Prompts/ can be customized for better results with your specific campaign style:
1. Copy and modify existing templates:
# Make a backup first
cp AI_Prompts/NPC_template.txt AI_Prompts/NPC_template_custom.txt
# Edit with your preferred format
nano AI_Prompts/NPC_template_custom.txt

2. Template customization tips:
- Add campaign-specific context: Include your world's races, factions, or terminology
- Specify output format: Request bullet points, tables, or specific markdown structure
- Include example outputs: Show the AI exactly what format you want
- Add constraints: Specify word counts, detail levels, or focus areas
3. Example customizations:
For a sci-fi campaign:
Focus on: Technology levels, cybernetic implants, corporate affiliations
Include: Threat assessment, security clearance, known aliases
Format: Corporate dossier style with threat ratings
For a political intrigue campaign:
Emphasize: Social connections, secrets, leverage opportunities
Include: Family trees, political affiliations, blackmail material
Format: Intelligence briefing with relationship maps
4. Test your custom templates:
# Test with your custom template
python main.py generate your_transcript.txt --output-dir test --prompts NPC_template_custom
# Compare results
diff test/NPC_*.md original/NPC_*.md

Tips for better prompts:
- Shorter prompts = faster generation (and lower cost)
- Specific instructions = better results than generic requests
- Examples in prompts = consistent formatting across sessions
- Constraints help focus (e.g., "limit to 3 NPCs per location")
Claude (Anthropic): Excellent with longer, detailed prompts and context
ChatGPT (OpenAI): Best with structured, step-by-step instructions
Gemini (Google): Good with concise, focused prompts
Test the same template across providers to find what works best for your campaign style.
- Direct LLM Generation: Analyzes complete transcripts and generates accurate campaign documents
- Multi-Provider Support: Claude, OpenAI GPT, and Google Gemini integration via LiteLLM (sketched after this list)
- Template System: Customizable AI prompts for different document types
- Obsidian Integration: Generated markdown files work seamlessly with Obsidian vaults
- Cost-Effective: Typical cost of $0.01-0.10 per session
- High Accuracy: 100% success rate, no false entities or nonsense files
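Under the hood, LiteLLM lets one code path talk to all three providers. The sketch below shows the general shape, assuming API keys are already set in the environment; the function name, file handling, and default model string are illustrative assumptions, not the project's actual implementation:

```python
# Sketch: provider-agnostic document generation via LiteLLM.
# API keys are read from the environment (see the .env setup above).
from pathlib import Path
from litellm import completion

def generate_document(transcript_path: str, prompt_path: str,
                      model: str = "claude-3-5-sonnet-20240620") -> str:
    prompt = Path(prompt_path).read_text()
    transcript = Path(transcript_path).read_text()
    # Swap the model string (e.g. "gpt-4o" or "gemini/gemini-1.5-pro")
    # to switch providers without changing any other code.
    response = completion(
        model=model,
        messages=[{"role": "user",
                   "content": f"{prompt}\n\nTranscript:\n{transcript}"}],
    )
    return response.choices[0].message.content
```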
Intelligent Merging System - Currently paused for testing and template improvement:
- Document Merging: Combining new AI content with existing campaign documents
- Entity Resolution: Detecting duplicate NPCs/locations across sessions
- Content Preservation: Maintaining user modifications while adding new information
- Version Control: Tracking changes and providing rollback capabilities
- Template Optimization: Improving AI prompt quality for better outputs
- Cost Analysis: Comparing providers for optimal cost/quality ratios
- User Testing: Gathering feedback on generated document quality
- Format Refinement: Enhancing markdown structure and frontmatter
Short Term (Next Phase):
- Complete Step 2 implementation (intelligent merging)
- Advanced entity resolution with fuzzy matching
- Conflict resolution for overlapping content
- User preference system for merge strategies
Medium Term:
- Campaign timeline generation from multiple sessions
- Cross-session relationship tracking
- Advanced plot hook identification
- Integration with popular VTT platforms
Long Term:
- Real-time session analysis during gameplay
- Voice-to-campaign-docs pipeline automation
- Multi-campaign universe management
- AI-powered campaign planning assistance
Current System:
- Each session generates separate documents (no automatic merging yet)
- Manual template customization required for specialized campaigns
- API costs scale with transcript length
- Limited to text-based analysis (no audio/visual processing)
Technical Constraints:
- Requires Python 3.10+ for AI dependencies
- Internet connection required for LLM providers
- API key management needed for production use
Create config.json for your campaign:
{
"cleanup": {
"session_name": "curse_of_strahd",
"base_path": "/path/to/sessions"
},
"name_mappings": {
"discord_user123": "Player: Sarah (Character: Elara)",
"gamer_dude": "Player: Mike (Character: Thorin)"
}
}

Or configure via environment variables:
export TTRPG_SESSION_NAME="my_campaign"
export TTRPG_BASE_PATH="/path/to/sessions"
export TTRPG_WHISPER_MODEL="turbo"

Fine-tune individual cleanup steps in config.json:
{
"cleanup": {
"enable_remove_duplicates": true,
"enable_merge_segments": true,
"enable_remove_short": false,
"short_duplicate_text_length": 4,
"merge_threshold": 0.01
}
}

# Skip specific processing steps
python main.py cleanup --skip-duplicates --skip-merge
# Use different Whisper model
python main.py transcribe audio.flac --model large-v2 --no-fp16

# Verify installation
pytest tests/ -v
# Test the CLI interface
python main.py --help
python main.py process --help

# Test with your own data
python main.py process your_audio_files/ --output-dir test_output --all-steps
# Or test cleanup on existing transcripts
python main.py cleanup --base-path your_transcripts/ --session-name "test_session"

pip install -r requirements.txt
If you get an error about numba not supporting your Python version:
# Check which Python versions you have installed
python3 --version
python3.12 --version
python3.13 --version
# Error: "Cannot install on Python version 3.14.0"
# Solution: Use Python 3.12 or 3.13 instead
rm -rf venv
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Verify you're using the correct version
python --version # Should show 3.10-3.13

Note: Python 3.14 support is coming soon but numba (required by Whisper) doesn't support it yet. Use Python 3.12 for the best experience.
Error: [Errno 2] No such file or directory: 'ffmpeg'
# FFmpeg is required for audio transcription. Install it first:
brew install ffmpeg # macOS
# or: sudo apt-get install ffmpeg # Linux
# Verify installation
ffmpeg -version

Other transcription issues:
# Use CPU-optimized model
python main.py transcribe audio.flac --model base --no-fp16
# For low-quality audio
python main.py transcribe audio.flac --model large-v2

# Verify your setup
python main.py --help
# Check specific command options
python main.py cleanup --help

# Always use absolute paths for reliability
python main.py process /full/path/to/audio/ --output-dir /full/path/to/output/

# Verify your replacements file format
echo '{"CorrectName": ["mishear1", "mishear2"]}' > merge_replacements.json
# Check file location (should be in output directory)
python main.py replace --replacements /full/path/to/merge_replacements.json

| Command | Purpose | Example |
|---|---|---|
| `process` | Full automation pipeline | `python main.py process audio/ --output-dir session --all-steps` |
| `transcribe` | Audio to text conversion | `python main.py transcribe audio.flac --output-dir transcripts` |
| `cleanup` | Process transcript files | `python main.py cleanup --base-path transcripts` |
| `replace` | Apply text corrections | `python main.py replace --input transcript.txt` |
python main.py process INPUT_PATH --output-dir OUTPUT_DIR [options]
Options:
--all-steps Run transcribe → cleanup → replace
--transcribe-only Only convert audio to text
--cleanup-only Only process existing transcripts
--session-name NAME Campaign/session identifier
--session-part PART Episode/part identifier
--model MODEL Whisper model (tiny to turbo)
--no-fp16 Use CPU-only processing

python main.py transcribe INPUT_PATH --output-dir OUTPUT_DIR [options]
Options:
--model MODEL Whisper model: tiny, base, small, medium, large, large-v2, turbo
--no-fp16 Disable fp16 (required for CPU-only)
--language LANG Audio language (default: en)
--config-file FILE Custom Whisper configuration

python main.py cleanup --base-path PATH [options]
Options:
--session-name NAME Override session name
--part PART Override session part
--skip-duplicates Disable duplicate removal
--skip-merge Disable segment merging
--skip-short Disable short text removal
--skip-gibberish Disable silence/gibberish removal

- Audio: .flac, .wav, .mp3 (Craig Discord bot output recommended)
- Transcripts: .tsv files with start, end, text columns (from Whisper)
- Main Config: config.json with cleanup settings and name mappings
- Text Corrections: merge_replacements.json for fixing misheard terms:

{
  "Gandalf": ["gandolf", "gandulf", "gand off"],
  "PlayerName": ["playername", "player name"]
}
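As a sketch of how such a replacements file can be applied (the function below is illustrative, not the project's actual implementation):

```python
# Sketch: apply merge_replacements.json to a transcript.
import json
import re

def apply_replacements(text: str, replacements_path: str) -> str:
    with open(replacements_path) as f:
        replacements = json.load(f)  # {"CorrectName": ["mishear1", ...]}
    for correct, mishearings in replacements.items():
        for variant in mishearings:
            # Case-insensitive replacement of each misheard variant
            text = re.sub(re.escape(variant), correct, text,
                          flags=re.IGNORECASE)
    return text
```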
- Individual CSVs: *_processed.csv (cleaned per-speaker data)
- Combined CSV: *_merged.csv (chronological speaker data)
- Final Transcript: Session_*_Final_COMPLETE.txt (readable transcript)
- Split Parts: Session_*_part_N.txt (if the transcript is very long)
- Input: Audio files or TSV transcripts (one per speaker)
- Individual Processing: Remove duplicates, merge adjacent segments, clean short text (see the sketch after this list)
- Merge: Combine all speakers in chronological order
- Text Replacement: Apply name/term corrections from JSON file
- Output: Generate readable transcripts and organized data files
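For intuition, here is a minimal sketch of the per-speaker cleanup step, assuming Whisper's TSV columns (`start`, `end`, `text`). The thresholds mirror the `merge_threshold` and `short_duplicate_text_length` settings shown earlier, but the code itself is illustrative, not the project's actual implementation:

```python
# Sketch: per-speaker cleanup of Whisper segments (illustrative only).
import pandas as pd

def clean_segments(df: pd.DataFrame, merge_threshold: float = 0.01,
                   min_chars: int = 4) -> pd.DataFrame:
    df = df.sort_values("start").reset_index(drop=True)
    # 1. Drop exact consecutive duplicates (a common Whisper artifact)
    df = df[df["text"] != df["text"].shift()].reset_index(drop=True)
    # 2. Merge adjacent segments separated by less than merge_threshold seconds
    merged: list[dict] = []
    for row in df.to_dict("records"):
        if merged and row["start"] - merged[-1]["end"] <= merge_threshold:
            merged[-1]["end"] = row["end"]
            merged[-1]["text"] += " " + row["text"]
        else:
            merged.append(row)
    out = pd.DataFrame(merged)
    # 3. Drop very short fragments, which are usually noise
    return out[out["text"].str.len() >= min_chars].reset_index(drop=True)
```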
project/
├── main.py # Main CLI entry point
├── cli/ # Command implementations
├── transcribe/ # Audio transcription (Whisper)
├── transcript_cleanup/ # Text processing and cleanup
├── shared_utils/ # Common configuration and utilities
├── AI_Prompts/ # Campaign management templates
└── tests/ # Test suite
- Audio Processing: OpenAI Whisper for transcription
- Text Processing: pandas for data manipulation
- Configuration: JSON files + environment variables
- Logging: Structured, colored output for progress tracking (sketched below)
- Testing: pytest with comprehensive unit tests
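As a taste of the logging style, here is a small sketch of colored level output using only the standard library; the format string and colors are illustrative, not the project's actual setup:

```python
# Sketch: colored log levels with the standard logging module.
import logging

class ColorFormatter(logging.Formatter):
    COLORS = {"INFO": "\033[32m", "WARNING": "\033[33m", "ERROR": "\033[31m"}
    RESET = "\033[0m"

    def format(self, record: logging.LogRecord) -> str:
        color = self.COLORS.get(record.levelname, "")
        record.levelname = f"{color}{record.levelname}{self.RESET}"
        return super().format(record)

handler = logging.StreamHandler()
handler.setFormatter(ColorFormatter("%(asctime)s %(levelname)s %(message)s"))
logging.basicConfig(level=logging.INFO, handlers=[handler])
logging.info("Merged 6 speaker transcripts chronologically")
```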
For users who prefer the original scripts:
# Audio transcription
python transcribe/whisper_transcribe.py audio.flac --output-dir transcripts
# Transcript processing
python transcript_cleanup/transcript_cleanup_v2.py --base-path transcripts
# Text replacement
python transcript_cleanup/json_text_replace_v2.py --input transcript.txt

pytest tests/ -v
python tests/run_tests.py

- Phase 1: Shared utilities, professional logging, unified configuration
- Phase 2: CLI interface, pipeline automation, configurable processing
- Maintained with KISS principles: simple, maintainable improvements without over-engineering
- Docker containerization for consistent deployments
- Enhanced error recovery and resume capabilities
- Performance optimizations for large audio files
- Extended AI prompt templates and automation