A self-hosted ElevenLabs clone for text-to-speech, voice conversion, and AI audio generation with Docker, FastAPI, and Next.js. 🔊🎙️💡💻

BernieTv/ElevenLabs-Clone

🎙️ ElevenLabs Clone – Self-Hosted AI Audio Studio

🚀 Overview

This is a full-stack, self-hosted clone of ElevenLabs, an all-in-one AI audio generation playground. 🔥 Instead of relying on external APIs, we host our own models for:

  • 🔊 Text-to-Speech (TTS) with StyleTTS2
  • 🎭 Voice Conversion with Seed-VC
  • 🎵 Text-to-Audio with Make-An-Audio

All models are fine-tuned for custom voices, containerized via Docker 🐳, and exposed through FastAPI endpoints ⚡. The frontend is built with Next.js and the T3 Stack and provides a modern, responsive UI with voice selection, audio history, and full user management. Auth.js handles authentication, user credits are managed dynamically, and an Inngest queue keeps the AI infrastructure from being overwhelmed 🛡️.
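To illustrate why a queue sits in front of the model servers, here is a minimal sketch of bounded concurrency in Python. It is not Inngest's API and not this project's code: `run_limited` and the `limit` value are hypothetical names chosen for the example, and each job stands in for one call to a model endpoint.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_limited(jobs: List[Callable[[], Awaitable[str]]], limit: int = 2) -> List[str]:
    """Run all jobs, but never more than `limit` at the same time."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        # Each job waits here until one of the `limit` slots is free.
        async with sem:
            return await job()

    # gather() returns results in the same order as the submitted jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


# Usage (hypothetical job that would call a model endpoint):
#
#   async def generate() -> str:
#       await asyncio.sleep(1)  # stands in for a slow inference request
#       return "audio.wav"
#
#   asyncio.run(run_limited([generate] * 10, limit=2))
```

A real queue also adds persistence and retries, which this in-process sketch does not attempt.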


✨ Features at a Glance

  • 🔊 StyleTTS2 for lifelike text-to-speech
  • 🎭 Seed-VC for voice conversion
  • 🎵 Make-An-Audio for creative audio generation
  • 🧠 Fine-tuning for unique voice identities
  • 🐳 Dockerized AI stack for easy deployment
  • ⚙️ FastAPI backend with scalable endpoints
  • 🪙 User credit system
  • 🌀 Inngest queue to manage workload
  • ☁️ AWS S3 integration for audio file storage
  • 👥 Multiple pre-trained voice models
  • 🖥️ Fully responsive UI with Next.js + Tailwind CSS
  • 🔐 Secure authentication with Auth.js
  • 🎛️ Voice picker component
  • Audio history tracking

🧠 Models Used

Purpose Model Name
Voice-to-Voice seed-vc
Fine-tuned TTS StyleTTS2FineTune
Text-to-Speech StyleTTS2
Text-to-SFX / Audio Make-an-audio

🛠️ Setup Instructions

1️⃣ Clone the Repository

git clone https://github.com/BernieTv/ElevenLabs-Clone.git

2️⃣ Navigate to Project Directory

cd ElevenLabs-Clone

3️⃣ Install Python 🐍

Ensure Python 3.10 is installed. If it is not, download it from the official Python website.

Note: Create a separate virtual environment in each model folder. The elevenlabs-clone-frontend folder does not need one.


📦 Install Dependencies

➤ Frontend (Next.js)

cd elevenlabs-clone-frontend
npm install

➀ AI Model Folders (Repeat for each)

cd seed-vc  # example
pip install -r requirements.txt
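
If you would rather script the per-folder setup, the sketch below builds the two commands each model folder needs: create a virtual environment, then install that folder's requirements with the environment's own interpreter. The `.venv` directory name and the helper names are assumptions for illustration. Only `seed-vc` is named in this README, so add the other model folders to the list yourself.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Only seed-vc is named in this README; append the other model folders here.
MODEL_FOLDERS = ["seed-vc"]


def venv_python(folder: Path) -> Path:
    """Interpreter path inside <folder>/.venv (the layout differs on Windows)."""
    sub = Path("Scripts") / "python.exe" if sys.platform == "win32" else Path("bin") / "python"
    return folder / ".venv" / sub


def setup_commands(folder: Path) -> List[List[str]]:
    """Commands that create the venv and install the folder's requirements."""
    return [
        [sys.executable, "-m", "venv", str(folder / ".venv")],
        [str(venv_python(folder)), "-m", "pip", "install", "-r",
         str(folder / "requirements.txt")],
    ]


def setup_all(root: Path, folders: List[str] = MODEL_FOLDERS) -> None:
    """Run the setup commands for every model folder under root."""
    for name in folders:
        for cmd in setup_commands(root / name):
            subprocess.run(cmd, check=True)  # stop at the first failure


# Usage, from the repository root:
#   setup_all(Path.cwd())
```

Calling pip through the environment's interpreter avoids having to activate each environment first.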

🔐 AWS IAM Setup

You'll need two IAM entities to handle S3 and EC2 integration:

1️⃣ User: styletts2-api

Purpose: Upload & fetch audio files from S3

Custom Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::elevenlabs-clone",
        "arn:aws:s3:::elevenlabs-clone/*"
      ]
    }
  ]
}
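
The policy above lists two resources because `s3:ListBucket` applies to the bucket itself, while `s3:PutObject` and `s3:GetObject` apply to the objects under it. If your bucket has a different name, a small helper can generate the document for you. The sketch below is a variant, not the policy above verbatim: it splits the statement in two so each action is paired only with the resource it applies to. `s3_policy` is a hypothetical helper name.

```python
import json


def s3_policy(bucket: str) -> dict:
    """Build a bucket-scoped policy with list and object access split apart."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Upload and download are object-level actions, so they target <bucket>/*.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# Print the JSON to paste into the IAM console.
print(json.dumps(s3_policy("elevenlabs-clone"), indent=2))
```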

2️⃣ Role: elevenlabs-clone-ec2

Purpose: EC2 access to S3 + ECR

Attach Permissions:

  • AmazonEC2ContainerRegistryFullAccess
  • AmazonS3FullAccess

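Custom Policy (AmazonS3FullAccess already covers these actions, so this policy only matters if you remove that managed policy to follow least privilege):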

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::elevenlabs-clone",
        "arn:aws:s3:::elevenlabs-clone/*"
      ]
    }
  ]
}
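
With these permissions in place, a model service could upload generated audio and hand back a time-limited download link. The sketch below is one way to do that, not this project's actual code: `new_audio_key` and `upload_and_sign` are hypothetical helpers, and the `audio/` key prefix and one-hour expiry are assumptions. The `s3` argument is expected to be a boto3 S3 client, whose `upload_file` and `generate_presigned_url` methods are used here.

```python
import uuid
from typing import Any

BUCKET = "elevenlabs-clone"  # bucket name taken from the policies in this README


def new_audio_key(prefix: str = "audio", ext: str = "wav") -> str:
    """Generate a unique object key, e.g. audio/<uuid>.wav (layout is an assumption)."""
    return f"{prefix}/{uuid.uuid4()}.{ext}"


def upload_and_sign(s3: Any, local_path: str, key: str,
                    bucket: str = BUCKET, expires: int = 3600) -> str:
    """Upload a local file, then return a presigned GET URL valid for `expires` seconds."""
    s3.upload_file(local_path, bucket, key)  # needs s3:PutObject
    return s3.generate_presigned_url(        # URL is signed with the caller's credentials
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )


# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   url = upload_and_sign(s3, "output.wav", new_audio_key())
```

On an EC2 instance with the elevenlabs-clone-ec2 role attached, `boto3.client("s3")` picks up the role's credentials automatically, so no access keys need to be stored on the machine.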
