Please support the original RVC. Without it, this inference project would not be possible.
- Support V1 & V2 Model ✅
- Youtube Audio Downloader ✅
- Demucs (Voice Splitter) [Internet required for downloading model] ✅
- TTS Support ✅
- Microphone Support ✅
- HuggingFace Spaces Inference [for CPU Tier only] ✅
  - Remove Youtube & Input Path ✅
  - Remove Crepe Support due to GPU requirement ✅
Install ffmpeg before running these commands.
- Windows
  Run start.bat to download the model and dependencies.
  Run run.bat to run the inference.
- MacOS & Linux
  On MacOS, install wget before running the script.
  Run start.sh to download the model and dependencies.
  Run run.sh to run the inference.
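The prerequisites mentioned above (ffmpeg on every OS, wget on MacOS) can be checked before launching the start script. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and it only assumes that both tools are expected to be reachable through `PATH`.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # ffmpeg is required on every OS; wget is only called out for MacOS above.
    for name in missing_tools(["ffmpeg", "wget"]):
        print(f"{name} not found on PATH; install it before running the start script")
```

If the script prints nothing, both tools were found. On Windows and Linux a missing wget may be harmless, since the text above only asks for it on MacOS.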
- Install Python 3.10 (Python 3.11 or higher cannot be used because the fairseq dependency is unmaintained)
- Install PyTorch
  - CPU only (any OS)

        pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cpu

  - Nvidia (CUDA used)

        # For Windows (flashv2 is not supported on Windows, issue: https://github.com/Dao-AILab/flash-attention/issues/345#issuecomment-1747473481)
        pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118
        # Other (Linux, etc.)
        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

- Install ffmpeg
- Install Dependencies

      pip install -r requirements.txt

- Download Pre-model

      # Hubert Model
      https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/hubert_base.pt
      # Save it to /assets/hubert/hubert_base.pt
      # RMVPE (rmvpe pitch extraction, Optional)
      https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt
      # Save it to /assets/rvmpe/rmvpe.pt

- Run WebUI

      python app.py
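
Before launching the WebUI, the manual setup above can be sanity-checked with a short script. This is a sketch under assumptions, not something shipped with the repository: the function names are hypothetical, the save paths are copied as written in the steps above, and they are assumed to be relative to the repository root.

```python
import sys
from pathlib import Path

# Paths copied from the download step above; assumed relative to the repo root.
REQUIRED = {"hubert": Path("assets/hubert/hubert_base.pt")}
OPTIONAL = {"rmvpe": Path("assets/rvmpe/rmvpe.pt")}


def python_ok(version_info=sys.version_info):
    """True only for Python 3.10.x, the version the install steps call for."""
    return tuple(version_info[:2]) == (3, 10)


def missing_models(root, models):
    """Return the names of models whose files do not exist under `root`."""
    root = Path(root)
    return [name for name, rel in models.items() if not (root / rel).is_file()]


def preflight(root="."):
    """Collect human-readable problems that would block the WebUI."""
    problems = []
    if not python_ok():
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.10 is required, found {found}")
    for name in missing_models(root, REQUIRED):
        problems.append(f"required model missing: {REQUIRED[name]}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("problem:", problem)
    for name in missing_models(".", OPTIONAL):
        print("note: optional model not found:", OPTIONAL[name])
```

Run it from the repository root. No output means the Python version matches and both model files are present. A missing RMVPE file is reported only as a note because that download is marked optional above.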