LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system supporting 10 languages:
- Chinese
- English
- Spanish
- Russian
- French
- German
- Italian
- Portuguese
- Indonesian
- Vietnamese
```bash
git clone https://github.com/LEMAS-Project/LEMAS-TTS.git
cd ./LEMAS-TTS

# create a dedicated environment
conda create -n lemas-tts python=3.10
conda activate lemas-tts
```

You can install the system dependencies via `apt` or Anaconda:

```bash
sudo apt-get update
sudo apt-get install -y ffmpeg
```

or

```bash
conda install -c conda-forge ffmpeg
```

Then install the Python requirements:

```bash
pip install -r requirements.txt
# or, if you package it locally:
# pip install -e .
```

Install PyTorch and Torchaudio according to your device (CUDA / ROCm / CPU / MPS), following the official PyTorch instructions.
Download the pretrained models from https://huggingface.co/LEMAS-Project/LEMAS-TTS, then place the `pretrained_models/` folder at the repository root, next to the `lemas_tts/` package; the code locates the repo root by looking for this folder.
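The root-lookup behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code, and the helper name is hypothetical:

```python
from pathlib import Path

def find_repo_root(start: Path) -> Path:
    """Walk upward from `start` until a directory containing
    `pretrained_models/` is found (hypothetical sketch of the
    root-lookup behavior described above)."""
    for candidate in [start, *start.parents]:
        if (candidate / "pretrained_models").is_dir():
            return candidate
    raise FileNotFoundError(f"pretrained_models/ not found above {start}")
```

If the folder is misplaced, lookups like this fail, so keep `pretrained_models/` directly under the repo root.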
All commands below assume:
```bash
cd ./LEMAS-TTS
export PYTHONPATH="$PWD:${PYTHONPATH}"
```

You can try the model via our Hugging Face Space: https://huggingface.co/spaces/LEMAS-Project/LEMAS-TTS
Locally, you can run the Gradio web app with:
```bash
python lemas_tts/scripts/inference_gradio.py
```

You can customize the host, port, and sharing:

```bash
python lemas_tts/scripts/inference_gradio.py --host 0.0.0.0 --port 7860 --share
```

For simple TTS (text only, without reference audio), use:

- Python entry: `lemas_tts.scripts.tts_multilingual`
- Shell helper: `lemas_tts/scripts/tts_multilingual.sh`
Example:
```bash
cd ./LEMAS-TTS
bash lemas_tts/scripts/tts_multilingual.sh
```

The shell script demonstrates how to:

- Select `multilingual_grl` or `multilingual_prosody`
- Point to `pretrained_models/ckpts/...` and `pretrained_models/data/...`
- Choose the frontend type (currently only `phone` is supported)
- Configure sampling parameters: NFE steps, CFG strength, Sway sampling, speed, etc.
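A direct module invocation with those sampling parameters might look like the sketch below. The flag names here are assumptions (modeled on common diffusion/flow-matching TTS CLIs); check `tts_multilingual.sh` for the actual interface:

```shell
# Hypothetical invocation; flag names are assumptions, not verified
# against the script -- consult the bash helper for the real ones.
python -m lemas_tts.scripts.tts_multilingual \
    --model multilingual_prosody \
    --nfe_step 32 \
    --cfg_strength 2.0 \
    --sway_sampling_coef -1.0 \
    --speed 1.0
```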
Alternatively, you can call the Python module directly, following the examples in the bash scripts.
You can enable UVR5 denoising of the reference audio via `--denoise`.
For editing a region of an utterance given word‑level alignment JSONs, use:
- Python entry: `lemas_tts.scripts.speech_edit_multilingual`
- Shell helper: `lemas_tts/scripts/speech_edit_multilingual.sh`

The Python script expects:

- `--wav_dir`: directory with input `*.wav` files
- `--align_dir`: directory with Azure‑style alignment JSONs
- `--save_dir`: directory for edited outputs
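The exact alignment schema is not shown here. Assuming an Azure-style layout with a top-level `"Words"` list whose entries carry `"Offset"` and `"Duration"` in 100-ns ticks (an assumption for illustration, not the script's verified input format), extracting the time span of a word region to edit could look like:

```python
import json

def region_span(align_json: str, start_word: int, end_word: int):
    """Return the (start, end) time span in seconds covered by words
    start_word..end_word (inclusive). Assumes an Azure-style layout:
    a top-level "Words" list whose entries carry "Offset" and
    "Duration" in 100-ns ticks -- an assumption for illustration."""
    with open(align_json) as f:
        words = json.load(f)["Words"]
    ticks_per_second = 10_000_000  # Azure timestamps use 100-ns ticks
    start = words[start_word]["Offset"] / ticks_per_second
    end = (words[end_word]["Offset"] + words[end_word]["Duration"]) / ticks_per_second
    return start, end
```

Each `*.wav` in `--wav_dir` is expected to have a matching alignment JSON in `--align_dir`; outputs go to `--save_dir`.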
Example:
```bash
cd ./LEMAS-TTS
bash lemas_tts/scripts/speech_edit_multilingual.sh
```

The script supports both prosody‑enabled and non‑prosody variants; see the inline comments in `speech_edit_multilingual.sh` for a prosody example.
This project builds heavily on the following open‑source works:
- F5‑TTS – core model architecture and many components of the inference pipeline.
- UVR5 – music source separation / vocal denoising, used here as an optional pre‑processing step.
If you use LEMAS‑TTS in your work, please also consider citing and acknowledging these upstream projects.
```bibtex
@article{zhao2026lemas,
  title={LEMAS: A 150K-Hour Large-scale Extensible Multilingual Audio Suite with Generative Speech Models},
  author={Zhao, Zhiyuan and Lin, Lijian and Zhu, Ye and Xie, Kai and Liu, Yunfei and Li, Yu},
  journal={arXiv preprint arXiv:2601.04233},
  year={2026}
}
```
This repository is released under the CC‑BY‑NC‑4.0 license.
See https://creativecommons.org/licenses/by-nc/4.0/ for more details.