The voice format is `<TTS System>:<Voice Name>[#<Speaker ID>]`. The Speaker ID is optional and applies only to multi-speaker models.
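The voice format above can be split into its parts with a few lines of Python. This is a minimal sketch, not code from this project: `VoiceSpec` and `parse_voice` are hypothetical names chosen for illustration, and the validation rules are assumptions.

```python
from typing import NamedTuple, Optional


class VoiceSpec(NamedTuple):
    """Hypothetical container for a parsed voice string (illustration only)."""
    tts_system: str
    voice_name: str
    speaker_id: Optional[str]  # None unless a '#<Speaker ID>' suffix is present


def parse_voice(spec: str) -> VoiceSpec:
    """Split '<TTS System>:<Voice Name>[#<Speaker ID>]' into its parts."""
    tts_system, colon, rest = spec.partition(":")
    if not colon or not tts_system:
        raise ValueError(f"missing '<TTS System>:' prefix in {spec!r}")
    voice_name, _, speaker_id = rest.partition("#")
    if not voice_name:
        raise ValueError(f"missing voice name in {spec!r}")
    # An absent or empty speaker ID is normalised to None.
    return VoiceSpec(tts_system, voice_name, speaker_id or None)
```

With this sketch, `parse_voice("espeak:en-us")` has no speaker ID, while `parse_voice("coqui-tts:en_vctk#p228")` carries the speaker ID `p228`.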
The following TTS engines are used:

- Coqui-TTS: patched and embedded version of the latest Coqui-TTS dev version (0.7.0)
  - TTS System name: `tts`
  - Voice quality: Good
  - Performance: Poor; you need a powerful CPU and enough memory
  - Resource overhead: High
  - Built-in voice models:
    - `zh_baker`: Chinese voice from Baker [F]
    - `en_vctk`: English multi-speaker voice [MF]
- ESpeaker
  - TTS System name: `espeak`
  - Voice quality: Bad, robotic-sounding
  - Performance: Very good
  - Resource overhead: Low
  - Built-in voice models:
    - `en-029`: English_(Caribbean) [M]
    - `en-gb`: English_(Great_Britain) [M]
    - `en-gb-scotland`: English_(Scotland) [M]
    - `en-gb-x-gbclan`: English_(Lancaster) [M]
    - `en-gb-x-gbcwmd`: English_(West_Midlands) [M]
    - `en-gb-x-rp`: English_(Received_Pronunciation) [M]
    - `en-us`: English_(America) [M]
    - `zh-cmn`: Chinese_(Mandarin) [M]
    - `zh-yue`: Chinese_(Cantonese) [M]

If your input text begins with a left angle bracket (<) character, it will be interpreted as SSML.
A subset of SSML is supported:
- `<speak>` - wrap around SSML text
  - `lang` - set language for document
- `<s>` - sentence (disables automatic sentence breaking)
  - `lang` - set language for sentence
- `<w>` / `<token>` - word (disables automatic tokenization)
- `<voice name="...">` - set voice of inner text
  - `voice` - name or language of voice
    - Name format is `tts:voice` (e.g., "glow-speak:en-us_mary_ann") or `tts:voice#speaker_id` (e.g., "coqui-tts:en_vctk#p228")
    - If it is one of the supported languages, a preferred voice is used (override with `--preferred-voice <lang> <voice>`)
- `<say-as interpret-as="">` - force interpretation of inner text
  - `interpret-as` - one of "spell-out", "date", "number", "time", or "currency"
  - `format` - way to format text depending on `interpret-as`
    - number - one of "cardinal", "ordinal", "digits", "year"
    - date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
- `<break time="">` - pause for the given amount of time
  - time - seconds ("123s") or milliseconds ("123ms")
- `<sub alias="">` - substitute `alias` for inner text
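A document using these tags can be assembled programmatically rather than by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper names `build_ssml` and `looks_like_ssml` are hypothetical and not part of this project. The check only mirrors the stated rule that input starting with `<` is treated as SSML, and how leading whitespace is handled is not specified here.

```python
import xml.etree.ElementTree as ET


def build_ssml(sentences, pause_ms=0):
    """Build a <speak> document from (lang, text) pairs (hypothetical helper).

    Each pair becomes one <s lang="..."> element; if pause_ms > 0, a
    <break time="...ms"> element is placed between consecutive sentences.
    """
    speak = ET.Element("speak")
    for index, (lang, text) in enumerate(sentences):
        if index > 0 and pause_ms > 0:
            ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
        sentence = ET.SubElement(speak, "s", {"lang": lang})
        sentence.text = text  # ElementTree escapes &, < and > on output
    return ET.tostring(speak, encoding="unicode")


def looks_like_ssml(text):
    """Assumed check: input whose first character is '<' is treated as SSML."""
    return text.startswith("<")
```

Calling `build_ssml([("zh", "欢迎"), ("en-us", "Welcome.")], pause_ms=500)` returns a single string starting with `<speak>`, which can then be sent as the input text.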
For example:

    <speak>
      <s lang="zh">欢迎使用离线语音合成</s>
      <s lang="en-us">Welcome to Offline Speech Synthesis.</s>
    </speak>

- Coqui-TTS
- ESpeaker
- Mainly inspired by OpenTTS.
- Many thanks: without OpenTTS there would be no Offline TTS.
- Upgrade Coqui-TTS from 0.3.1 to the latest version, 0.7.0dev
- fix: Check whether optional dependencies are installed before loading the ZH/JA phonemizer
- Remove matplotlib (it is only useful during the training analysis phase)
- Optimize Coqui-TTS model size
- Optimize Coqui-TTS models for embedded devices
- Espeak Chinese locale is missing
- Show used languages only
- Cannot use SSML on HA
- Cannot modify options on HA, because `/data/options.json` cannot be read by a regular user
- Add a preferred voice option per language