A web interface for Microsoft Sam & friends, written in C and D (vibe-d) and runnable on headless Linux, plus an Android version with an iOS-style GUI implemented in Kotlin.
- Install Microsoft Speech SDK 4.0 (`SAPI4SDK.exe`)
- Run `build.bat`. If you have Visual Studio older than 2022, change the vcvars32 path
- Install `ldc2` and `dub`
- Go to `SAPI4_web` and compile the web server: `dub --compiler=ldc2 --arch=x86 --build=release`
- Install `wine` (`sudo apt install wine`); version 1.8.7 is fine. If wine doesn't work on your system, you must stop here
- In a VNC/RDP session or X11-forwarded SSH connection:
  - Install Microsoft Speech 4.0 API in the wine environment: `wine spchapi.exe`
  - Install Lernout & Hauspie TruVoice Amer. Eng. TTS Engine in the wine environment: `wine tv_enua.exe`
- Move the following to a new empty folder:
  - `public` (static web assets)
  - `sapi4.exe` (web server)
  - `sapi4.dll` (SAPI4 voice audio generation library)
  - `sapi4limits.exe` (SAPI4 voice enumerator)
  - `sapi4out.exe` (SAPI4 voice audio generation program)
- Install `xvfb` (`apt install xvfb`)
- Run the web server: `while true; do xvfb-run -a wine sapi4.exe; sleep 1; done`
- Pass the web server through nginx by adding this to the nginx config: `location ^~ /SAPI4/ { proxy_pass http://127.0.0.1:23451/; }`. Note that the web server works only under the `/SAPI4/` location; to change that, update the references to scripts and other assets in `SAPI4_web/views/layout.dt` and `SAPI4_web/public/scripts/tts.js`
- Go to `http(s)://localhost/SAPI4/`, enter `soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi soi` as the text, set the speed to 450 and enjoy.
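
Before starting the web server, it can help to confirm that everything from the "Move" step actually landed in the new folder. The following is a minimal Python sketch of such a check, not part of the project: the helper name `missing_items` is hypothetical, and it only tests that each listed item exists, not that it is the right build.

```python
from pathlib import Path

# Items the deployment steps say to move into the new folder.
REQUIRED_ITEMS = [
    "public",           # static web assets
    "sapi4.exe",        # web server
    "sapi4.dll",        # SAPI4 voice audio generation library
    "sapi4limits.exe",  # SAPI4 voice enumerator
    "sapi4out.exe",     # SAPI4 voice audio generation program
]


def missing_items(folder):
    """Return the required items that are not present in `folder`.

    Hypothetical helper: it checks existence only.
    """
    root = Path(folder)
    return [name for name in REQUIRED_ITEMS if not (root / name).exists()]
```

Calling `missing_items("/path/to/folder")` returns an empty list when all five items are in place, and otherwise the names that still need to be moved.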
You might be familiar with Speakonia. Since CFS-Technologies has released an unlimited license (http://www.cfs-technologies.com/home/) for Speakonia, you can also use it to get .wav files of text spoken by Microsoft Sam and other voices. However, the web interface is more convenient and generates audio much faster: Speakonia generates audio at real-time speaking speed, while the SAPI4 server is set to generate it at x16777215 of real-time speaking speed. You can download .wav files from the web interface too (right-click the player and choose "Save audio as...", at least in Chrome).
You can also generate audio through an API. The endpoints are `/SAPI4/VoiceLimitations?voice=(voice)` and `/SAPI4/SAPI4?text=(text)[&voice=(voice)][&pitch=(pitch)][&speed=(speed)]`, where `()` marks required parameters and `[]` marks optional ones.
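
As a rough illustration of how these endpoints could be called from a script, here is a minimal Python sketch that only builds the request URLs. The base URL `http://localhost/SAPI4`, the function names, and the example voice name are assumptions for illustration; the response formats are not described here, so the sketch stops short of parsing anything.

```python
from urllib.parse import quote, urlencode

# Assumption: the server is reachable through nginx at this base URL.
BASE_URL = "http://localhost/SAPI4"


def voice_limitations_url(voice, base_url=BASE_URL):
    """Build a VoiceLimitations URL; `voice` is required."""
    query = urlencode({"voice": voice}, quote_via=quote)
    return f"{base_url}/VoiceLimitations?{query}"


def speech_url(text, voice=None, pitch=None, speed=None, base_url=BASE_URL):
    """Build a SAPI4 URL; `text` is required, the rest are optional."""
    params = {"text": text}
    # Only include the optional parameters that were actually given.
    for name, value in (("voice", voice), ("pitch", pitch), ("speed", speed)):
        if value is not None:
            params[name] = value
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{base_url}/SAPI4?{urlencode(params, quote_via=quote)}"
```

For example, `speech_url("soi soi", voice="Sam", speed=450)` produces `http://localhost/SAPI4/SAPI4?text=soi%20soi&voice=Sam&speed=450`, which can then be fetched with any HTTP client. Omitted optional parameters are left out of the query entirely, so the server's own defaults apply.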
A modern Android implementation with an iOS-style GUI, created in Kotlin.
- Text-to-speech functionality using Android's built-in TTS engine
- iOS-style UI with rounded corners, appropriate colors, and tab-based navigation
- Voice selection from available system voices
- Adjustable pitch and speed controls
- History of previously spoken text
- Clean and modern interface
- Open Android Studio
- Select "Open an existing project" and navigate to the `android_app` directory
- Android Studio will automatically sync the project with Gradle
- Make sure your SDK and build tools are up to date
- Connect an Android device or start an emulator
- Click "Run" to build and install the app
- Enter text in the text area
- Select a voice from the dropdown
- Adjust pitch and speed using the sliders
- Click "Speak" to hear the text
- Switch to the History tab to see previous entries
- Click on history items to load them back into the form
The Android version provides:
- Cross-platform compatibility (Android)
- Modern UI with iOS-style design
- Better user experience with tab navigation
- Persistent history storage
- More intuitive controls
- Improved accessibility