
๐ŸŽ™๏ธ Flare-TTS v1.5 28M

Welcome to Flare-TTS v1.5 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech.
This is an improved version of Flare-TTS 28M (v1), which now uses a vocoder to remove the robotic sound of the original!

Quality and results

The audio quality has improved substantially: the model no longer sounds robotic, and you can clearly understand what it says.
Example: (audio sample embedded in the original model card)

Training process

We trained the vocoder for 72 epochs on a single A6000 GPU in roughly 10 hours. This model builds on the first version, Flare-TTS 28M, with a vocoder added on top; the full vocoder training code is in this repo as prepare.sh and train_vocoder.py.
The full pretraining code is here: https://huggingface.co/LH-Tech-AI/Flare-TTS-28M/tree/main

Architecture

This model was trained with Coqui TTS. For the architecture we chose GlowTTS.
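As a rough illustration (not the authors' actual training script), a GlowTTS run in Coqui TTS is driven by a GlowTTSConfig object. The run name and batch size below are hypothetical, and exact field names can vary between Coqui versions:

```python
# Sketch of a Coqui TTS GlowTTS config; values are illustrative assumptions.
try:
    from TTS.tts.configs.glow_tts_config import GlowTTSConfig
except ImportError:  # Coqui TTS (`pip install TTS`) not installed
    GlowTTSConfig = None

if GlowTTSConfig is not None:
    config = GlowTTSConfig(
        run_name="flare-tts-28m",  # hypothetical run name
        batch_size=32,             # hypothetical; not stated in this card
        epochs=1000,               # hypothetical
    )
    print(config.model)
```

Such a config is then handed to Coqui's Trainer together with a dataset config pointing at LJSpeech.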

Training dataset

We trained on the full LJSpeech dataset. Thanks to keithito for this :-)
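LJSpeech ships its transcripts in metadata.csv, a pipe-delimited file with three fields per row: clip id, raw transcript, and normalized transcript. A small stdlib sketch of reading that format (the two rows below are mock stand-ins for the real file):

```python
# Parse LJSpeech-style metadata: "id|raw transcript|normalized transcript".
# The two rows below are mock stand-ins for LJSpeech's real metadata.csv.
mock_metadata = (
    "LJ001-0001|Printing, in the only sense|printing, in the only sense\n"
    "LJ001-0002|in being comparatively modern.|in being comparatively modern.\n"
)

samples = []
for line in mock_metadata.strip().splitlines():
    clip_id, raw, normalized = line.split("|")
    samples.append({"id": clip_id, "text": normalized})

print(len(samples))        # 2
print(samples[0]["id"])    # LJ001-0001
```

Each clip id maps to a wav file of the same name under the dataset's wavs/ directory.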

How to use

Once you have the model checkpoint (model.pth), config.json, and the vocoder files (vocoder_15000_checkpoint.pth and vocoder_config.json) on your device, you can generate a sample using:

tts --text "Hello, world! This is the second version of Flare-TTS - now with a vocoder. The robot sounds are finally gone!" \
    --model_path ./model.pth \
    --config_path ./config.json \
    --vocoder_path ./vocoder_15000_checkpoint.pth \
    --vocoder_config_path ./vocoder_config.json \
    --out_path output_1.wav
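The same synthesis can also be scripted through the Coqui TTS Python API instead of the CLI. This is a sketch under the assumption that the TTS package (pip install TTS) and the checkpoint files above are present; the guards skip synthesis when they are not:

```python
from pathlib import Path

# Sketch: Coqui TTS Python API equivalent of the `tts` CLI call above.
# Assumes `pip install TTS` and the checkpoint/config files from this repo.
try:
    from TTS.api import TTS
except ImportError:
    TTS = None  # Coqui TTS not installed; skip the synthesis below

if TTS is not None and Path("model.pth").exists():
    tts = TTS(
        model_path="./model.pth",
        config_path="./config.json",
        vocoder_path="./vocoder_15000_checkpoint.pth",
        vocoder_config_path="./vocoder_config.json",
    )
    tts.tts_to_file(
        text="Hello, world! This is Flare-TTS with a vocoder.",
        file_path="output_1.wav",
    )
```

This is convenient when generating many samples in a loop, since the model is loaded once and reused.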

Final thoughts

This model delivers much better audio quality than the first version of Flare-TTS 28M.
Stay tuned for a third version with more features! :D
