KinglittleQ/GST-Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

/ 100

Emerging

This project helps creators and developers generate natural-sounding speech from Chinese text, giving them control over the style and emotion of the spoken output. You input Chinese text and it synthesizes high-quality audio that can express different 'styles' (like happy, sad, or formal) even if those styles weren't explicitly labeled in the training data. This is useful for anyone creating audio content, such as voiceovers for videos, audiobooks, or interactive voice assistants.
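
The idea of styles that are learned without labels can be illustrated with a small sketch. In the Style Tokens approach, a bank of learned token vectors is combined through attention weights into a single style embedding that conditions the synthesizer. The code below is a simplified, single-head, pure-Python illustration under stated assumptions, not this repository's implementation: the function names, the toy token values, and the use of plain scaled dot-product attention are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def style_embedding(reference, tokens):
    """Attend from a reference embedding over a bank of style tokens.

    Assumption: single-head scaled dot-product attention. The original
    method uses a learned multi-head attention module instead.
    """
    scale = math.sqrt(len(reference))
    scores = [
        sum(r * t for r, t in zip(reference, token)) / scale
        for token in tokens
    ]
    weights = softmax(scores)
    dim = len(tokens[0])
    # The style embedding is the attention-weighted sum of the tokens.
    embedding = [
        sum(w * token[i] for w, token in zip(weights, tokens))
        for i in range(dim)
    ]
    return weights, embedding


# Hypothetical toy values: three 2-dimensional "style tokens" and a
# reference embedding that points mostly toward the first token.
tokens = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
reference = [2.0, 0.0]
weights, embedding = style_embedding(reference, tokens)
```

In this sketch the weights always sum to one, and the token most similar to the reference receives the largest weight. Choosing the weights by hand instead of deriving them from a reference is one way such a model could expose style control at synthesis time.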

374 stars. No commits in the last 6 months.

Use this if you need to convert Chinese text into speech with nuanced control over the vocal style, without needing to manually label specific emotions or speaking styles.

Not ideal if you primarily work with languages other than Chinese, or if you need a pre-built, production-ready speech synthesis service without any development or training overhead.

speech-synthesis voice-generation content-creation text-to-speech audio-production

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25

Stars

374

Forks

Language

Python

License

MIT

Higher-rated alternatives

bshall/Tacotron

A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

Kyubyong/dc_tts

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

DemisEom/SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Rayhane-mamah/Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
