Kyubyong/specAugment
Tensor2tensor experiment with SpecAugment
This project helps speech recognition engineers improve automatic speech recognition (ASR) models by augmenting their training data. It takes raw speech audio and applies masking along the frequency and time axes of the resulting spectrograms. Training on the augmented data is intended to yield a more robust ASR model that generalizes better to real-world audio.
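The frequency and time masking mentioned above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the repository's own code: the function name `mask_spectrogram`, its parameters, and the default mask widths are all hypothetical, the spectrogram is assumed to be a list of time frames (each a list of frequency bins), and masked cells are simply set to zero. The time-warping step of the original SpecAugment method is left out.

```python
import random


def mask_spectrogram(spec, max_freq_width=8, max_time_width=20,
                     num_freq_masks=1, num_time_masks=1, seed=None):
    """Return a masked copy of `spec` (list of time frames x frequency bins).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    n_time = len(spec)
    n_freq = len(spec[0]) if n_time else 0
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency masking: zero a random band of consecutive frequency bins
    # across every time frame.
    for _ in range(num_freq_masks):
        width = rng.randint(0, min(max_freq_width, n_freq))
        start = rng.randint(0, n_freq - width)
        for row in out:
            row[start:start + width] = [0.0] * width

    # Time masking: zero a random run of consecutive time frames.
    for _ in range(num_time_masks):
        width = rng.randint(0, min(max_time_width, n_time))
        start = rng.randint(0, n_time - width)
        for t in range(start, start + width):
            out[t] = [0.0] * n_freq

    return out


# Example: a dummy 100-frame, 40-bin spectrogram filled with ones.
dummy = [[1.0] * 40 for _ in range(100)]
augmented = mask_spectrogram(dummy, seed=0)
```

In practice the same idea would usually be applied to NumPy or TensorFlow arrays inside the training input pipeline, with a fresh random mask drawn for each example.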
No commits in the last 6 months.
Use this if you are developing automatic speech recognition systems and want to improve your model's accuracy and generalization by making your training data more diverse.
Not ideal if you work with non-speech data or if improving speech recognition performance is not your primary goal.
Stars: 46
Forks: 7
Language: Python
License: Apache-2.0
Category:
Last pushed: May 13, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Kyubyong/specAugment"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
DemisEom/SpecAugment
An implementation of SpecAugment, introduced by Google Brain, with TensorFlow & PyTorch
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model