DemisEom/SpecAugment

An implementation of SpecAugment, the augmentation method introduced by Google Brain, in TensorFlow and PyTorch

51 / 100 (Established)

When training speech recognition models, the SpecAugment tool modifies audio spectrograms to create a wider variety of training examples. It takes an existing spectrogram of an audio file and alters it by warping the time axis, masking frequency blocks, and masking time segments. This helps speech AI developers make their models more robust to variations in speech.
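The two masking steps described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not this repository's API: the function name `mask_spectrogram` and its parameters `max_freq_width` and `max_time_width` are hypothetical, the masked regions are simply set to zero, and the time-warping step is omitted for brevity.

```python
import numpy as np

def mask_spectrogram(spec, max_freq_width=8, max_time_width=20, rng=None):
    """Return a copy of `spec` (freq_bins x time_steps) with one random
    frequency band and one random time span set to zero.

    Hypothetical helper for illustration; not part of the repository.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    n_freq, n_time = out.shape

    # Frequency masking: zero `f` consecutive frequency bins starting at `f0`.
    f = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f0 = int(rng.integers(0, n_freq - f + 1))
    out[f0:f0 + f, :] = 0.0

    # Time masking: zero `t` consecutive time steps starting at `t0`.
    t = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t0 = int(rng.integers(0, n_time - t + 1))
    out[:, t0:t0 + t] = 0.0

    return out

# Example: an 80-bin, 100-frame spectrogram filled with strictly positive values.
spec = np.random.default_rng(0).random((80, 100)) + 0.1
augmented = mask_spectrogram(spec, rng=np.random.default_rng(1))
```

Each call draws new mask positions and widths, so applying the function once per training step would give the model a differently masked view of the same utterance every time. The input array is left unmodified.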

656 stars. No commits in the last 6 months.

Use this if you are developing machine learning models for speech recognition and need to augment your audio training data to improve model performance and generalization.

Not ideal if you are looking for a general-purpose audio editing tool or a way to analyze raw audio files directly without processing them into spectrograms.

speech-recognition audio-processing machine-learning-training data-augmentation AI-development
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 25 / 25


Stars: 656
Forks: 135
Language: Python
License: Apache-2.0
Last pushed: Apr 05, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/DemisEom/SpecAugment"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.