revsic/speechset

NumPy-librosa implementation of a speech dataset pipeline

37 / 100 (Emerging)

This project helps speech researchers and machine learning practitioners prepare audio data for training speech recognition or synthesis models. It takes raw audio files and their corresponding text transcripts as input, then processes them into a structured dataset ready for model training. This is ideal for anyone working with spoken language data to build AI speech applications.
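
As a rough illustration of the kind of preparation described above, the sketch below pairs audio files with their transcripts using only the Python standard library. It does not use speechset's actual API: the `id|transcript` metadata layout, the `pair_audio_with_text` helper, and the `<id>.wav` naming are all assumptions made for this example.

```python
from pathlib import Path


def pair_audio_with_text(audio_dir, metadata_path):
    """Pair each transcript line with its audio file.

    Hypothetical layout: one `id|transcript` entry per line, with audio
    stored as `<audio_dir>/<id>.wav`. Entries without a matching audio
    file are skipped.
    """
    audio_dir = Path(audio_dir)
    pairs = []
    with open(metadata_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or "|" not in line:
                continue  # ignore blank or malformed lines
            utt_id, text = line.split("|", 1)
            wav_path = audio_dir / f"{utt_id}.wav"
            if wav_path.is_file():
                pairs.append((wav_path, text.strip()))
    return pairs
```

A real pipeline would go on to load each waveform and compute features from it; this sketch stops at the pairing step.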

No commits in the last 6 months.

Use this if you need to standardize and process raw audio and text data into a consistent format for your speech-related machine learning projects.

Not ideal if you're looking for a tool to perform speech recognition or synthesis directly, as this focuses solely on dataset preparation.

speech-recognition-datasets speech-synthesis-datasets audio-data-preparation machine-learning-datasets spoken-language-processing
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 16 / 25

Stars: 9
Forks: 6
Language: Python
License: MIT
Last pushed: Jan 18, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/revsic/speechset"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
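
The same request can be made from Python. This is a minimal sketch using only the standard library; the helper names `quality_url` and `fetch_quality` are hypothetical. Because the response format is not described here, the sketch returns the raw body as text and assumes it is UTF-8 encoded.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL in the same shape as the curl example."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Return the raw response body as text (assumed UTF-8)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "revsic", "speechset")` issues the same GET request as the curl command and counts toward the daily request limit.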