revsic/speechset

NumPy-librosa implementation of a speech dataset pipeline

Emerging

This project helps speech researchers and machine learning practitioners prepare audio data for training speech recognition or synthesis models. It takes raw audio files and their corresponding text transcripts as input, then processes them into a structured dataset ready for model training. This is ideal for anyone working with spoken language data to build AI speech applications.
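As a rough illustration of the audio-plus-transcript pairing described above, the sketch below builds a tiny dataset from WAV files and a metadata file. It does not use speechset's actual API, which this page does not show. The names `write_sine_wav`, `load_wav`, `build_dataset`, and the `metadata.csv` layout (file name, text) are all hypothetical. It relies only on the Python standard library so that it runs anywhere; a real pipeline would more likely hold samples in NumPy arrays and load audio with librosa.

```python
import csv
import math
import struct
import tempfile
import wave
from pathlib import Path


def write_sine_wav(path, num_samples, rate=16000, freq=440.0):
    """Write a mono 16-bit sine tone so the sketch has audio to read (hypothetical helper)."""
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(num_samples)
    )
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(rate)
        f.writeframes(frames)


def load_wav(path):
    """Read a mono 16-bit WAV and return (samples scaled to [-1, 1], sample rate)."""
    with wave.open(str(path), "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit WAV files")
        rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [s / 32768.0 for s in ints], rate


def build_dataset(audio_dir, transcript_csv):
    """Pair each (file name, text) row with its decoded audio (assumed metadata layout)."""
    items = []
    with open(transcript_csv, newline="", encoding="utf-8") as f:
        for name, text in csv.reader(f):
            samples, rate = load_wav(Path(audio_dir) / name)
            items.append(
                {"id": Path(name).stem, "text": text.strip(), "rate": rate, "audio": samples}
            )
    return items


def demo():
    """Create two synthetic clips plus a transcript file, then build the dataset."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        write_sine_wav(tmp / "a.wav", num_samples=1600)
        write_sine_wav(tmp / "b.wav", num_samples=3200)
        meta = tmp / "metadata.csv"
        with open(meta, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows([["a.wav", "hello world "], ["b.wav", "good morning"]])
        return build_dataset(tmp, meta)


dataset = demo()
```

Each item keeps the transcript next to its decoded samples and sample rate, which is the kind of consistent record a training loop can consume. Feature extraction (for example mel spectrograms) would be a further step and is left out of this sketch.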

No commits in the last 6 months.

Use this if you need to standardize and process raw audio and text data into a consistent format for your speech-related machine learning projects.

Not ideal if you're looking for a tool to perform speech recognition or synthesis directly, as this focuses solely on dataset preparation.

speech-recognition-datasets speech-synthesis-datasets audio-data-preparation machine-learning-datasets spoken-language-processing

Stale (6 months) · No package · No dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 16 / 25

Language: Python

License: MIT

Higher-rated alternatives

hetpandya/youtube_tts_data_generator

A python library to generate speech dataset from Youtube videos

IS2AI/Kazakh_TTS

An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...

taresh18/TTSizer

🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨

Hecate2/sukasuka-vocal-dataset-builder

Sukasuka anime vocal dataset (すかすかアニメボカロデータセット). 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...

youmebangbang/TTS-dataset-tools

Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
