youmebangbang/TTS-dataset-tools
Automatically generates a TTS dataset from audio and its associated text. Cuts audio into clips under a custom length. Uses the Google Speech-to-Text API to perform diarization and transcription, or aeneas to force-align text to audio.
This tool helps podcasters, educators, or content creators automatically prepare audio for text-to-speech (TTS) models. You provide audio files and their corresponding text, and it generates segmented audio clips with precise text alignments. The primary users are individuals who need to create custom, high-quality TTS datasets efficiently.
No commits in the last 6 months.
Use this if you have long audio recordings and want to automatically split them into short, transcribed segments perfect for training custom text-to-speech voices.
Not ideal if you need to process extremely large batches of files without any manual review or if you prefer not to use cloud transcription services.
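As a rough illustration of "cuts under a custom length", the sketch below greedily merges consecutive aligned fragments into clips that stay within a maximum duration. It is not taken from the repository: `Fragment`, `merge_fragments`, and `max_len` are hypothetical names, and the input is assumed to be a list of (start, end, text) fragments such as a forced aligner might produce.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Fragment:
    """One aligned span of audio (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def merge_fragments(fragments: Iterable[Fragment], max_len: float) -> List[Fragment]:
    """Greedily join consecutive fragments while the clip stays <= max_len seconds.

    A single fragment that is already longer than max_len is kept as its own
    clip; this sketch does not try to split inside a fragment.
    """
    clips: List[Fragment] = []
    current = None
    for frag in fragments:
        if current is not None and frag.end - current.start <= max_len:
            # Extending the current clip still fits under the limit.
            current = Fragment(current.start, frag.end, current.text + " " + frag.text)
        else:
            # Close the current clip (if any) and start a new one.
            if current is not None:
                clips.append(current)
            current = Fragment(frag.start, frag.end, frag.text)
    if current is not None:
        clips.append(current)
    return clips


if __name__ == "__main__":
    aligned = [
        Fragment(0.0, 4.0, "Hello there."),
        Fragment(4.0, 7.0, "Welcome back."),
        Fragment(7.0, 12.0, "Today we talk about datasets."),
    ]
    for clip in merge_fragments(aligned, max_len=10.0):
        print(f"{clip.start:.1f}-{clip.end:.1f}: {clip.text}")
```

Each resulting clip carries the concatenated text of its fragments, so the audio cut and its transcript stay paired.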
Stars: 52
Forks: 17
Language: Python
License: MIT
Category:
Last pushed: Apr 17, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/youmebangbang/TTS-dataset-tools"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
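The same request can be made from Python using only the standard library. This is a minimal sketch: the `category/owner/repo` path layout is inferred from the single curl example above, the function names are hypothetical, and the body is returned as raw text because the response format is not documented here.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; anything beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text (no API key)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_quality("voice-ai", "youmebangbang", "TTS-dataset-tools"))
```

Unauthenticated use is limited to 100 requests/day, so a script that checks many repositories should cache responses rather than refetch them.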
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A python library to generate speech dataset from Youtube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
taresh18/TTSizer
Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets
Hecate2/sukasuka-vocal-dataset-builder
1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
souvikg544/TTS_Data_Maker
Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio...