gokhaneraslan/tts-dataset-generator
With this tool you can create a custom TTS dataset from video or audio.
This tool helps you turn long audio or video recordings into neatly organized datasets for training custom text-to-speech (TTS) voices. You input raw audio or video files, and it automatically breaks them into speech segments, transcribes them using AI, and outputs properly formatted audio clips and a text file of aligned transcripts. It's perfect for voice actors, linguists, or educators who need to create custom voice models from their own recordings.
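The flow described above (split a recording into speech segments, transcribe each one, write an aligned transcript file) can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the function names, the amplitude threshold, and the pipe-delimited `clip|text` metadata layout are all assumptions, and the transcription step is a stand-in callable rather than a real speech-to-text model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int]  # (start, end) sample indices, end exclusive


def split_on_silence(samples: Sequence[float], threshold: float = 0.1,
                     min_silence: int = 3) -> List[Segment]:
    """Return speech runs, split wherever at least `min_silence`
    consecutive samples stay below `threshold` in absolute value."""
    segments: List[Segment] = []
    start: Optional[int] = None  # where the current speech run began
    quiet = 0                    # length of the current quiet run
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the run at the first quiet sample.
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # Recording ended mid-run; trim any trailing quiet samples.
        segments.append((start, len(samples) - quiet))
    return segments


def build_metadata(segments: Sequence[Segment],
                   transcribe: Callable[[int, int], str]) -> List[str]:
    """Pair each segment with a clip name and its transcript.
    The `name|text` layout is an assumed convention."""
    lines = []
    for n, (start, end) in enumerate(segments, start=1):
        lines.append(f"clip_{n:04d}.wav|{transcribe(start, end)}")
    return lines


# Toy "recording": two bursts of speech separated by silence.
samples = [0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.7, 0.8, 0.9, 0.0]
segments = split_on_silence(samples)

# Hypothetical transcriber; a real pipeline would call an ASR model here.
def fake_transcribe(start: int, end: int) -> str:
    return f"speech from sample {start} to {end}"

metadata = build_metadata(segments, fake_transcribe)
print("\n".join(metadata))
```

A real pipeline would work on decoded audio frames and cut actual clip files, but the shape is the same: segment boundaries first, then one transcript line per clip.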
No commits in the last 6 months.
Use this if you need to create a high-quality, segmented, and transcribed dataset from audio or video files to train a custom text-to-speech voice or for large-scale transcription.
Not ideal if you only need a quick transcription of a short audio file and have no use for segmentation or dataset formatting for voice model training.
Stars: 13
Forks: 5
Language: Python
License: Apache-2.0
Category:
Last pushed: Jun 07, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gokhaneraslan/tts-dataset-generator"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
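The same request can be made from Python with only the standard library. This is a hedged sketch: the `category/owner/repo` path pattern is inferred from the single URL shown above, the helper names are hypothetical, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text.
    No API key is sent, matching the keyless tier described above."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "gokhaneraslan", "tts-dataset-generator")` issues the same GET request as the curl command.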
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A Python library to generate speech datasets from YouTube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
taresh18/TTSizer
Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
Hecate2/sukasuka-vocal-dataset-builder
1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...