Hecate2/sukasuka-vocal-dataset-builder
SukaSuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass subtitle files, then manually label the vocal files by character. Intended for PITS/VITS/Diffusion text-to-speech and SVC.
This tool helps animators, voice actors, and AI voice developers create custom voice datasets from anime videos. It takes video files and their accompanying .ass subtitle files as input, extracts the individual vocal segments that the subtitles mark, and lets you manually label each segment with the character who is speaking. The output is a structured dataset of character-specific vocal audio files, ready for training text-to-speech or voice conversion models.
Use this if you need to build a specialized dataset of character voices from anime or drama CDs for training AI voice models.
Not ideal if you're not familiar with the anime characters or lack the patience for manual audio labeling and data organization.
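The extraction step described above can be illustrated with a minimal Python sketch. It is not the repository's own code: the names `Segment`, `parse_dialogue`, and `ffmpeg_command` are hypothetical, the subtitle file is assumed to use the standard ASS field order (Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text), and the mono 44.1 kHz output settings are an arbitrary choice. The sketch reads the `Dialogue:` lines, converts their timestamps to seconds, and builds one ffmpeg command per clip.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # subtitle text with override tags removed


def ass_time_to_seconds(stamp: str) -> float:
    """Convert an ASS timestamp such as '0:01:03.50' to seconds."""
    hours, minutes, seconds = stamp.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_dialogue(ass_text: str) -> List[Segment]:
    """Collect one Segment per 'Dialogue:' line (standard field order assumed)."""
    segments = []
    for line in ass_text.splitlines():
        if not line.startswith("Dialogue:"):
            continue
        # The Text field is last and may contain commas, so split at most 9 times.
        fields = line[len("Dialogue:"):].split(",", 9)
        if len(fields) < 10:
            continue
        start = ass_time_to_seconds(fields[1])
        end = ass_time_to_seconds(fields[2])
        # Drop {...} override tags and turn hard line breaks into spaces.
        text = re.sub(r"\{[^}]*\}", "", fields[9]).replace("\\N", " ").strip()
        if end > start and text:
            segments.append(Segment(start, end, text))
    return segments


def ffmpeg_command(video: str, seg: Segment, out_path: str) -> List[str]:
    """Build an ffmpeg call that cuts one clip as mono audio (assumed settings)."""
    return [
        "ffmpeg", "-y", "-i", video,
        "-ss", f"{seg.start:.2f}", "-to", f"{seg.end:.2f}",
        "-vn", "-ac", "1", "-ar", "44100",
        out_path,
    ]
```

Each command list can be passed to `subprocess.run`. Character labeling remains a manual step, for example by listening to each clip and moving it into a folder named after the speaker.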
Stars
49
Forks
4
Language
Python
License
MIT
Last pushed
Feb 25, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Hecate2/sukasuka-vocal-dataset-builder"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A Python library to generate speech datasets from YouTube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
souvikg544/TTS_Data_Maker
Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio...