freds0/kabooks
KABooks is a tool that automates the creation of datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Given audiobooks, KABooks generates a dataset of segmented audio clips with aligned text.
This tool helps researchers and AI practitioners create high-quality datasets for training speech recognition and text-to-speech models. It takes an audiobook's full audio file and its corresponding text as input, then outputs segmented audio clips aligned with their exact textual transcriptions. This is ideal for those working on voice AI.
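To make the input/output relationship concrete, here is a minimal sketch of what "segmented clips aligned with their transcriptions" can look like as data. It is not KABooks' own code or output format, which this page does not describe: the `Segment` class, the `to_metadata_lines` helper, the `clip_NNNN.wav` naming, and the pipe-delimited layout are all assumptions made for illustration, and the timestamps are assumed to come from some earlier alignment step.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One aligned span of the audiobook (hypothetical structure)."""
    start: float  # seconds from the beginning of the full audio file
    end: float    # seconds from the beginning of the full audio file
    text: str     # transcription for this span


def to_metadata_lines(segments, prefix="clip"):
    """Return one 'filename|text|duration' line per usable segment.

    Segments with empty text or non-positive duration are skipped.
    The file naming and pipe-delimited layout are illustrative only.
    """
    lines = []
    for index, seg in enumerate(segments):
        text = seg.text.strip()
        if not text or seg.end <= seg.start:
            continue  # nothing to train on for this span
        duration = seg.end - seg.start
        lines.append(f"{prefix}_{index:04d}.wav|{text}|{duration:.2f}")
    return lines


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "Hello world."),
        Segment(2.5, 2.5, "zero-length span"),
        Segment(3.0, 4.25, " Second sentence. "),
    ]
    for line in to_metadata_lines(demo):
        print(line)
```

Cutting the actual audio at each `start`/`end` pair would be done by a separate audio library; that step is omitted here to keep the sketch dependency-free.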
No commits in the last 6 months.
Use this if you need to quickly generate large, accurately aligned audio-text datasets from audiobooks for your AI model training.
Not ideal if your source material isn't a long-form audiobook or if you don't need highly precise audio-to-text alignments.
Stars: 12
Forks: 4
Language: Python
License: —
Category:
Last pushed: Mar 24, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/freds0/kabooks"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
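The same request can be made from Python. This is a minimal sketch using only the standard library; it assumes the endpoint returns JSON, which this page does not state, and the helper names `build_url` and `fetch_quality` are hypothetical. Only the URL pattern shown in the curl command is taken from the page.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Build the endpoint URL for one repository, escaping each path part."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response body (assumed to be JSON)."""
    request = Request(
        build_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    data = fetch_quality("voice-ai", "freds0", "kabooks")
    print(json.dumps(data, indent=2))
```

Because the keyless tier is limited to 100 requests per day, it is worth caching the result locally rather than calling `fetch_quality` in a loop.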
Higher-rated alternatives
DrewThomasson/ebook2audiobook
Generate audiobooks from e-books, voice cloning & 1158+ languages!
santinic/audiblez
Generate audiobooks from e-books
mateogon/pdf-narrator
Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction,...
Finrandojin/alexandria-audiobook
AI-powered multi-voice audiobook generator — LLM script annotation, voice cloning, voice design,...
sergenes/runandread-audiobook
🚀 Open-source project for creating high-quality AI TTS-narrated audiobooks at home using models...