vislupus/Bulgarian-TTS-dataset
LibriVox dataset for Bulgarian language TTS
This project provides a collection of Bulgarian and English audio clips with corresponding text transcriptions. It is intended for anyone building or improving a text-to-speech system for Bulgarian: a model trained on these pairs learns to turn written Bulgarian text into natural-sounding speech. That makes it useful for developers creating voice assistants, audiobooks, or language learning tools for a Bulgarian-speaking audience.
No commits in the last 6 months.
Use this if you need a pre-processed dataset of spoken Bulgarian audio and text to train a text-to-speech (TTS) model.
Not ideal if you are looking for a ready-to-use text-to-speech application or if you need diverse speakers for your voice synthesis project.
Stars: 8
Forks: -
Language: -
License: MIT
Category:
Last pushed: Aug 18, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/vislupus/Bulgarian-TTS-dataset"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
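The same request can be made from Python using only the standard library. The sketch below is illustrative: it assumes the endpoint path follows the category/owner/repo pattern seen in the curl example and that the response body is JSON, neither of which this page confirms. The names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL from its path segments.

    Assumption: the path layout is <category>/<owner>/<repo>,
    inferred from the single curl example shown above.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the service answers with a JSON body. If it does not,
    json.loads raises a ValueError that the caller should handle.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "vislupus", "Bulgarian-TTS-dataset")` issues the same GET request as the curl command. Unauthenticated use counts against the 100 requests/day limit, so caching the result locally is a reasonable precaution when polling more than one repository.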
Higher-rated alternatives
ynop/audiomate
Python library for handling audio datasets.
reazon-research/ReazonSpeech
Massive open Japanese speech corpus
common-voice/cv-dataset
Metadata and versioning details for the Common Voice dataset
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset...
EgorLakomkin/KTSpeechCrawler
Automatically constructing corpus for automatic speech recognition from YouTube videos