hetpandya/youtube_tts_data_generator

A Python library for generating speech datasets from YouTube videos

Score: 49 / 100 (Emerging)

This tool helps researchers and AI practitioners create custom speech datasets. You provide a list of YouTube video links, and it downloads the audio and corresponding subtitles. The output is a collection of audio clips paired with their text transcripts, formatted for training text-to-speech or speech recognition models.
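The pairing step described above can be sketched in a few lines. This is only an illustrative sketch, not the library's own implementation or API: it assumes the subtitles arrive in SRT-style form, and the names `Cue`, `parse_srt`, and `build_manifest`, as well as the `clip_0001.wav|text` manifest layout, are hypothetical choices made for the example.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical helper types and names; not part of youtube_tts_data_generator.
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[.,](\d{3})")


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # transcript for this time span


def parse_timestamp(raw: str) -> float:
    """Convert 'HH:MM:SS,mmm' (or with '.') into seconds."""
    match = _TIMESTAMP.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised timestamp: {raw!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT-style text into cues, joining multi-line captions."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        timing = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing is None:
            continue  # skip blocks without a timing line
        start_raw, end_raw = lines[timing].split("-->")
        text = " ".join(lines[timing + 1:])
        if text:
            # Take the first token only, in case settings follow the end time.
            end = parse_timestamp(end_raw.split()[0])
            cues.append(Cue(parse_timestamp(start_raw), end, text))
    return cues


def build_manifest(cues: List[Cue], prefix: str = "clip") -> List[str]:
    """Pair each cue with a clip filename, one 'file|text' row per cue."""
    return [f"{prefix}_{i:04d}.wav|{cue.text}" for i, cue in enumerate(cues, 1)]
```

Each `Cue` carries the start and end times that an audio-cutting step would need, while `build_manifest` produces the text side of the audio/transcript pairs. Cutting the audio itself is left out, since it depends on the audio tooling in use.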

No commits in the last 6 months. Available on PyPI.

Use this if you need a specific speech dataset from publicly available YouTube content to train or fine-tune an AI model.

Not ideal if you need a speech dataset from sources other than YouTube or require highly curated, studio-quality recordings.

Tags: AI-research, speech-synthesis, speech-recognition, data-preparation, natural-language-processing
Stale 6m
Maintenance 0 / 25
Adoption 7 / 25
Maturity 25 / 25
Community 17 / 25


Stars 37
Forks 8
Language Python
License Apache-2.0
Last pushed Jun 07, 2024
Commits (30d) 0
Dependencies 14

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hetpandya/youtube_tts_data_generator"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
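The same request can be made from Python with only the standard library. This is a rough sketch under stated assumptions: the URL layout (category, owner, repository) is inferred from the curl example above, the response is assumed to be JSON (the page does not document its format), and the function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the three-part path layout is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "hetpandya", "youtube_tts_data_generator")` issues the same GET request as the curl command. Without a key, requests count against the 100 requests/day limit, so caching the result locally is a sensible precaution.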