yc9701/pansori

Tools for ASR Corpus Generation from Online Video

Score: 46 / 100 (Emerging)

This tool helps researchers, linguists, or educators create high-quality datasets for training Automatic Speech Recognition (ASR) models. It takes online videos with existing audio and subtitle tracks, processes them to accurately align spoken words with their text, and then cleans the data. The output is a refined collection of audio clips paired with their corresponding text, ready for ASR model training.
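The align-then-clean flow described above can be illustrated with a minimal sketch. This is not pansori's actual code: it assumes the subtitles are available as SubRip (SRT) text, and the names `Segment`, `parse_srt`, and `filter_by_similarity` are hypothetical. The ASR transcripts are passed in as plain strings, standing in for whatever an external ASR service would return.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List

@dataclass
class Segment:
    start: float  # cue start, in seconds
    end: float    # cue end, in seconds
    text: str     # subtitle text for this cue

_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def _seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, secs, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + secs + millis / 1000

def parse_srt(srt_text: str) -> List[Segment]:
    """Turn SRT text into (start, end, text) segments, skipping empty cues."""
    segments = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        idx = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if idx is None:
            continue  # no timing line, so this is not a cue
        start_raw, end_raw = lines[idx].split("-->", 1)
        text = " ".join(lines[idx + 1:])
        if text:
            segments.append(
                Segment(_seconds(start_raw), _seconds(end_raw.split()[0]), text)
            )
    return segments

def filter_by_similarity(segments: List[Segment], asr_texts: List[str],
                         threshold: float = 0.8) -> List[Segment]:
    """Keep segments whose subtitle text roughly matches an ASR transcript.

    asr_texts[i] is assumed to be the transcript for segments[i].
    """
    kept = []
    for seg, heard in zip(segments, asr_texts):
        ratio = SequenceMatcher(None, seg.text.lower(), heard.lower()).ratio()
        if ratio >= threshold:
            kept.append(seg)
    return kept
```

Each kept segment gives a time range that can be cut from the audio track and paired with its text. The 0.8 threshold is an arbitrary illustrative value, not one taken from the project.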

140 stars. No commits in the last 6 months.

Use this if you need to build a specialized speech corpus from online video content, especially for languages where existing ASR training data is scarce.

Not ideal if you need to transcribe live audio, process audio without any accompanying subtitle data, or if you prefer not to use cloud-based ASR services for validation.

speech-research language-technology linguistics machine-learning-datasets educational-content
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25


Stars: 140
Forks: 27
Language: Python
License: MIT
Last pushed: Feb 10, 2019
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/yc9701/pansori"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
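The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern is inferred from the single curl example above (the `voice-ai` path segment is treated as a category), the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (not confirmed)."""
    with urlopen(build_quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("voice-ai", "yc9701", "pansori")` issues the same GET request as the curl command. The keyless tier is limited to 100 requests/day, so results are worth caching locally.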