Onuronon-lab/Shrutik

Open-source voice data collection platform for building inclusive voice datasets. Collaborative transcription with quality consensus. FastAPI + React + PostgreSQL.

45 / 100 (Emerging)

This platform helps communities create high-quality voice datasets for underrepresented languages. Native speakers record and transcribe their own speech, and the platform processes these contributions into datasets for inclusive voice technology. It suits linguistic communities, researchers, and organizations developing speech recognition systems for regional or minority languages.

Use this if you want to gather, transcribe, and validate voice recordings from a community to build speech technology for a language that current systems don't support well.

Not ideal if you need a pre-built speech recognition engine or a platform for general audio transcription unrelated to voice dataset creation.

linguistic-diversity voice-technology language-preservation crowdsourcing speech-dataset-creation
No Package, No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 13 / 25
Community 17 / 25
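
The four category scores appear to add up to the overall score shown at the top. The short check below assumes the overall score is a plain sum of the categories, each capped at 25; the page does not state this, so treat it as an observation about these particular numbers and not as the scoring method.

```python
# Category scores copied from the listing above.
subscores = {"Maintenance": 10, "Adoption": 5, "Maturity": 13, "Community": 17}

# Assumption: every category has the same 25-point ceiling, as the listing suggests.
MAX_PER_CATEGORY = 25

total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "45 / 100"
```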


Stars: 11
Forks: 8
Language: Python
License: (not listed)
Last pushed: Mar 10, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Onuronon-lab/Shrutik"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
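
The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes the URL path has the shape `category/owner/repo` (inferred from the single curl example) and that the response body is JSON; neither is confirmed by the page. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is category/owner/repo, as in the curl example.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body (hypothetical helper).

    Assumption: the endpoint returns JSON; the page does not document the format.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(quality_url("voice-ai", "Onuronon-lab", "Shrutik"))
```

With the keyless limit of 100 requests/day, it is worth caching the result of `fetch_quality` locally instead of calling it on every run.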