IIP-Sogang/olkavs-avspeech

The Introduction of the OLKAVS Dataset

25 / 100 Experimental

This project provides tools for processing and evaluating a large Korean audio-visual speech dataset. It takes raw video and audio recordings of Korean speakers, together with their transcriptions, and produces structured data suitable for training and evaluating speech recognition models. It is aimed at researchers and developers working on speech AI, particularly lip reading or speech recognition that must stay robust under varied conditions.

No commits in the last 6 months.

Use this if you need a pre-processed dataset and evaluation scripts for developing or testing audio-visual speech recognition models specifically for the Korean language.

Not ideal if you are looking for a ready-to-use speech recognition application, as this project focuses on data preparation and model evaluation.

audio-visual speech recognition, lip reading, Korean language processing, speech technology research, AI model training data
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 10 / 25

Stars 37
Forks 4
Language Python
License None
Last pushed May 28, 2024
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/IIP-Sogang/olkavs-avspeech"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
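For use from a script rather than the shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not documented client code: the function names `build_quality_url` and `fetch_quality` are hypothetical, reading the path segments as category, owner, and repository is an assumption drawn from the shape of the curl URL above, and the response is assumed to be JSON, which this page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is
# assumed to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl command."""
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send the keyless GET request and decode the body.

    Assumption: the endpoint returns JSON. If it does not,
    json.loads raises json.JSONDecodeError.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "IIP-Sogang", "olkavs-avspeech")` requests the same URL as the curl command. The response fields are not described here, so inspect the returned object before relying on any particular key.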