chimechallenge/chime-utils

Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.

35 / 100 (Emerging)

This tool helps researchers and engineers working on distant automatic speech recognition (DASR) challenges. It streamlines the preparation of large speech datasets, like CHiME-6, DiPCo, and Mixer 6 Speech, by downloading and organizing them into a unified structure. The output is a ready-to-use dataset, formatted for popular speech processing toolkits, along with official scoring scripts to evaluate DASR system performance.

No commits in the last 6 months.

Use this if you are participating in the CHiME-8 DASR challenge or developing speech recognition systems for noisy, multi-speaker environments and need standardized data and evaluation tools.

Not ideal if you are working on general-purpose speech recognition outside of the CHiME challenge context or only need to process small, custom speech datasets.

Distant Automatic Speech Recognition, speech dataset preparation, speech processing, ASR evaluation, audio data management
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 13 / 25
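
The four category scores listed above add up to the overall 35 / 100. The short sketch below checks that arithmetic; treating the overall score as a plain sum of the categories is an assumption drawn from these numbers, not a documented formula.

```python
# Category scores as listed on this page; each is out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 6,
    "Maturity": 16,
    "Community": 13,
}

# Assumption: the overall score is the unweighted sum of the four categories.
overall = sum(category_scores.values())
maximum = 25 * len(category_scores)

print(f"{overall} / {maximum}")  # prints "35 / 100"
```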


Stars: 24
Forks: 4
Language: Python
License: MIT
Last pushed: Feb 25, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/chimechallenge/chime-utils"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
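
For use from a script, the same request can be sketched in Python with only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, reading the path as category, owner, and repository is an inference from the curl example, and the response is assumed to be JSON since its schema is not described here.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path segments are category/owner/repo, as the
    curl example suggests.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and decode the body.

    Assumption: the endpoint returns a JSON document.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "chimechallenge", "chime-utils")` would request the same URL as the curl command. Keyless access is limited to 100 requests/day, so a script that polls this endpoint may want to cache the result.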