mubingshen/MLC-SLM-Baseline

The project accompanies the recently launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-SLM), providing participants with baseline systems for speech recognition and speaker diarization in multilingual conversational scenarios.

Score: 30 / 100 (Emerging)

This project offers baseline models for automatic speech recognition (ASR) and speaker diarization in complex, multilingual conversations. It takes raw audio recordings of natural, multi-speaker dialogue and produces transcriptions along with speaker segments identifying who spoke when. Researchers and developers building AI systems for spoken human-computer interaction would find this useful.

No commits in the last 6 months.

Use this if you are developing or benchmarking systems that need to accurately transcribe and identify speakers in multilingual, real-world conversational audio, including overlaps and interruptions.

Not ideal if your primary need is for simple, single-speaker speech-to-text without the complexities of diarization or multilingual conversational nuances.

spoken-dialogue-systems multilingual-speech-recognition speaker-diarization conversational-ai natural-language-processing
No License · Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 12 / 25

How are scores calculated?

Stars: 50
Forks: 6
Language: Python
License: None
Last pushed: May 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mubingshen/MLC-SLM-Baseline"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
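The endpoint follows a simple pattern: owner and repo appended to a fixed base path taken from the curl example above. A minimal Python sketch, using only the standard library; the JSON field names are not documented here, so `fetch_quality` is an assumption about the response shape and you should inspect the payload before relying on specific keys:

```python
import json
import urllib.request

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score API URL for a given owner/repo."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON response (response fields are
    undocumented here; inspect the payload before using keys)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("mubingshen", "MLC-SLM-Baseline"))
```

Note that the unauthenticated tier is limited to 100 requests/day, so cache responses if you poll many repositories.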