FreedomIntelligence/MTalk-Bench

MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols

Score: 33 / 100 (Emerging)

This tool helps researchers and developers assess how well their speech-to-speech AI models perform in realistic, multi-turn conversations. You provide your model's audio responses to a set of conversational prompts, and the benchmark evaluates them across semantic understanding, paralinguistic aspects like tone, and ambient sound interaction. It's designed for AI researchers and engineers building and refining conversational AI and large language models.

Use this if you are developing or evaluating speech-to-speech AI models and need a comprehensive way to benchmark their performance in dynamic, multi-turn dialogue scenarios.

Not ideal if you are looking for a tool to simply transcribe audio or translate speech, as its primary purpose is advanced model evaluation.

conversational-ai speech-technology ai-model-evaluation natural-language-processing large-language-models
No package · No dependents
Maintenance: 6 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 5 / 25


Stars: 18
Forks: 1
Language: JavaScript
License: Apache-2.0
Last pushed: Nov 19, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/FreedomIntelligence/MTalk-Bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
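For programmatic use, the curl call above can be wrapped in a small script. A minimal Python sketch, assuming the endpoint returns a JSON body (the response schema and any API-key header are not documented here, so only the unauthenticated request is shown):

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record for a repo; assumes a JSON response."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)
```

Calling `fetch_quality("FreedomIntelligence", "MTalk-Bench")` would issue the same request as the curl example; keep the 100 requests/day limit in mind when polling.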