stefantaubert/mel-cepstral-distance

A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".


Emerging

This tool helps researchers and practitioners in speech technology objectively compare two speech audio inputs, such as a reference recording and a synthesized voice. It calculates the Mel-Cepstral Distance (MCD), a metric of how much the two signals differ in their mel-cepstral coefficients, and can also report a penalty for misaligned audio segments. Typical users are speech synthesis developers, voice conversion researchers, and anyone evaluating speech generation models.
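To make the metric concrete, here is a minimal sketch of the core idea: the mean per-frame Euclidean distance between two sequences of mel-cepstral coefficients. It is not this library's API. The function name `mean_cepstral_distance` and the `skip_c0` parameter are hypothetical, the frames are assumed to be already extracted and time-aligned, and skipping the zeroth (energy) coefficient is a common convention assumed here. Feature extraction, alignment, and the misalignment penalty are left out.

```python
import math
from typing import Sequence


def mean_cepstral_distance(
    ref_frames: Sequence[Sequence[float]],
    syn_frames: Sequence[Sequence[float]],
    skip_c0: bool = True,
) -> float:
    """Average per-frame Euclidean distance between two aligned
    sequences of cepstral coefficient vectors (hypothetical helper)."""
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("inputs must be non-empty and have the same number of frames")

    # Assumption: the zeroth coefficient mostly carries energy, so it is skipped by default.
    start = 1 if skip_c0 else 0

    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        if len(ref) != len(syn):
            raise ValueError("each pair of frames must have the same number of coefficients")
        # Euclidean distance between the two coefficient vectors of this frame.
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(ref[start:], syn[start:])))

    # Average over all frames.
    return total / len(ref_frames)


# Toy data: two frames with three coefficients each (made-up values).
reference = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
synthesized = [[0.0, 3.0, 4.0], [9.0, 0.0, 0.0]]
print(mean_cepstral_distance(reference, synthesized))  # 2.5
```

In the toy data the first frame differs by a distance of 5 and the second differs only in the skipped zeroth coefficient, so the average is 2.5. A lower value means the two inputs are closer in the cepstral domain.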

No commits in the last 6 months.

Use this if you need to quantify the difference between generated speech and a reference recording, or to compare the output of different speech synthesis models.

Not ideal if you are looking for subjective, human-perception-based speech quality assessment or for a general audio comparison tool outside of speech.

speech-synthesis voice-conversion audio-quality-assessment speech-evaluation voice-AI

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 17 / 25


Language: Python

License: MIT

Higher-rated alternatives

julius-speech/julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

rolczynski/Automatic-Speech-Recognition

🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)

tabahi/formantfeatures

Extract frequency, power, width and dissonance of formants from wav files

libdriver/ld3320

LD3320 full-featured driver library for general-purpose MCU and Linux.

awsaf49/audio_classification_models

Tensorflow Audio Classification Models
