crlandsc/torch-log-wmse

logWMSE, an audio quality metric & loss function with support for digital silence targets. Useful for training and evaluating audio source separation systems.

/ 100

Emerging

This tool helps audio engineers and researchers evaluate and improve audio source separation and denoising models. It takes unprocessed, processed, and target audio as input and outputs a quality score designed to align with human hearing, and the score stays well-defined for segments containing digital silence. It is intended for anyone working on machine learning models that process audio.

Available on PyPI.

Use this if you are training or evaluating audio processing models, particularly for source separation or denoising, and need a robust quality metric that handles digital silence and aligns with human hearing.

Not ideal if you need a metric that is invariant to arbitrary scaling or polarity inversion, or if you require a fuller model of human auditory perception that accounts for effects such as auditory masking.
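To make the digital silence point concrete, here is a minimal sketch of a log-scaled mean squared error in plain Python. It is not the library's implementation. The helper names `rms` and `log_mse_score` and the constant `EPS` are assumptions made for illustration, and the frequency weighting that aligns the real metric with human hearing is left out entirely. The sketch only shows why an error-based score can stay finite when the target is all zeros, a case where ratio-based metrics that divide by target energy break down.

```python
import math

EPS = 1e-7  # assumed small constant that keeps the logarithm finite


def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def log_mse_score(unprocessed, processed, target, eps=EPS):
    """Simplified log-MSE score (hypothetical helper); higher is better."""
    # Normalize by the level of the unprocessed input so the score
    # does not depend on how loud the recording was.
    scale = 1.0 / max(rms(unprocessed), eps)
    mse = sum(((p - t) * scale) ** 2 for p, t in zip(processed, target)) / len(target)
    # eps keeps the log defined even when the error is exactly zero.
    return -math.log(mse + eps)


unprocessed = [0.5, -0.5, 0.25, -0.25]
silent_target = [0.0, 0.0, 0.0, 0.0]            # digital silence
good_output = [0.001, -0.001, 0.0005, -0.0005]  # nearly silent
bad_output = [0.4, -0.4, 0.2, -0.2]             # most of the input leaks through

good_score = log_mse_score(unprocessed, good_output, silent_target)
bad_score = log_mse_score(unprocessed, bad_output, silent_target)
perfect_score = log_mse_score(unprocessed, silent_target, silent_target)
```

With a silent target, the nearly silent output scores higher than the leaky one, and a perfectly silent output still yields a finite score rather than an undefined one.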

audio-processing sound-engineering machine-learning-audio audio-quality-assessment speech-enhancement

Maintenance 10 / 25

Adoption 8 / 25

Maturity 25 / 25

Community 3 / 25

Language

Python

License

Apache-2.0

Higher-rated alternatives

descriptinc/descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...

drethage/speech-denoising-wavenet

A neural network for end-to-end speech denoising

YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

iver56/torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

lmnt-com/wavegrad

A fast, high-quality neural vocoder.
