NTIA/alignnet

Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.


Experimental

This tool helps speech and audio researchers train more accurate models for automatic speech quality assessment. It takes multiple audio datasets with subjective quality ratings, which may use different scoring scales, and trains a no-reference model (one that scores a speech signal without access to a clean reference recording) to produce consistent quality scores, even when the datasets were not designed to be combined. The result is a single speech quality estimator trained across all of the datasets.
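The idea of a per-dataset alignment can be illustrated with a toy sketch. This is not AlignNet's code or API: the dataset names, the score values, and the helpers `fit_affine` and `to_native` are all hypothetical, and a simple affine least-squares fit stands in for whatever alignment the project actually learns. It assumes a shared model already outputs a quality estimate on one common scale, then fits one mapping per dataset from that common scale to the dataset's own rating scale.

```python
from statistics import mean


def fit_affine(xs, ys):
    """Closed-form least-squares fit of y ~ a * x + b."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = cov / var
    return a, my - a * mx


# Hypothetical shared-scale estimates from a no-reference model (made-up values).
shared = {
    "dataset_a": [0.1, 0.4, 0.7, 0.9],
    "dataset_b": [0.2, 0.5, 0.8],
}

# Subjective scores on each dataset's own scale (made-up values).
native = {
    "dataset_a": [1.4, 2.6, 3.8, 4.6],  # rated on a 1 to 5 scale
    "dataset_b": [20.0, 50.0, 80.0],    # rated on a 0 to 100 scale
}

# One alignment (slope, offset) per dataset, fitted independently.
alignments = {name: fit_affine(shared[name], native[name]) for name in shared}


def to_native(name, q):
    """Map a shared-scale estimate q onto the named dataset's rating scale."""
    a, b = alignments[name]
    return a * q + b


for name, (a, b) in alignments.items():
    print(f"{name}: slope={a:.2f}, offset={b:.2f}")
```

In this sketch the same shared-scale estimate maps to different native scores depending on the dataset, which is what lets datasets with incompatible scales contribute to one model. In an actual training setup the shared estimator and the alignments would be learned together rather than fitted after the fact.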

No commits in the last 6 months.

Use this if you need to combine several independent datasets of speech audio with subjective quality ratings to train a single, reliable speech quality estimation model.

Not ideal if you only have a single, perfectly consistent dataset for training, or if you are not working with no-reference speech quality estimation.

speech-quality-assessment audio-research machine-listening speech-enhancement perceptual-audio-evaluation

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 0 / 25


Stars: not listed

Forks: not listed

Language: Python

License: not listed

Higher-rated alternatives

voicepaw/so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

sarulab-speech/UTMOSv2

UTokyo-SaruLab MOS Prediction System

ssmall256/mlx-audio-io

Native audio I/O for MLX on macOS and Linux

ssmall256/mlx-spectro

High-performance STFT/iSTFT for Apple MLX with fused Metal kernels and autograd support

daniilrobnikov/vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
