ddlBoJack/MT4SSL

[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets

Emerging

This project helps machine learning engineers and researchers speed up the pre-training of speech models. It takes raw audio and multiple pre-training targets as input and produces a self-supervised speech representation model that can be fine-tuned for tasks such as speech recognition. It is designed for people who develop or improve speech AI systems.

No commits in the last 6 months.

Use this if you are developing new speech recognition models and want to achieve strong performance with fewer pre-training steps and faster convergence.

Not ideal if you are looking for an off-the-shelf speech recognition application rather than a framework for model development.

speech-recognition machine-learning-engineering audio-processing AI-model-development

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 9 / 25

Language: Python

License: MIT

Higher-rated alternatives

index-tts/index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

stepfun-ai/Step-Audio-EditX

A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

unilight/seq2seq-vc

A sequence-to-sequence voice conversion toolkit.

FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System
