microsoft/SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

/ 100

Emerging

This project offers advanced pre-trained models that can understand and generate spoken language. It takes spoken audio or text as input and produces transcribed text, synthesized speech, or translations into other languages. This is ideal for developers creating applications that interact with users through speech, such as virtual assistants or transcription services.

1,435 stars. No commits in the last 6 months.

Use this if you are a developer building applications that require high-quality speech recognition, text-to-speech, or speech translation capabilities.

Not ideal if you are looking for an out-of-the-box, no-code solution for end-user tasks without any development work.

speech-recognition text-to-speech speech-translation natural-language-processing audio-processing

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25

How are scores calculated?

Stars

1,435

Forks

135

Language

Python

License

MIT

Higher-rated alternatives

Spr-Aachen/Easy-Voice-Toolkit

A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.

PrzemyslawSwiderski/python-gradle-plugin

Gradle plugin to run Python projects.

alphacep/awesome-russian-speech

Russian speech technology links

ftyers/commonvoice-utils

Linguistic processing for Common Voice

microsoft/UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Explore Voice AI Tools

All categories Trending Voice AI directory Insights