microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
This project provides pre-trained models that both understand and generate spoken language. The models take speech audio or text as input and produce transcribed text, synthesized speech, or translations into other languages. It suits developers building speech-driven applications such as virtual assistants or transcription services.
1,435 stars. No commits in the last 6 months.
Use this if you are a developer building applications that require high-quality speech recognition, text-to-speech, or speech translation capabilities.
Not ideal if you are looking for an out-of-the-box, no-code solution for end-user tasks without any development work.
Stars: 1,435
Forks: 135
Language: Python
License: MIT
Category:
Last pushed: Apr 24, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/microsoft/SpeechT5"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
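The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repository) is inferred from the curl command above, and the response is assumed to be JSON, since its schema is not documented here. How an API key is supplied is not stated on this page, so the sketch covers only the keyless tier.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the data for one repository and decode the body.

    Assumption: the endpoint returns JSON; the fields are not documented here.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "microsoft", "SpeechT5")` issues the same GET request as the curl command. Each call counts against the daily limit, so caching the result locally is a sensible choice for repeated use.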
Higher-rated alternatives
Spr-Aachen/Easy-Voice-Toolkit: A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.
PrzemyslawSwiderski/python-gradle-plugin: Gradle plugin to run Python projects.
alphacep/awesome-russian-speech: Russian speech technology links
ftyers/commonvoice-utils: Linguistic processing for Common Voice
microsoft/UniSpeech: UniSpeech - Large Scale Self-Supervised Learning for Speech