inclusionAI/Ming-UniAudio
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
This project helps audio content creators and developers work with spoken audio. It takes speech as input and can understand the spoken content, generate new speech, or edit existing audio from text instructions. Anyone who needs to produce, analyze, or modify speech, such as podcasters, voiceover artists, or researchers, may find it useful.
Use this if you need a single tool to transcribe, generate, and edit speech from simple text commands, especially for complex edits where you do not want to specify exact timestamps.
Not ideal if you only need a basic speech-to-text or text-to-speech tool and don't require advanced editing or combined capabilities.
Stars: 435
Forks: 28
Language: Python
License: MIT
Category:
Last pushed: Nov 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/inclusionAI/Ming-UniAudio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
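The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions not confirmed by this page: that the path follows the pattern category/owner/repo seen in the single curl example above, and that the response body is JSON (it falls back to returning raw text if not). The function names are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the segment layout after it
# (category/owner/repo) is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data; returns parsed JSON, or raw text if not JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)  # assumption: the API responds with JSON
    except json.JSONDecodeError:
        return body


if __name__ == "__main__":
    print(fetch_quality("voice-ai", "inclusionAI", "Ming-UniAudio"))
```

With the keyless limit of 100 requests/day, it is worth caching results locally rather than calling the endpoint in a loop.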
Higher-rated alternatives
Spr-Aachen/Easy-Voice-Toolkit
A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.
PrzemyslawSwiderski/python-gradle-plugin
Gradle plugin to run Python projects.
alphacep/awesome-russian-speech
Russian speech technology links
ftyers/commonvoice-utils
Linguistic processing for Common Voice
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech