FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
SenseVoice is a multilingual voice understanding model: it transcribes audio in over 50 languages and, alongside the transcript, identifies the spoken language, recognizes the speaker's emotion, and detects audio events such as laughter or applause. It is aimed at anyone who needs to quickly understand both what was said and how it was said in multilingual audio, with an emphasis on fast, accurate inference.
Use this if you need to quickly and accurately transcribe audio, identify spoken languages, recognize emotions, or detect specific sounds within multilingual recordings.
Not ideal if your primary need is highly specialized environmental sound classification beyond common human-computer interaction events.
Stars: 7,691
Forks: 708
Language: Python
License: —
Category: —
Last pushed: Dec 30, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FunAudioLLM/SenseVoice"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
travisvn/chatterbox-tts-api
Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate...
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment...
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
sfortis/openai_tts
Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible...
OpenMOSS/MOSS-TTSD
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis....