FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
SenseVoice is a multilingual voice understanding model: it transcribes audio in over 50 languages and, alongside the transcript, identifies the spoken language, recognizes the speaker's emotion, and detects audio events such as laughter or applause. It is aimed at anyone who needs to quickly understand both what was said and how it was said in multilingual audio, with an emphasis on fast, accurate inference.
Use this if you need to quickly and accurately transcribe audio, identify spoken languages, recognize emotions, or detect specific sounds within multilingual recordings.
Not ideal if your primary need is highly specialized environmental sound classification beyond common human-computer interaction events.
Stars: 7,691
Forks: 708
Language: Python
License: —
Category: —
Last pushed: Dec 30, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FunAudioLLM/SenseVoice"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
travisvn/chatterbox-tts-api
Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate...
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment...
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
sfortis/openai_tts
Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible...
OpenMOSS/MOSS-TTSD
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis....