wq2012/VB_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored


Experimental

This tool helps researchers and analysts automatically identify 'who spoke when' in audio recordings. It takes raw audio as input and outputs a timeline indicating speech segments attributed to different speakers. Anyone working with audio data, such as speech scientists, linguists, or media analysts, who needs to segment conversations by speaker would find this useful.

No commits in the last 6 months.

Use this if you need to automatically distinguish between multiple speakers in an audio recording, assigning each speech segment to the correct person.

Not ideal if you need to identify *who* the specific speakers are (e.g., 'John' vs 'Jane'), as it only differentiates between 'Speaker 1,' 'Speaker 2,' etc.
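To make the "who spoke when" output concrete, here is a minimal sketch of how per-frame speaker decisions could be collapsed into a timeline of anonymous speaker segments. It is not taken from this repository: the function name `labels_to_timeline`, the `frame_seconds` parameter, and the use of `None` for non-speech frames are all assumptions made for illustration.

```python
from itertools import groupby


def labels_to_timeline(frame_labels, frame_seconds=0.01):
    """Collapse per-frame speaker indices into (start, end, speaker) segments.

    Hypothetical helper, not part of VB_diarization. `frame_labels` holds one
    integer speaker index per frame, or None for a non-speech frame.
    """
    timeline = []
    position = 0  # index of the first frame in the current run
    for speaker, run in groupby(frame_labels):
        length = len(list(run))
        if speaker is not None:  # skip non-speech runs
            start = round(position * frame_seconds, 3)
            end = round((position + length) * frame_seconds, 3)
            # Labels are anonymous: "Speaker 1", "Speaker 2", and so on.
            timeline.append((start, end, f"Speaker {speaker + 1}"))
        position += length
    return timeline


if __name__ == "__main__":
    # Seven frames of 0.5 s each: speaker 0, speaker 1, a pause, speaker 0.
    example = [0, 0, 0, 1, 1, None, 0]
    for start, end, speaker in labels_to_timeline(example, frame_seconds=0.5):
        print(f"{start:5.1f}s - {end:5.1f}s  {speaker}")
```

The segments carry only relative labels, which matches the limitation noted above: the output separates speakers from one another but does not say who each one is.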

speech-analysis audio-transcription conversation-analysis linguistics sound-processing

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 14 / 25

Language

Python

License

—

Higher-rated alternatives

PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with...

k2-fsa/sherpa

Speech-to-text server framework with next-gen Kaldi

Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

yeyupiaoling/YeAudio

Audio tools for Python

zaigie/FunSpeech

Out-of-the-box speech service for local, private deployment; quickly set up FunASR and CosyVoice2/3 backends
