juanmc2005/diart

A Python package to build AI-powered real-time audio applications

/ 100

Established

This tool automatically identifies who is speaking in an audio stream as it arrives, whether the stream comes from a recorded conversation or a live microphone feed. It takes audio as input and outputs a record of who spoke when, which is useful for generating speaker-labeled transcripts or analyzing conversational dynamics. It is designed for anyone who needs to tell individual speakers apart in multi-person audio.
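To make "a record of who spoke when" concrete, here is a minimal sketch of how such a record could be summarized for conversation analysis. The segment tuples, the speaker labels, and the `talk_time` helper are hypothetical illustrations, not diart's actual output type or API.

```python
from collections import defaultdict

# Hypothetical diarization record: (start_seconds, end_seconds, speaker_label).
# This layout is an assumption for illustration, not diart's output format.
segments = [
    (0.0, 2.5, "speaker0"),
    (2.5, 4.0, "speaker1"),
    (4.0, 7.0, "speaker0"),
    (7.5, 9.0, "speaker1"),
]


def talk_time(segments):
    """Sum the speaking duration of each speaker, in seconds."""
    totals = defaultdict(float)
    for start, end, speaker in segments:
        totals[speaker] += end - start
    return dict(totals)


totals = talk_time(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f} s")
```

A per-speaker total like this is one simple way to look at conversational dynamics, such as how evenly speaking time is shared in a meeting.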

1,944 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to accurately track and label different speakers in an audio stream as it happens, for tasks like live transcription or meeting analysis.

Not ideal if you only need to detect speech presence without differentiating between speakers, or if you're not comfortable with some command-line interaction.

audio-transcription meeting-analysis conversation-logging speech-processing podcast-production

Stale 6m

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 19 / 25

Stars 1,944

Forks 159

Language Python

License MIT

Related frameworks

felixbur/nkululeko

Machine learning speaker characteristics

claritychallenge/clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

astorfi/3D-convolutional-speaker-recognition

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

hitachi-speech/EEND

End-to-End Neural Diarization
