CouncilDataProject/speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.

44 / 100 (Emerging)

This project helps anyone working with multi-speaker audio recordings automatically identify who is speaking and when. You provide raw audio files; after a semi-automated annotation process, the system outputs audio segments labeled with each speaker's identity. It suits researchers, journalists, and anyone else who needs to analyze conversations in spoken media.

No commits in the last 6 months. Available on PyPI.

Use this if you have audio recordings with multiple known speakers and need a way to automatically label who said what and when.

Not ideal if you have recordings with a large number of unknown speakers, as it requires a dataset of known speakers for training.

audio-transcription speech-analysis media-analysis conversation-logging meeting-minutes
Stale 6m
Maintenance 0 / 25
Adoption 8 / 25
Maturity 25 / 25
Community 11 / 25
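
The four category scores above add up to the overall figure shown at the top of the page. The following is a minimal sketch of that arithmetic, assuming the overall score is a plain unweighted sum of the four categories; the page itself does not state the formula, and the variable names are illustrative only.

```python
# Category scores as listed on this page, each out of 25.
subscores = {
    "Maintenance": 0,
    "Adoption": 8,
    "Maturity": 25,
    "Community": 11,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "44 / 100"
```

Under this reading, the Maintenance score of 0 accounts for the largest share of the gap to 100, which matches the "Stale 6m" label.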


Stars: 60
Forks: 6
Language: Python
License: MIT
Last pushed: Dec 01, 2024
Commits (30d): 0
Dependencies: 12

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CouncilDataProject/speakerbox"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
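
For scripted access, the same request can be made from Python with the standard library. This is a minimal sketch, not a documented client. It assumes the URL path follows a category/owner/repo pattern (inferred only from the single curl example above) and that the response body is JSON, which the page does not confirm. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the three-segment layout is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send an unauthenticated GET and decode the body as JSON (assumed format)."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access.
url = quality_url("transformers", "CouncilDataProject", "speakerbox")
print(url)
```

Calling `fetch_quality("transformers", "CouncilDataProject", "speakerbox")` performs the actual request. Since keyless access is capped at 100 requests/day, caching the result locally is a reasonable precaution for repeated runs.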