HHousen/speaker-change-detection
Speaker change detection using SincNet and an LSTM/Transformer
This tool helps analyze audio recordings of conversations, meetings, or interviews. You provide an audio file, and it tells you precisely when a new person starts speaking. This is ideal for researchers, analysts, or anyone who needs to identify speaker turns without knowing who the speakers are.
No commits in the last 6 months.
Use this if you need to quickly find all the points in an audio recording where one speaker stops and another begins, without caring about the identity of the speakers.
Not ideal if you need to know *who* is speaking at any given time, or if you need to track individual speakers throughout an entire conversation (speaker diarization).
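To make "finding the points where one speaker stops and another begins" concrete, here is a minimal sketch of one common post-processing step: turning per-frame change scores into timestamps by picking peaks above a threshold. It is not taken from this repository. The function name `change_points`, the parameters `frame_step_s`, `threshold`, and `min_gap_s`, and the example scores are all hypothetical, and the repository's own pipeline may work differently.

```python
from typing import List


def change_points(
    scores: List[float],
    frame_step_s: float,
    threshold: float = 0.5,
    min_gap_s: float = 1.0,
) -> List[float]:
    """Return timestamps (seconds) of local score maxima at or above `threshold`.

    `scores` is assumed to hold one change score per frame, with frames
    spaced `frame_step_s` seconds apart (a hypothetical model output).
    """
    points: List[float] = []
    for i in range(1, len(scores) - 1):
        # A frame counts as a peak if it clears the threshold and is not
        # exceeded by either neighbour.
        is_peak = (
            scores[i] >= threshold
            and scores[i] > scores[i - 1]
            and scores[i] >= scores[i + 1]
        )
        if not is_peak:
            continue
        t = i * frame_step_s
        # Skip peaks that fall too close to the previously accepted one.
        if points and t - points[-1] < min_gap_s:
            continue
        points.append(t)
    return points


# Made-up scores for illustration only, one frame every 0.5 s.
example_scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.7, 0.8, 0.2, 0.1]
print(change_points(example_scores, frame_step_s=0.5))  # [1.0, 3.5]
```

The output is a list of turn boundaries only. Nothing in it says which speaker is on either side of a boundary, which is the difference from speaker diarization noted above.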
Stars: 57
Forks: 8
Language: Jupyter Notebook
License: GPL-3.0
Category:
Last pushed: May 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/HHousen/speaker-change-detection"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
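The same request can be made from a script. The sketch below uses only the Python standard library. The `category/owner/repo` path layout is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a category/owner/repo path layout."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(quality_url("transformers", "HHousen", "speaker-change-detection"))
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than calling `fetch_quality` on every run.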
Higher-rated alternatives
CouncilDataProject/speakerbox
Speakerbox: Fine-tune Audio Transformers for speaker identification.
CVxTz/music_genre_classification
Music genre classification: LSTM vs Transformer
palonso/MAEST
Pre-training, fine-tuning, and inference code with the MAEST models for music analysis applications.
icon-lab/HST
Official implementation of Hierarchical Spectrogram Transformers (HST)
aaronstevenwhite/spectrans
Modular spectral transformer implementations in PyTorch with Fourier, wavelet, and other...