GeorgeEfstathiadis/LLM-Diarize-ASR-Agnostic
Repository for the paper "LLM-based speaker diarization correction: A generalizable approach"
This project helps machine learning engineers and researchers improve the accuracy of speaker diarization in transcripts produced by automatic speech recognition (ASR). It takes raw ASR transcripts, optionally from services like AWS Transcribe or Google Speech-to-Text, along with a reference transcript, and outputs corrected speaker labels. It is aimed at people building speech processing applications where precise speaker identification is crucial.
No commits in the last 6 months.
Use this if you need to fine-tune a Large Language Model (LLM) to correct speaker diarization errors in ASR transcripts and evaluate its performance.
Not ideal if you are a non-developer looking for an out-of-the-box speaker diarization solution that requires no training or deployment of machine learning models.
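To give a rough sense of what evaluating corrected speaker labels against a reference can involve, here is a minimal sketch of a word-level speaker label accuracy. It is not the metric used in the paper or the repository; the function name `speaker_label_accuracy` and the assumption that both label sequences are already word-aligned are purely illustrative. Because diarization systems name speakers arbitrarily, the sketch scores the best one-to-one mapping between hypothesis and reference labels.

```python
from itertools import permutations


def speaker_label_accuracy(reference, hypothesis):
    """Fraction of words whose speaker label matches the reference under the
    best one-to-one mapping of hypothesis labels onto reference labels.

    Hypothetical helper: assumes both sequences hold one string label per
    word and are already word-aligned (equal length).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must be word-aligned and equal in length")
    if not reference:
        return 1.0

    ref_speakers = sorted(set(reference))
    hyp_speakers = sorted(set(hypothesis))

    # Pad the targets with None so every hypothesis speaker can be mapped,
    # even when the hypothesis contains more speakers than the reference.
    padding = max(0, len(hyp_speakers) - len(ref_speakers))
    targets = ref_speakers + [None] * padding

    best = 0
    # Brute force over mappings; fine for the handful of speakers in a
    # typical conversation, too slow for large speaker counts.
    for perm in permutations(targets, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, perm))
        correct = sum(mapping[h] == r for r, h in zip(reference, hypothesis))
        best = max(best, correct)
    return best / len(reference)


# The last word is attributed to the wrong speaker, so 4 of 5 words match.
reference = ["A", "A", "B", "B", "A"]
hypothesis = ["1", "1", "0", "0", "0"]
print(speaker_label_accuracy(reference, hypothesis))  # 0.8
```

Running the same function on the labels before and after LLM correction gives a simple before/after comparison, provided both transcripts are aligned to the same reference words.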
Stars
21
Forks
—
Language
Jupyter Notebook
License
—
Category
Last pushed
Jul 31, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/GeorgeEfstathiadis/LLM-Diarize-ASR-Agnostic"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
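The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `voice-ai` path segment is treated as a category slug, and the response is assumed to be JSON, which the listing does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example in this listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Assemble the endpoint URL, percent-encoding each path segment."""
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(segments)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("voice-ai", "GeorgeEfstathiadis", "LLM-Diarize-ASR-Agnostic")
print(url)

# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "GeorgeEfstathiadis", "LLM-Diarize-ASR-Agnostic")
```

Keeping URL construction separate from the network call makes the helper easy to check offline and keeps unauthenticated usage within the 100 requests/day limit during testing.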