whisperX and whisper-run
WhisperX is a mature, widely adopted framework that combines Whisper ASR with word-level alignment and speaker diarization. whisper-run is a lightweight wrapper around Faster Whisper that adds diarization, offering a simpler alternative for users who prioritize speed over the word-level timestamp precision that WhisperX provides.
About whisperX
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
This tool helps you accurately transcribe audio recordings, providing not just the words but also a precise timestamp for each word. It can also identify who is speaking at any given time, separating a conversation by speaker. It suits anyone who needs highly accurate transcripts for audio analysis, subtitling, or content review, such as researchers, journalists, or content creators.
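To illustrate how word-level timestamps and speaker turns can be combined, here is a minimal sketch that labels each word with the speaker whose turn overlaps it the most. It is not WhisperX's own code: the `Word` and `Turn` types, the `assign_speakers` function, and the sample data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """A transcribed word with start/end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """A diarization turn: one speaker talking over a time span (hypothetical type)."""
    speaker: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Tuple[str, Optional[str]]]:
    """Pair each word with the speaker whose turn overlaps it most, or None."""
    labelled = []
    for w in words:
        best_speaker = None
        best_overlap = 0.0
        for t in turns:
            o = overlap(w.start, w.end, t.start, t.end)
            if o > best_overlap:
                best_speaker, best_overlap = t.speaker, o
        labelled.append((w.text, best_speaker))
    return labelled


# Made-up sample data, for illustration only.
words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.2, 1.4)]
turns = [Turn("SPEAKER_00", 0.0, 1.0), Turn("SPEAKER_01", 1.0, 2.0)]
labelled = assign_speakers(words, turns)
```

Words that fall outside every turn keep a speaker of `None`, so downstream code can decide whether to drop them or attach them to a neighbouring speaker.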
About whisper-run
gorkemkaramolla/whisper-run
Faster Whisper with Speaker Diarization
Quickly and accurately transcribe audio recordings and identify who said what, even with multiple speakers. It takes an audio file as input and produces a JSON file that records the spoken text and the speaker for each segment. This is for anyone who needs to convert spoken content from interviews, meetings, or podcasts into a structured, readable format.
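A per-segment JSON transcript like the one described above is straightforward to post-process. The sketch below groups text by speaker and totals speaking time. The field names (`speaker`, `start`, `end`, `text`) and the sample data are assumptions made for illustration; whisper-run's actual output schema may differ, so check a real output file before relying on them.

```python
import json
from collections import defaultdict

# Assumed schema: a list of segments with speaker, start, end, and text fields.
# This is a hypothetical example, not output captured from whisper-run.
sample = """
[
  {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5, "text": "Welcome to the show."},
  {"speaker": "SPEAKER_01", "start": 2.5, "end": 4.0, "text": "Thanks for having me."},
  {"speaker": "SPEAKER_00", "start": 4.0, "end": 7.0, "text": "Let's get started."}
]
"""


def group_by_speaker(segments):
    """Collect each speaker's segment texts in the order they were spoken."""
    grouped = defaultdict(list)
    for seg in segments:
        grouped[seg["speaker"]].append(seg["text"].strip())
    return dict(grouped)


def speaking_time(segments):
    """Sum segment durations (in seconds) per speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


segments = json.loads(sample)
by_speaker = group_by_speaker(segments)
totals = speaking_time(segments)
```

To use this with a real file, replace `json.loads(sample)` with `json.load` on the opened output file and adjust the field names to match.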