jim60105/docker-whisperX
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
This tool transcribes audio recordings accurately, giving not just the words spoken but also each word's start and end time and a label for who spoke each part. You provide an audio file (such as an MP3), and it outputs a text transcription with detailed timing and speaker labels. This is ideal for researchers, journalists, or anyone needing precise transcripts for analysis or subtitling.
Use this if you need to convert audio into text with word-level timestamps and speaker identification.
Not ideal if you're looking for a simple, quick transcription without the need for speaker diarization or precise word-level timing, or if you prefer a web-based, no-setup solution.
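To make the "timing and speaker labels" output concrete, here is a minimal Python sketch that turns speaker-labelled segments into SRT subtitles. The segment layout (`start`, `end`, `speaker`, `text`, `words`) is an illustrative assumption modelled loosely on WhisperX-style JSON, not a documented schema of this image, and `srt_time` and `to_srt` are hypothetical helper names.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments as SRT cues, prefixing each cue with its speaker label."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker", "UNKNOWN")  # diarization may be absent
        timing = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n[{speaker}] {seg['text'].strip()}\n")
    return "\n".join(blocks)


# Assumed example data: two segments with word-level timings and speakers.
segments = [
    {
        "start": 0.5, "end": 2.0, "speaker": "SPEAKER_00", "text": "Hello there.",
        "words": [
            {"word": "Hello", "start": 0.5, "end": 0.9},
            {"word": "there.", "start": 1.0, "end": 2.0},
        ],
    },
    {
        "start": 2.4, "end": 3.75, "speaker": "SPEAKER_01", "text": "Hi!",
        "words": [{"word": "Hi!", "start": 2.4, "end": 3.75}],
    },
]

srt_text = to_srt(segments)
```

The sketch uses only segment-level times for the cues; the per-word `words` entries are where word-level timing would come in, for example to split long cues or highlight words as they are spoken.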
Stars: 422
Forks: 49
Language: Dockerfile
License: MIT
Category:
Last pushed: Mar 15, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jim60105/docker-whisperX"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
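The same request can be made from a script. The sketch below, in Python with only the standard library, rests on two assumptions: that the URL follows a `category/owner/repo` pattern (inferred from the single curl example above) and that the response body is JSON. How an API key is passed is not described here, so the sketch covers only the keyless case. `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path pattern is inferred from the curl example."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming the response is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "jim60105", "docker-whisperX")
```

With the keyless limit of 100 requests/day, it is worth caching the result locally if the script runs often.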
Related tools
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
tsmdt/whisply
💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker annotation and...
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
linto-ai/linto-stt
An automatic speech recognition API
linto-ai/linto-studio
Transcription and annotation interface for recorded audio or video files