jim60105/docker-whisperX

Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)

57 / 100 (Established)

This tool transcribes audio recordings, giving the start and end time of each word and labeling which speaker said which part. You provide an audio file (such as an MP3), and it outputs a text transcription with word-level timing and speaker labels. This is ideal for researchers, journalists, or anyone who needs precise transcripts for analysis or subtitling.

422 stars.

Use this if you need to convert audio into text with word-level timestamps and speaker identification.

Not ideal if you're looking for a simple, quick transcription without the need for speaker diarization or precise word-level timing, or if you prefer a web-based, no-setup solution.

audio-transcription media-analysis qualitative-research meeting-minutes subtitling
No Package, No Dependents
Maintenance 13 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 422
Forks: 49
Language: Dockerfile
License: MIT
Last pushed: Mar 15, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jim60105/docker-whisperX"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
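
The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command, using only the standard library. It assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred from the single URL above and not confirmed by this page. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, as in the one
    example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumption: the response body is JSON; its fields are not documented
    here, so the parsed object is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "jim60105", "docker-whisperX")` issues the same GET request as the curl command. Because unauthenticated use is limited to 100 requests/day, a script that polls this endpoint should cache the result instead of refetching it.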