pavelzbornik/whisperX-FastAPI
FastAPI service on top of WhisperX
This tool helps convert audio and video recordings into text transcripts, identifying different speakers and aligning the text with the audio. You provide an audio or video file (like an interview, meeting recording, or lecture) and receive a detailed text output, making it easier to analyze spoken content. Anyone who needs to extract written information from spoken content, such as journalists, researchers, or content creators, would find this useful.
Use this if you regularly need accurate, speaker-separated transcripts from audio or video files and want to automate this process.
Not ideal if you only need occasional, simple transcriptions without speaker identification or precise timing, as it requires a specific technical setup.
Stars: 174
Forks: 58
Language: Python
License: MIT
Category:
Last pushed: Mar 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/pavelzbornik/whisperX-FastAPI"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
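The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns JSON and that the path segments after `/quality/` are a category, a repository owner, and a repository name, which is inferred from the single URL shown above rather than from any documented schema. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble a URL in the same shape as the curl example.

    The category/owner/repo layout is an assumption inferred from that
    one example URL, not a documented contract.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Perform an unauthenticated GET and decode the body as JSON.

    Assumes a JSON response; the response fields are not described on
    this page, so the decoded object is returned without interpretation.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "pavelzbornik", "whisperX-FastAPI")` would issue the same request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so a script that polls should cache the result.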
Related tools
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Kieirra/murmure
Fully local, private and cross platform Speech-to-Text with LLM Post-processing
Softcatala/whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
royshil/obs-localvocal
OBS plugin for local speech recognition and captioning using AI
kurianbenoy/whisper_normalizer
A python package for whisper normalizer