pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

61 / 100 (Established)

This tool helps convert audio and video recordings into text transcripts, identifying different speakers and aligning the text with the audio. You provide an audio or video file (like an interview, meeting recording, or lecture) and receive a detailed text output, making it easier to analyze spoken content. Anyone who needs to extract written information from spoken content, such as journalists, researchers, or content creators, would find this useful.

Use this if you regularly need accurate, speaker-separated transcripts from audio or video files and want to automate this process.

Not ideal if you only need occasional, simple transcriptions without speaker identification or precise timing, as it requires a specific technical setup.

transcription audio-analysis video-processing speaker-diarization content-analysis
No Package · No Dependents
Maintenance 13 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25
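
The four category scores listed here add up to the overall score shown at the top: 13 + 10 + 16 + 22 = 61. The short Python sketch below reproduces that check. It assumes the overall score is a plain, unweighted sum of the four categories. The page does not state the formula, so treat this as an illustration consistent with the numbers rather than the site's actual method.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 13,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 22,
}

# Assumption: the overall score is the unweighted sum of the four categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")
```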

Stars: 174
Forks: 58
Language: Python
License: MIT
Last pushed: Mar 17, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/pavelzbornik/whisperX-FastAPI"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.