tsmdt/whisply
💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker annotation and subtitle generation using OpenAI’s Whisper on CPU, Nvidia GPU and Apple MLX.
This tool converts audio and video files into text. You provide your media files, and it generates transcriptions, translations, speaker annotations, and subtitles. It is designed for people who process many recordings, such as researchers, podcasters, and content creators, and who want that content to be more accessible and searchable.
108 stars. Available on PyPI.
Use this if you need a fast, reliable way to transcribe, translate, or subtitle a large batch of audio or video files, including identifying different speakers.
Not ideal if you only need occasional, single-file transcription and prefer a web-based service over installing software.
Stars: 108
Forks: 16
Language: Python
License: MIT
Category:
Last pushed: Mar 18, 2026
Commits (30d): 0
Dependencies: 17
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/tsmdt/whisply"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
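The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single curl URL above, and the response is returned as raw text because its format is not documented here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as inferred from
    the one example URL shown on this page.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text.

    The body is not parsed, since the response schema is not stated here.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_quality("voice-ai", "tsmdt", "whisply")` would issue the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is worth considering when polling many repositories.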
Related tools
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
jim60105/docker-whisperX
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker...
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
linto-ai/linto-stt
An automatic speech recognition API
linto-ai/linto-studio
Transcription and annotation interface for recorded audio or video files