jianchang512/stt
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output as JSON, SRT subtitles, or plain text
This tool helps you convert spoken words from audio or video files into written text. You simply upload your media, choose the language and desired output format (like plain text, SRT subtitles with timestamps, or JSON), and it generates the transcript. This is perfect for content creators, transcribers, or anyone needing to quickly document spoken content from media without relying on online services.
4,331 stars.
Use this if you need to quickly and accurately transcribe audio or video content into text, subtitles, or a structured data format while keeping your data offline.
Not ideal if you require real-time transcription for live events or need advanced speaker diarization features.
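To make the SRT output format mentioned above concrete, here is a minimal sketch of how timestamped transcript segments map to SRT text. It is not taken from the stt code base: the function names and the segment shape (dicts with `start`, `end` in seconds, and `text`) are assumptions made for illustration only.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments as SRT cues.

    `segments` is a hypothetical list of dicts with 'start', 'end'
    (seconds) and 'text' keys; the real tool's data layout may differ.
    """
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Each cue consists of a running index, a start and end time joined by `-->`, and the text. The same segment list could also be serialized as JSON or joined into plain text.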
Stars: 4,331
Forks: 463
Language: Python
License: GPL-3.0
Category:
Last pushed: Jan 22, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jianchang512/stt"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
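The same request can be made from Python using only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single curl example), and that the response body is JSON. The function names are illustrative.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path pattern is inferred from the curl example."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (assumes a JSON response body)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "jianchang512", "stt")` requests the same URL as the curl command. Without a key, the 100 requests/day limit applies.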
Related tools
cyberofficial/Synthalingua
Synthalingua - Real Time Translation
Jaymon/transcribe
Convert images or audio files to plain text on the command line
developers-cosmos/Mimasa
Real time multilingual face translator
lperezmo/real-time-translator
A quick app to translate speech in real time using the Whisper API for transcribing audio,...
book000/audio-transcriber-docker
Automatically transcribe the audio of video / audio files using Speech Recognition.