alesaccoia/VoiceStreamAI
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
This project provides near-realtime transcription of live audio streams. It takes live spoken audio, streams it over WebSocket to a self-hosted Whisper speech recognition model, and outputs text as people speak. It is aimed at anyone who needs to convert speech to text during a live conversation or presentation, such as transcribers, journalists, or meeting facilitators.
950 stars. No commits in the last 6 months.
Use this if you need to transcribe live spoken audio directly from a microphone into text with minimal delay, for applications like live captioning or interactive voice assistants.
Not ideal if you primarily need to transcribe pre-recorded audio files or if you require extensive speaker diarization (identifying who spoke when).
Stars: 950
Forks: 142
Language: Python
License: MIT
Category
Last pushed: Oct 02, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/alesaccoia/VoiceStreamAI"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
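The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the URL path follows the pattern category/owner/repository (inferred from the single curl example above) and that the endpoint returns a JSON body; neither is confirmed by this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, as the curl
    example suggests. Each segment is percent-encoded defensively.
    """
    parts = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the response body is JSON. The field names are not
    documented on this page, so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the network.
    print(build_quality_url("voice-ai", "alesaccoia", "VoiceStreamAI"))
```

Because unauthenticated access is limited to 100 requests/day, it is worth caching the result of `fetch_quality` rather than calling it on every page load or script run.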
Higher-rated alternatives
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Kieirra/murmure
Fully local, private and cross platform Speech-to-Text with LLM Post-processing
Softcatala/whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
pavelzbornik/whisperX-FastAPI
FastAPI service on top of WhisperX
royshil/obs-localvocal
OBS plugin for local speech recognition and captioning using AI