FireRedTeam/FireRedASR2S
A state-of-the-art, industrial-grade, all-in-one ASR system with ASR, VAD, LID, and Punc (punctuation) modules. FireRedASR2 supports Chinese (Mandarin and 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ languages. FireRedLID supports 100+ languages and 20+ Chinese dialects. FireRedPunc supports Chinese and English.
This system helps professionals accurately transcribe spoken audio, including both speech and singing, into text. It takes audio files in various languages and Chinese dialects and outputs precise text transcriptions, often with punctuation, language identification, and speech/music segmentation. This is ideal for content creators, researchers, and anyone needing detailed, accurate text from audio recordings.
Use this if you need highly accurate, industrial-grade transcriptions from diverse audio, especially for Chinese (Mandarin and dialects), English, or code-switching, including word-level timestamps and confidence scores.
Not ideal if you only need to process a single, common language and do not need advanced features such as dialect identification or singing transcription.
Stars: 365
Forks: 20
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FireRedTeam/FireRedASR2S"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
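The same request can be made from Python. The sketch below is a minimal example using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the `category/owner/repo` path pattern is an assumption generalized from the single curl example above. The response format is not stated on this page, so the body is returned as raw text rather than parsed.

```python
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path pattern is <category>/<owner>/<repo>, inferred
    from the one documented example.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the record and return the response body as text (hypothetical helper).

    No API key is sent, matching the keyless tier described above.
    The body is not parsed because its format is not documented here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("voice-ai", "FireRedTeam", "FireRedASR2S"))
```

With the keyless tier limited to 100 requests/day, it is probably worth caching the returned text locally rather than calling `fetch_quality` repeatedly for the same repository.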
Higher-rated alternatives
meizhong986/WhisperJAV
ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD. Noise-robust for JAV
itsmevictor/clean-transcribe
A simple CLI to transcribe Youtube videos or local audio/video files and produce LLM-cleaned...
vivekuppal/transcribe
Transcribe is a real time transcription, conversation, Language learning platform. It provides...
BryceWG/BiBi-Keyboard
说点啥 (BiBi Keyboard): a Kotlin-based LLM and ASR voice input method keyboard app for Android. An LLM ASR voice input method...
sindresorhus/awesome-whisper
🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI