forfrt/SteerMoE
SteerMoE: Efficient Audio-Language Models with Preserved Reasoning Capabilities
SteerMoE helps you build AI models that can understand both spoken language and written text, while keeping the advanced reasoning abilities of large language models intact. You provide audio recordings and text prompts, and it generates text outputs like transcriptions, answers to questions about the audio, or complex textual reasoning. This is for AI practitioners, researchers, or developers who want to create powerful multi-modal AI applications without compromising language model performance.
Use this if you need an AI model that can accurately process speech and answer questions about audio, while fully preserving the sophisticated reasoning, text generation, and coding abilities of a large language model.
Not ideal if your primary goal is only basic speech transcription or if you are comfortable with your language model's reasoning capabilities being degraded after audio fine-tuning.
Stars: 9
Forks: 1
Language: Python
License: MIT
Category
Last pushed: Mar 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/forfrt/SteerMoE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
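The curl call above can also be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not state, and the helper names `build_quality_url` and `fetch_quality` are hypothetical, introduced only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the request URL for a category and an "owner/name" repository."""
    # Keep the slash between owner and name; percent-encode anything unusual.
    quoted_category = urllib.parse.quote(category)
    quoted_repo = urllib.parse.quote(repo, safe="/")
    return f"{BASE_URL}/{quoted_category}/{quoted_repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the response body is JSON. The page does not document
    the response format, so inspect the result before relying on any field.
    """
    request = urllib.request.Request(
        build_quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "forfrt/SteerMoE")` requests the same URL as the curl command. With no key, the page states a limit of 100 requests per day, so cache results rather than polling.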
Higher-rated alternatives
Spr-Aachen/Easy-Voice-Toolkit
A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.
PrzemyslawSwiderski/python-gradle-plugin
Gradle plugin to run Python projects.
alphacep/awesome-russian-speech
Russian speech technology links
ftyers/commonvoice-utils
Linguistic processing for Common Voice
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech