OpenMOSS/MOSS-Speech
MOSS-Speech is a true speech-to-speech large language model without text guidance.
This project helps create direct, natural voice-to-voice interactions for spoken applications. You provide spoken input, and it responds directly with spoken output, without ever converting to text in between. It's designed for anyone building interactive voice assistants, dialogue systems, or real-time spoken translation tools.
Use this if you need a speech-to-speech system that offers more natural conversations and avoids the limitations of text-based processing.
Not ideal if your workflow specifically requires a text transcript of the spoken input or output, or if you need to perform text-based analysis.
Stars: 127
Forks: 7
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/OpenMOSS/MOSS-Speech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
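The same request can be made from a script. The sketch below uses only the Python standard library and rebuilds the URL from the curl example above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so treat the parsing step as illustrative.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is
# category / owner / repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example."""
    # Percent-encode each path segment so unusual characters stay safe.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository without an API key.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the caller should inspect the raw text.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(build_quality_url("voice-ai", "OpenMOSS", "MOSS-Speech"))
```

Calling `fetch_quality("voice-ai", "OpenMOSS", "MOSS-Speech")` issues one unauthenticated request, which counts against the 100 requests/day limit.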
Higher-rated alternatives
travisvn/chatterbox-tts-api
Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate...
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment...
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
sfortis/openai_tts
Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible...
OpenMOSS/MOSS-TTSD
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis....