OpenMOSS/MOSS-TTS

The MOSS-TTS Family is an open-source family of speech and sound generation models from MOSI.AI and the OpenMOSS team. It targets high-fidelity, highly expressive generation in complex real-world scenarios, covering stable long-form speech, multi-speaker dialogue, voice/character design, environmental sound effects, and real-time streaming TTS.

Score: 55 / 100 (Established)

The MOSS-TTS Family generates realistic, expressive speech and sound effects from text. You provide text and receive audio that can cover multiple speakers, newly designed voices, or environmental sounds. It suits content creators, game developers, virtual assistant designers, and anyone else who needs advanced audio generation.

922 stars. Actively maintained with 16 commits in the last 30 days.

Use this if you need high-fidelity, expressive speech for long-form content, multi-speaker dialogue, or real-time interaction, or if you want to design character voices and sound effects.

Not ideal if you only need basic, simple text-to-speech without advanced expressiveness, voice design, or multi-speaker capabilities.

content-creation voice-acting game-audio virtual-assistants audio-production
No package, no dependents
Maintenance 17 / 25
Adoption 10 / 25
Maturity 11 / 25
Community 17 / 25


Stars: 922
Forks: 82
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 16

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/OpenMOSS/MOSS-TTS"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
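
The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the single URL above, and the assumption that the endpoint returns JSON is unverified, so inspect the response before relying on any field.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository.

    Assumption: the three trailing path segments are category, owner,
    and repo, as suggested by .../voice-ai/OpenMOSS/MOSS-TTS.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality data for one repository.

    Assumption: the response body is JSON. The response schema is not
    documented here, so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("voice-ai", "OpenMOSS", "MOSS-TTS")
# print(data)
```

Because unauthenticated access is limited to 100 requests/day, a script that polls many repositories would likely want to cache responses locally rather than refetch them.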