MOSS-TTSD and MOSS-TTS
MOSS-TTSD is a specialized extension of MOSS-TTS that adds dialogue-specific capabilities (long-context modeling, multi-speaker synthesis) on top of the core TTS functionality. The two are ecosystem siblings: MOSS-TTSD builds on MOSS-TTS for conversational applications.
About MOSS-TTSD
OpenMOSS/MOSS-TTSD
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enabling zero-shot voice cloning from short audio references.
This project helps content creators transform dialogue scripts into dynamic, expressive spoken conversations with multiple distinct speakers. You provide a script and a short audio reference for each speaker, and it generates natural-sounding, long-form spoken dialogue of up to 60 minutes. It is well suited to producers of podcasts, audiobooks, commentary, and dubbed content.
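The script-plus-references workflow above can be sketched as a small input-assembly step. The helper below is illustrative only: the field names, the `[S1]`/`[S2]` speaker-tag convention, and the file paths are assumptions for this sketch, not MOSS-TTSD's documented schema — consult the repository for the actual input format.

```python
import json

def build_dialogue_item(turns, ref_audios):
    """Assemble one dialogue item as a JSON-serializable dict.

    `turns` is a list of (speaker_index, text) pairs, flattened into a
    single tagged string like "[S1]...[S2]...". One short reference clip
    per speaker is attached to support zero-shot voice cloning.
    All field names and tags here are hypothetical, for illustration.
    """
    text = "".join(f"[S{idx}]{utterance}" for idx, utterance in turns)
    item = {"text": text}
    for i, path in enumerate(ref_audios, start=1):
        item[f"prompt_audio_speaker{i}"] = path
    return item

item = build_dialogue_item(
    [(1, "Welcome back to the show."), (2, "Thanks, great to be here.")],
    ["refs/host.wav", "refs/guest.wav"],  # hypothetical reference clips
)
print(json.dumps(item, indent=2))
```

A real pipeline would write one such item per line of a JSONL file and hand it to the model's inference script, with each reference clip being a few seconds of clean audio from the target speaker.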
About MOSS-TTS
OpenMOSS/MOSS-TTS
The MOSS-TTS Family is an open-source family of speech and sound generation models from MOSI.AI and the OpenMOSS team. It targets high-fidelity, highly expressive synthesis in complex real-world scenarios, covering stable long-form speech, multi-speaker dialogue, voice and character design, environmental sound effects, and real-time streaming TTS.
The MOSS-TTS Family helps you create realistic, expressive speech and sound effects from text. You provide text and receive high-quality audio that sounds like a real person, handles multiple speakers, and can even generate unique voices or environmental sounds. It suits content creators, game developers, virtual assistant designers, and anyone needing advanced audio generation.