nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
This project helps creators and developers transform written dialogue into natural-sounding speech. You input a script with speaker tags and desired non-verbal cues (like laughter), and it generates realistic audio. It's designed for content creators, game developers, or anyone needing high-quality, expressive voiceovers for multi-speaker content.
19,202 stars.
Use this if you need to quickly generate realistic, multi-speaker dialogue with emotional nuances and non-verbal sounds from text.
Not ideal if you need to generate very short (under 5 seconds) or very long (over 20 seconds) audio segments, as this can lead to unnatural speech.
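As an illustration of the input described above, here is a minimal sketch of assembling a multi-speaker script as a single string. The `[S1]`/`[S2]` speaker-tag syntax and the parenthesised cue syntax such as `(laughs)` are assumptions not stated on this page, and `build_script` is a hypothetical helper, not part of the project. Check the project's own README for the exact format it expects.

```python
from typing import Iterable, Optional, Tuple

# One dialogue turn: (speaker number, spoken text, optional non-verbal cue).
Turn = Tuple[int, str, Optional[str]]


def build_script(turns: Iterable[Turn]) -> str:
    """Join dialogue turns into one tagged script string.

    Assumed (not confirmed by this page): speakers are tagged as [S1], [S2], ...
    and non-verbal cues are written in parentheses after the line.
    """
    parts = []
    for speaker, text, cue in turns:
        line = f"[S{speaker}] {text}"
        if cue:
            line += f" ({cue})"
        parts.append(line)
    return " ".join(parts)


script = build_script([
    (1, "Did you hear the demo?", None),
    (2, "I did, it sounded real.", "laughs"),
])
print(script)
```

Keeping the script short enough to fall inside the 5 to 20 second range mentioned above is left to the caller; this sketch does not estimate audio duration.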
Stars: 19,202
Forks: 1,683
Language: Python
License: Apache-2.0
Category:
Last pushed: Nov 19, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/nari-labs/dia"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
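The same request can be made from Python. The sketch below uses only the standard library and assumes the endpoint returns a JSON body; the response schema is not documented on this page, so the result is treated as opaque. The `category/owner/repo` path layout is inferred from the curl example above, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON; raises urllib.error.URLError on
    network failure and json.JSONDecodeError if the body is not JSON.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("voice-ai", "nari-labs", "dia")` issues the same GET request as the curl line. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling on every run.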
Related tools
devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible...
jamiepine/voicebox
The open-source voice synthesis studio
daswer123/xtts-api-server
A simple FastAPI Server to run XTTSv2
Aivis-Project/AivisSpeech-Engine
AivisSpeech Engine: AI Voice Imitation System - Text to Speech Engine
jianchang512/ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also supports exposing an API for external use.