nari-labs/dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Overall score: 50 / 100 (Established)

This project helps creators and developers transform written dialogue into natural-sounding speech. You input a script with speaker tags and desired non-verbal cues (like laughter), and it generates realistic audio. It's designed for content creators, game developers, or anyone needing high-quality, expressive voiceovers for multi-speaker content.
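As a rough illustration of what "a script with speaker tags and non-verbal cues" might look like, the sketch below assembles one from structured lines. The `[S1]`/`[S2]` tag syntax and the parenthesized cue form such as `(laughs)` are assumptions about the model's input convention and should be checked against the repository's README; `DialogueLine` and `build_script` are hypothetical helper names, not part of the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DialogueLine:
    """One turn of dialogue (hypothetical helper type)."""
    speaker: int                 # 1-based speaker index
    text: str                    # what the speaker says
    cue: Optional[str] = None    # optional non-verbal cue, e.g. "laughs"


def build_script(lines: List[DialogueLine]) -> str:
    """Join turns into a single tagged script string.

    Assumes the "[S<n>] text (cue)" convention; verify against the README.
    """
    parts = []
    for line in lines:
        segment = f"[S{line.speaker}] {line.text.strip()}"
        if line.cue:
            segment += f" ({line.cue})"
        parts.append(segment)
    return " ".join(parts)


script = build_script([
    DialogueLine(1, "Did you hear the demo?"),
    DialogueLine(2, "I did, it sounded real.", cue="laughs"),
])
print(script)
```

The resulting string is what would be handed to the model's text-to-speech call; the call itself is left out here because its exact signature is not given on this page.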

19,202 stars.

Use this if you need to quickly generate realistic, multi-speaker dialogue with emotional nuances and non-verbal sounds from text.

Not ideal if you need to generate very short (under 5 seconds) or very long (over 20 seconds) audio segments, as this can lead to unnatural speech.
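One way to act on the 5 to 20 second guidance is to estimate the spoken length of a script before generating audio. The sketch below is only a heuristic: the speaking rate of 2.5 words per second is an assumption (not a figure from the project), the tag and cue patterns follow the assumed `[S1]` and `(laughs)` conventions, and `estimate_seconds` and `length_check` are hypothetical names.

```python
import re

WORDS_PER_SECOND = 2.5   # assumed average speaking rate (about 150 words/minute)
MIN_SECONDS = 5.0        # lower bound from the guidance above
MAX_SECONDS = 20.0       # upper bound from the guidance above

# Matches assumed speaker tags like [S1] and parenthesized cues like (laughs).
_NON_SPOKEN = re.compile(r"\[S\d+\]|\([^)]*\)")


def estimate_seconds(script: str) -> float:
    """Estimate audio length from the number of spoken words."""
    spoken = _NON_SPOKEN.sub(" ", script)
    return len(spoken.split()) / WORDS_PER_SECOND


def length_check(script: str) -> str:
    """Classify a script as 'too short', 'too long', or 'ok'."""
    seconds = estimate_seconds(script)
    if seconds < MIN_SECONDS:
        return "too short"
    if seconds > MAX_SECONDS:
        return "too long"
    return "ok"
```

A script flagged as too long could be split at speaker boundaries and generated in several passes; a script flagged as too short could be merged with a neighboring turn.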

voiceover content-creation audio-production game-development narration
No package, no dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 15 / 25
Community 19 / 25


Stars: 19,202
Forks: 1,683
Language: Python
License: Apache-2.0
Last pushed: Nov 19, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/nari-labs/dia"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
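The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path generalizes as category/owner/repo (only the single URL in the curl command is given). `build_quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example (performs a network request and counts against the daily limit):
# data = fetch_quality("voice-ai", "nari-labs", "dia")
```

With a limit of 100 keyless requests per day, caching the response locally is worth considering if the data is polled regularly.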