Zero-Shot Voice Synthesis Voice AI Tools

Tools for synthesizing speech with zero-shot or few-shot learning, enabling speaker cloning, emotion control, style transfer, and voice conversion without extensive training data. Does NOT include general text-to-speech engines, ASR systems, or non-zero-shot voice synthesis approaches.

There are 53 zero-shot voice synthesis tools tracked. 4 score above 50 (established tier). The highest-rated is index-tts/index-tts at 63/100 with 19,454 stars. 2 of the top 10 are actively maintained.

Get all 53 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=zero-shot-voice-synthesis&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
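
The curl request above can also be issued from a script. The following is a minimal sketch using only the Python standard library. The endpoint and query parameters are copied from the curl line; the helper names `build_url` and `fetch` are hypothetical, and the shape of the JSON response is not documented here, so the sketch makes no assumptions about its fields. Note that the sample URL passes `limit=20`; whether a larger `limit` returns all 53 projects in one call is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit):
    """Assemble the dataset URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=30):
    """Perform the GET request and decode the JSON body (schema not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Reproduces the URL from the curl example.
URL = build_url("voice-ai", "zero-shot-voice-synthesis", 20)
```

Calling `fetch(URL)` performs the request. Without a key, each call counts against the 100 requests/day allowance, so caching the result locally is a reasonable precaution.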

# | Tool | Description | Score | Tier
1 | index-tts/index-tts | An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | 63 | Established
2 | stepfun-ai/Step-Audio-EditX | A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model... | 54 | Established
3 | lucasnewman/f5-tts-mlx | Implementation of F5-TTS in MLX | 54 | Established
4 | unilight/seq2seq-vc | A sequence-to-sequence voice conversion toolkit. | 53 | Established
5 | FireRedTeam/FireRedTTS | An Open-Sourced LLM-empowered Foundation TTS System | 46 | Emerging
6 | RaduBolbo/F5-TTS-Emotional-CFG | Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class... | 45 | Emerging
7 | ubisoft/ubisoft-laforge-daft-exprt | Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis | 45 | Emerging
8 | Kyubyong/cross_vc | Cross-lingual Voice Conversion | 45 | Emerging
9 | Edresson/YourTTS | YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion... | 45 | Emerging
10 | lucasnewman/f5-tts-swift | Implementation of F5-TTS in Swift using MLX | 44 | Emerging
11 | JosefAlbers/e2tts-mlx | Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX | 44 | Emerging
12 | hi-paris/Prosody-Control-French-TTS | An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control | 43 | Emerging
13 | keonlee9420/Cross-Speaker-Emotion-Transfer | PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based... | 43 | Emerging
14 | WangHelin1997/SSR-Speech | SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis | 41 | Emerging
15 | Emotional-Text-to-Speech/hmm-for-emo-tts | A repository with comprehensive instructions for using the... | 41 | Emerging
16 | keonlee9420/Robust_Fine_Grained_Prosody_Control | PyTorch Implementation of Robust and fine-grained prosody control of... | 40 | Emerging
17 | adelacvg/ttts | Train the next generation of TTS systems. | 40 | Emerging
18 | uetuluk/xcodec2-infer-lib | CPU support for xcodec2 | 39 | Emerging
19 | aiola-lab/drax | Drax: Speech Recognition with Discrete Flow Matching | 38 | Emerging
20 | hcy71o/SC-CNN | SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker... | 38 | Emerging
21 | lucasnewman/descript-mlx | Implementation of the Descript Audio Codec in MLX | 37 | Emerging
22 | WelkinYang/Learn2Sing2.0 | Diffusion and Mutual Information-Based Target Speaker SVS by Learning from... | 35 | Emerging
23 | ddlBoJack/MT4SSL | [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL:... | 33 | Emerging
24 | NN-Project-2/Emotion-TTS-Emebddings | This project explores zero-shot emotional speech synthesis using EMOD, a... | 32 | Emerging
25 | ictnlp/ComSpeech | Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct... | 31 | Emerging
26 | rishikksh20/Zero-Shot-TTS | Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based... | 31 | Emerging
27 | adelacvg/detail_tts | All generative model in one for better TTS model | 30 | Emerging
28 | lordzuko/cross-text-PT | Improving the Appropriateness in Cross-Text Prosody Transfer using Human Supervision | 30 | Emerging
29 | CMsmartvoice/Unet-TTS | One-shot TTS with Improved Unseen Speaker and Style Transfer | 30 | Emerging
30 | zhenye234/FlashSpeech | ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis | 29 | Experimental
31 | fmiotello/fastVC | A simple voice conversion tool | 29 | Experimental
32 | xuan3986/UDDETTS | The first LLM that unifies discrete and dimensional emotions for... | 29 | Experimental
33 | jishengpeng/ControlSpeech | [ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker... | 29 | Experimental
34 | ORI-Muchim/Grad-TTS | 'Grad-TTS' with Multilingual Cleaners | 28 | Experimental
35 | WelkinYang/EMPHASIS-pytorch | EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System | 28 | Experimental
36 | jzmzhong/Automatic-Prosody-Annotator-with-SSWP-CLAP | An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). | 27 | Experimental
37 | Rumeysakeskin/Turkish-Text-to-Speech | Speech synthesis (TTS) in low-resource languages by training from scratch... | 27 | Experimental
38 | MotivationalSpeechSynthesis/motivational-speech-synthesis | Artistic research deconstructing the performative excess of motivational... | 24 | Experimental
39 | NassimaOULDOUALI/Prosody-Control-French-TTS | An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control | 23 | Experimental
40 | adelacvg/DPTTS | An AR+AR TTS attempt. | 23 | Experimental
41 | the-bird-F/Expressive-Vectors | [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal... | 23 | Experimental
42 | wenhuahuo/Cross-Device-Acoustic-Communication-Python-Implementation | Digital acoustic communication tools using QFSK and Convolutional Encode. Cross-device acoustic communication. | 21 | Experimental
43 | Wonbin-Jung/e3-vits | Official GitHub page of E3-VITS | 21 | Experimental
44 | 01Zhangbw/Awesome-Expressive-speech-synthesis | This is a summary of Expressive speech synthesis papers. Now update: 13 May. | 14 | Experimental
45 | nipponjo/tts-german-pytorch | 🎙️ German TTS (FastPitch) with Thorsten voice / emotional | 13 | Experimental
46 | ZET-Speech/ZET-Speech-Demo | ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis... | 13 | Experimental
47 | rendchevi/daisy-tts | 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding... | 13 | Experimental
48 | maum-ai/sane-tts | SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech | 13 | Experimental
49 | morelen17/tts-papers | List of papers about TTS | 13 | Experimental
50 | yqli2420/speech_synthesis_and_speech_recognition_papers | tts papers: http://yqli.tech/page/tts_paper.html | 12 | Experimental
51 | sungjae-cho/ICASSP2020_STDemo | Show and Tell demonstration homepage | 11 | Experimental
52 | deepbrainai-research/stableformtts | Project page for StableForm-TTS: Improving Robustness of Diffusion-Based... | 11 | Experimental
53 | p-hennel/F5-TTS-MLX | Using F5-TTS and MLX for long-form text-to-speech. | 10 | Experimental
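
The tier labels in the listing appear to follow score cutoffs. The sketch below is one way to reproduce them, under stated assumptions: the summary says scores above 50 are in the established tier, while the 30-point boundary between Emerging and Experimental is only inferred from the rows above (30 is listed as Emerging, 29 as Experimental). The function name `tier_for` is hypothetical, and the site's actual rule may differ, particularly for scores between 47 and 50, which do not occur in this listing.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier label.

    Assumed cutoffs: > 50 is stated in the page summary; >= 30 is inferred
    from the listed rows and may not match the site's real rule.
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (score, tier) pairs copied from the listing, used as a consistency check.
listed = [(63, "Established"), (53, "Established"), (46, "Emerging"),
          (30, "Emerging"), (29, "Experimental"), (10, "Experimental")]
mismatches = [(s, t) for s, t in listed if tier_for(s) != t]
```

An empty `mismatches` list means the assumed cutoffs agree with the sampled rows; it does not confirm how the scores themselves are computed.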

Comparisons in this category