Zero-Shot Voice Synthesis Tools (Voice AI)

Tools for synthesizing speech with zero-shot or few-shot learning, enabling speaker cloning, emotion control, style transfer, and voice conversion without extensive training data. This category does not include general text-to-speech engines, ASR systems, or voice synthesis approaches that are not zero-shot.

This category tracks 53 zero-shot voice synthesis tools. Four score above 50 (Established tier). The highest-rated is index-tts/index-tts at 63/100 with 19,454 stars. Two of the top 10 are actively maintained.

Get the projects as JSON (53 in total; the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=zero-shot-voice-synthesis&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
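The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, the only query parameters used are the three that appear in the curl example, and the response schema is not documented on this page, so the decoded JSON is returned unchanged.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai", subcategory="zero-shot-voice-synthesis", limit=20):
    """Build the query URL from the three parameters seen in the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    Assumption: the endpoint returns a JSON body. Its exact shape
    (list or object) is unknown, so the value is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` would issue the same request as the curl command. Whether a larger `limit` (for example 53) is accepted is an assumption to verify against the API itself.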

#	Tool	Score	Tier	Stars	Language
1	index-tts/index-tts An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System	63	Established	19,454	Python
2	stepfun-ai/Step-Audio-EditX A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model...	54	Established	884	Python
3	lucasnewman/f5-tts-mlx Implementation of F5-TTS in MLX	54	Established	611	Python
4	unilight/seq2seq-vc A sequence-to-sequence voice conversion toolkit.	53	Established	108	Jupyter Notebook
5	FireRedTeam/FireRedTTS An Open-Sourced LLM-empowered Foundation TTS System	46	Emerging	905	Python
6	RaduBolbo/F5-TTS-Emotional-CFG Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class...	45	Emerging	30	Python
7	ubisoft/ubisoft-laforge-daft-exprt Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis	45	Emerging	129	Python
8	Kyubyong/cross_vc Cross-lingual Voice Conversion	45	Emerging	97	Python
9	Edresson/YourTTS YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion...	45	Emerging	1,052	Jupyter Notebook
10	lucasnewman/f5-tts-swift Implementation of F5-TTS in Swift using MLX	44	Emerging	91	Swift
11	JosefAlbers/e2tts-mlx Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX	44	Emerging	29	Python
12	hi-paris/Prosody-Control-French-TTS An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control	43	Emerging	31	Python
13	keonlee9420/Cross-Speaker-Emotion-Transfer PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based...	43	Emerging	194	Python
14	WangHelin1997/SSR-Speech SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis	41	Emerging	147	Python
15	Emotional-Text-to-Speech/hmm-for-emo-tts A repository with comprehensive instructions for using the...	41	Emerging	50	CSS
16	keonlee9420/Robust_Fine_Grained_Prosody_Control PyTorch Implementation of Robust and fine-grained prosody control of...	40	Emerging	41	Python
17	adelacvg/ttts Train the next generation of TTS systems.	40	Emerging	171	Python
18	uetuluk/xcodec2-infer-lib CPU support for xcodec2	39	Emerging	6	Python
19	aiola-lab/drax Drax: Speech Recognition with Discrete Flow Matching	38	Emerging	75	Python
20	hcy71o/SC-CNN SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker...	38	Emerging	39	Python
21	lucasnewman/descript-mlx Implementation of the Descript Audio Codec in MLX	37	Emerging	10	Python
22	WelkinYang/Learn2Sing2.0 Diffusion and Mutual Information-Based Target Speaker SVS by Learning from...	35	Emerging	181	JavaScript
23	ddlBoJack/MT4SSL [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL:...	33	Emerging	45	Python
24	NN-Project-2/Emotion-TTS-Emebddings This project explores zero-shot emotional speech synthesis using EMOD, a...	32	Emerging	18	Python
25	ictnlp/ComSpeech Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct...	31	Emerging	26	Python
26	rishikksh20/Zero-Shot-TTS Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based...	31	Emerging	34	Python
27	adelacvg/detail_tts All generative model in one for better TTS model	30	Emerging	74	Python
28	lordzuko/cross-text-PT Improving the Appropriateness in Cross-Text Prosody Transfer using Human Supervision	30	Emerging	2	Python
29	CMsmartvoice/Unet-TTS One-shot TTS with Improved Unseen Speaker and Style Transfer	30	Emerging	37	—
30	zhenye234/FlashSpeech ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis	29	Experimental	155	Python
31	fmiotello/fastVC A simple voice conversion tool	29	Experimental	20	Python
32	xuan3986/UDDETTS The first LLM that unifies discrete and dimensional emotions for...	29	Experimental	8	Python
33	jishengpeng/ControlSpeech [ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker...	29	Experimental	275	Python
34	ORI-Muchim/Grad-TTS 'Grad-TTS' with Multilingual Cleaners	28	Experimental	11	Jupyter Notebook
35	WelkinYang/EMPHASIS-pytorch EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System	28	Experimental	15	Python
36	jzmzhong/Automatic-Prosody-Annotator-with-SSWP-CLAP An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).	27	Experimental	51	Python
37	Rumeysakeskin/Turkish-Text-to-Speech Speech synthesis (TTS) in low-resource languages by training from scratch...	27	Experimental	66	Python
38	MotivationalSpeechSynthesis/motivational-speech-synthesis Artistic research deconstructing the performative excess of motivational...	24	Experimental	2	Python
39	NassimaOULDOUALI/Prosody-Control-French-TTS An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control	23	Experimental	19	Python
40	adelacvg/DPTTS An AR+AR TTS attempt.	23	Experimental	18	Python
41	the-bird-F/Expressive-Vectors [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal...	23	Experimental	38	Python
42	wenhuahuo/Cross-Device-Acoustic-Communication-Python-Implementation Digital acoustic communication tools using QFSK and Convolutional Encode. Cross-device acoustic communication.	21	Experimental	9	Python
43	Wonbin-Jung/e3-vits Official GitHub page of E3-VITS	21	Experimental	9	HTML
44	01Zhangbw/Awesome-Expressive-speech-synthesis This is a summary of Expressive speech synthesis papers. Now update: 13 May.	14	Experimental	8	—
45	nipponjo/tts-german-pytorch 🎙️ German TTS (FastPitch) with Thorsten voice / emotional	13	Experimental	9	Python
46	ZET-Speech/ZET-Speech-Demo ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis...	13	Experimental	10	JavaScript
47	rendchevi/daisy-tts 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding...	13	Experimental	14	—
48	maum-ai/sane-tts SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech	13	Experimental	11	—
49	morelen17/tts-papers List of papers about TTS	13	Experimental	10	—
50	yqli2420/speech_synthesis_and_speech_recognition_papers tts papers: http://yqli.tech/page/tts_paper.html	12	Experimental	5	—
51	sungjae-cho/ICASSP2020_STDemo Show and Tell demonstration homepage	11	Experimental	4	HTML
52	deepbrainai-research/stableformtts Project page for StableForm-TTS: Improving Robustness of Diffusion-Based...	11	Experimental	4	HTML
53	p-hennel/F5-TTS-MLX Using F5-TTS and MLX for long-form text-to-speech.	10	Experimental	2	Python
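The Tier column in the table above appears to follow score cut-offs. The sketch below encodes one reading of them: above 50 is Established (as the summary states), 30 to 50 is Emerging, and below 30 is Experimental. The lower cut-off is inferred from the rows (30 is listed as Emerging, 29 as Experimental), and the function name `tier_for` is hypothetical. No row scores between 47 and 52, so the handling of a score of exactly 50 is an assumption.

```python
def tier_for(score):
    """Map a 0-100 score to a tier label.

    Thresholds are inferred from the table, not taken from documented rules.
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (tool, score, tier) rows copied from the table above.
SAMPLE_ROWS = [
    ("index-tts/index-tts", 63, "Established"),
    ("unilight/seq2seq-vc", 53, "Established"),
    ("FireRedTeam/FireRedTTS", 46, "Emerging"),
    ("CMsmartvoice/Unet-TTS", 30, "Emerging"),
    ("zhenye234/FlashSpeech", 29, "Experimental"),
    ("p-hennel/F5-TTS-MLX", 10, "Experimental"),
]

# Collect any sampled rows where the inferred rule disagrees with the listed tier.
mismatches = [name for name, score, tier in SAMPLE_ROWS if tier_for(score) != tier]
print(mismatches)  # prints []
```

The sampled rows include both boundary cases visible in the table (30 and 29), which is where an off-by-one in the inferred thresholds would show up.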

Comparisons in this category

f5-tts-mlx and f5-tts-swift (54 vs 44)