All Voice AI Tools

8,165 tools ranked by quality score · Page 7 of 82

Showing 601–700 of 8,165

#	Tool	Score	Tier	Category	Stars	Language
601	gooofy/py-espeak-ng Some simple wrappers around eSpeak NG intended to make using this excellent...	48	Emerging	espeak-ng-ecosystem	43	Python
602	myshell-ai/MeloTTS High-quality multi-lingual text-to-speech library by MyShell.ai. Support...	48	Emerging	lightweight-tts-runtimes	7,267	Python
603	artcore-c/AI-Voice-Clone-with-Coqui-XTTS-v2 Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone...	48	Emerging	voice-cloning-tools	34	Python
604	solyarisoftware/voskJs Vosk ASR offline engine API for NodeJs developers. With a simple HTTP ASR server.	48	Emerging	vosk-asr-implementations	56	JavaScript
605	gooofy/zerovox zero-shot realtime TTS system, fully offline, free and open source	48	Emerging	text-to-speech-frameworks	51	Python
606	PraaneshSelvaraj/speech_engine Speech Engine is a Python package that provides a simple interface for...	48	Emerging	lightweight-tts-libraries	3	Python
607	andresayac/edge-tts Edge TTS is a Node or Bun package that allows access to the online...	48	Emerging	edge-tts-implementations	121	TypeScript
608	lucasjinreal/Kokoros 🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast,...	48	Emerging	kokoro-tts-ecosystem	735	Rust
609	devnen/Dia-TTS-Server Self-host the powerful Dia TTS model. This server offers a user-friendly Web...	48	Emerging	self-hosted-tts-servers	346	Python
610	hehehai/voxt 🎙️Voice input and translation app for macOS. Press to talk, release to paste.	48	Emerging	local-voice-dictation	346	Swift
611	lucasnewman/best-rq-pytorch Implementation of BEST-RQ - a model for self-supervised learning of speech...	48	Emerging	neural-vocoder-implementations	133	Python
612	Kaljurand/dictate.js A small Javascript library for browser-based real-time speech recognition,...	48	Emerging	web-speech-api-libraries	217	JavaScript
613	filippogiruzzi/voice_activity_detection Voice Activity Detection based on Deep Learning & TensorFlow	48	Emerging	speaker-diarization-embedding	371	Python
614	KinglittleQ/GST-Tacotron A PyTorch implementation of Style Tokens: Unsupervised Style Modeling,...	48	Emerging	tacotron-tts-models	374	Python
615	abhirooptalasila/AutoSub A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using...	48	Emerging	whisper-subtitle-generation	651	Python
616	jpuigcerver/Laia Laia: A deep learning toolkit for HTR based on Torch	48	Emerging	text-to-speech-frameworks	151	Shell
617	Mag1cFall/AIStudio2API OpenAI-compatible API proxy for Google AI Studio	48	Emerging	openai-tts-applications	91	Python
618	feldberlin/timething Timething is a library for aligning text transcripts with their audio recordings.	48	Emerging	whisper-subtitle-generation	130	Jupyter Notebook
619	fulldecent/vowel-practice iOS application for finding formants in spoken sounds	48	Emerging	ios-speech-frameworks	66	Swift
620	BoltzmannEntropy/xtts2-ui A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech	48	Emerging	coqui-tts-applications	391	Python
621	Atm4x/tts-with-rvc TTS with RVC-module to generate .wav audios	48	Emerging	coqui-tts-applications	40	Python
622	gooofy/zamia-speech Open tools and data for cloudless automatic speech recognition	48	Emerging	automatic-speech-recognition	446	Python
623	pulijon/Sttcast Transcription from mp3 files to html with or without embedded player	48	Emerging	personal-assistant-rag	25	Jupyter Notebook
624	shashank2122/Local-Voice A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local...	48	Emerging	voice-assistant-frameworks	34	Python
625	MerlinCN/kinoko7danmaku Uses TTS to read aloud danmaku (bullet comments), gifts, captain subscriptions, and more from Bilibili live streams	48	Emerging	gradio-tts-webuis	24	Python
626	yl4579/AuxiliaryASR Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)	48	Emerging	end-to-end-asr-frameworks	125	Python
627	blaisewf/rvc-cli 🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free!	48	Emerging	voice-cloning-synthesis	230	Python
628	cboard-org/ccboard Cordova wrapper for the Cboard application	48	Emerging	react-native-voice-libraries	5	Shell
629	thepirat000/spleeter-api Audio separation API using Spleeter from Deezer	48	Emerging	audio-source-separation	121	C#
630	ivanvovk/durian-pytorch Implementation of "Duration Informed Attention Network for Multimodal...	48	Emerging	tacotron-tts-models	184	Python
631	mediatechlab/tts-wrapper TTS-Wrapper makes it easier to use text-to-speech APIs by providing a...	48	Emerging	lightweight-tts-libraries	21	Python
632	supertone-inc/supertonic-py Lightning-Fast, On-Device TTS — running natively via ONNX.	48	Emerging	lightweight-tts-runtimes	16	Python
633	dngda/bot-whatsapp Unmaintained - Multipurpose WhatsApp Bot 🤖 using open-wa/wa-automate-nodejs...	48	Emerging	telegram-voice-transcription	93	JavaScript
634	mozilla/TTS 🤖 💬 Deep learning for Text to Speech (Discussion...	48	Emerging	text-to-speech-frameworks	10,123	Jupyter Notebook
635	HiMeditator/auto-caption Cross-platform real-time subtitle display software.	48	Emerging	live-caption-generation	497	TypeScript
636	OvidijusParsiunas/speech-to-element A simple way to add speech to text functionality to your website 🎤	48	Emerging	web-speech-api-libraries	20	TypeScript
637	haolinwang819-boop/ai-video-generation-workflow AI video generation workflow with script, slides, TTS, subtitles, and FFmpeg...	48	Emerging	ai-video-generation	179	TypeScript
638	XiaoMi/kaldi-onnx Kaldi model converter to ONNX	48	Emerging	kaldi-asr-ecosystem	247	Python
639	jxzhanggg/nonparaSeq2seqVC_code Implementation code of non-parallel sequence-to-sequence VC	48	Emerging	fastspeech-tts-models	248	Python
640	SuyashMore/MevonAI-Speech-Emotion-Recognition Identify the emotion of multiple speakers in an Audio Segment	48	Emerging	speech-emotion-recognition	179	C
641	jbelford/Eolian Eolian is a Discord music bot which provides a very powerful API for queuing...	48	Emerging	discord-tts-bots	23	TypeScript
642	harry0703/AudioNotes Quickly extracts content from audio and video and organizes it into structured Markdown notes	48	Emerging	meeting-transcription-summarizers	1,993	Python
643	gentaiscool/end2end-asr-pytorch End-to-End Automatic Speech Recognition on PyTorch	48	Emerging	end-to-end-asr-frameworks	304	Python
644	haoheliu/voicefixer_main General Speech Restoration	48	Emerging	speaker-diarization-embedding	284	Python
645	tsurumeso/vocal-remover Vocal Remover using Deep Neural Networks	48	Emerging	audio-source-separation	1,744	Python
646	upskyy/Squeezeformer PyTorch implementation of "Squeezeformer: An Efficient Transformer for...	48	Emerging	conformer-asr-implementations	148	Python
647	PlayVoice/vits_chinese Best practice TTS based on BERT and VITS with some Natural Speech Features...	48	Emerging	vits-tts-implementations	1,227	Python
648	just-ai/aimybox-android-assistant Embeddable custom voice assistant for Android applications	48	Emerging	android-voice-assistants	274	Kotlin
649	scionoftech/DeepAsr Keras(Tensorflow) implementations of Automatic Speech Recognition	48	Emerging	ctc-asr-implementations	24	Jupyter Notebook
650	TUD-STKS/VocalTractLabBackend-dev The VocalTractLab backend sources and C/C++ API	48	Emerging	lightweight-tts-runtimes	17	C++
651	AutoArk/GPA [AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion...	48	Emerging	telegram-voice-transcription	97	Python
652	KKshitiz/J.A.R.V.I.S Iron man inspired Personal virtual assistant	48	Emerging	python-voice-assistants	72	Python
653	Picovoice/cobra On-device voice activity detection (VAD) powered by deep learning	48	Emerging	ios-speech-frameworks	248	Python
654	alesaccoia/VoiceStreamAI Near-Realtime audio transcription using self-hosted Whisper and WebSocket in...	48	Emerging	speech-to-text-converters	950	Python
655	rolczynski/Automatic-Speech-Recognition 🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)	48	Emerging	keyword-speech-recognition	223	Python
656	Aivis-Project/AivisSpeech AivisSpeech: AI Voice Imitation System - Text to Speech Software	48	Emerging	openai-tts-applications	423	TypeScript
657	gexgd0419/NaturalVoiceSAPIAdapter Make Azure natural TTS voices accessible to any SAPI 5-compatible application.	48	Emerging	dotnet-tts-libraries	702	C++
658	alumae/kaldi-offline-transcriber Offline transcription system for Estonian using Kaldi	48	Emerging	kaldi-asr-ecosystem	228	Python
659	gitmylo/bark-voice-cloning-HuBERT-quantizer The code for the bark-voicecloning model. Training and inference.	48	Emerging	voice-cloning-tools	711	Python
660	rishikksh20/FastSpeech2 PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End...	48	Emerging	fastspeech-tts-models	233	Jupyter Notebook
661	hegedustibor/htgo-tts Text to speech package for Golang.	48	Emerging	go-tts-libraries	213	Go
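662	YaoFANGUK/video-subtitle-extractor Extracts hard-coded subtitles from video and generates SRT files. No third-party API required; text recognition runs locally. A deep-learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI...	48	Emerging	whisper-transcription-apps	8,505	Python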
663	clovaai/ClovaCall ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)	48	Emerging	end-to-end-asr-frameworks	223	Python
664	supersu-man/pyt2s The Python Text to Speech library you've been looking for.	48	Emerging	lightweight-tts-libraries	36	Python
665	WanderingAstronomer/Vociferous Vociferous captures audio from your microphone, transcribes it in real-time...	48	Emerging	speech-to-text-converters	13	Python
666	yl4579/PL-BERT Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions	48	Emerging	fastspeech-tts-models	268	Python
667	philipperemy/tensorflow-ctc-speech-recognition Application of Connectionist Temporal Classification (CTC) for Speech...	48	Emerging	ctc-asr-implementations	131	Python
668	Jaymon/transcribe Convert images or audio files to plain text on the command line	48	Emerging	real-time-voice-translation	30	Python
669	keenresearch/keenasr-ios-poc Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE...	48	Emerging	ios-speech-frameworks	70	Objective-C
670	reriiasu/speech-to-text Real-time transcription using faster-whisper	48	Emerging	speech-to-text-converters	613	HTML
671	openspeech-team/openspeech Open-Source Toolkit for End-to-End Speech Recognition leveraging...	48	Emerging	end-to-end-asr-frameworks	718	Python
672	moshehbenavraham/Voice-Agent-PuPuPlatter Multi-provider voice AI showcase featuring 7 providers (ElevenLabs + Widget,...	48	Emerging	voice-command-assistants	14	TypeScript
673	EnjiRouz/Voice-Assistant-App Python Voice Assistant project can: recognize and synthesize speech without...	48	Emerging	general-purpose-voice-assistants	129	Python
674	alphacep/vosk-asterisk Speech Recognition in Asterisk with Vosk Server	47	Emerging	vosk-asr-implementations	128	C
675	itsRares/react-native-deepgram Brings Deepgram's capabilities to React Native applications, with a focus on...	47	Emerging	deepgram-starter-projects	6	TypeScript
676	LEEYOONHYUNG/BVAE-TTS Official implementation of BVAE-TTS	47	Emerging	text-to-speech-frameworks	173	Python
677	xkeyC/fl_caption Offline real-time captioning software written in Flutter and Rust, powered...	47	Emerging	flutter-ai-chat-apps	92	Dart
678	rishikksh20/VocGAN VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested...	47	Emerging	neural-vocoder-implementations	321	Python
679	jiaqili3/DualCodec [Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural...	47	Emerging	speculative-decoding-algorithms	62	Jupyter Notebook
680	yl4579/PitchExtractor Deep Neural Pitch Extractor for Voice Conversion and TTS Training	47	Emerging	tacotron-tts-models	147	Python
681	lmnt-com/wavegrad A fast, high-quality neural vocoder.	47	Emerging	audio-noise-reduction	296	Python
682	seanghay/speechviewer A quick audio dataset viewer	47	Emerging	web-speech-api-libraries	5	JavaScript
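683	chenliangrui/EasyMrcp Welcome to EasyMrcp! EasyMrcp is written in Java and currently provides integrations with several different ASR and TTS services, making ASR and TTS truly simple to use....	47	Emerging	java-tts-libraries	51	Java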
684	Deepest-Project/MelNet Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain"	47	Emerging	neural-vocoder-implementations	210	Python
685	elimu-ai/vitabu 📚 Android application for reading storybooks and expanding word vocabulary.	47	Emerging	android-speech-apps	2	Kotlin
686	YoavRamon/awesome-kaldi This is a list of features, scripts, blogs and resources for better using...	47	Emerging	kaldi-asr-ecosystem	537	—
687	IS2AI/Kazakh_TTS An expanded version of the previously released Kazakh text-to-speech...	47	Emerging	tts-dataset-creation	147	Shell
688	agent87/RW-DEEPSPEECH-API An end to end deep speech REST API containing speech to text and text speech...	47	Emerging	coqui-tts-applications	12	Jupyter Notebook
689	alexruperez/SpeechRecognizerButton UIButton subclass with push to talk recording, speech recognition and...	47	Emerging	ios-speech-frameworks	184	Swift
690	saiteja-talluri/Speech2Face Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face...	47	Emerging	fastspeech-tts-models	178	Python
691	modelscope/KAN-TTS KAN-TTS is a speech-synthesis training framework, please try the demos we...	47	Emerging	tts-model-finetuning	526	Python
692	symblai/speech-recognition-evaluation Evaluate results from ASR/Speech-to-Text quickly	47	Emerging	asr-evaluation-metrics	41	JavaScript
693	Gautham495/react-native-speech-recognition-kit React Native Turbo Module to access Speech Recognition in Android & iOS	47	Emerging	react-native-voice-libraries	3	TypeScript
694	AppDevGuy/OSSSpeechKit OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.	47	Emerging	ios-speech-frameworks	181	Swift
695	cvqluu/Factorized-TDNN PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal...	47	Emerging	tacotron-tts-models	149	Python
696	bensonruan/Chrome-Web-Speech-API Chrome Web Speech API	47	Emerging	audio-transcription-apps	117	JavaScript
697	dspavankumar/keras-kaldi Keras Interface for Kaldi ASR	47	Emerging	kaldi-asr-ecosystem	122	Python
698	travisvn/obsidian-edge-tts Free, high quality text-to-speech for your Obsidian notes, leveraging...	47	Emerging	edge-tts-implementations	278	TypeScript
699	roatienza/efficientspeech PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.	47	Emerging	fastspeech-tts-models	180	Jupyter Notebook
700	512z/podlens Free Podwise: AI Podcast & Youtube Transcription & Understanding Agent |...	47	Emerging	speech-to-text-transcription	5	Python
