All Voice AI Tools

8,165 tools ranked by quality score · Page 10 of 82

Showing 901–1000 of 8,165


#	Tool	Score	Tier	Category	Stars	Language
901	Kaljurand/speechutils Android library for speech-to-text and text-to-speech apps	45	Emerging	android-speech-apps	90	Java
902	vineeths96/Spoken-Keyword-Spotting In this repository, we explore using a hybrid system consisting of a...	45	Emerging	wake-word-detection	107	Python
903	AlexandaJerry/whisper-vits-japanese Vits Japanese with Whisper as data processor (you can train your VITS even...	45	Emerging	vits-tts-implementations	162	Jupyter Notebook
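904	baidubce/pie Baidu Cloud streaming speech recognition client SDK	45	Emerging	java-tts-libraries	80	Java
905	GetcharZp/go-speech go-speech is a lightweight speech library built on Golang + ONNX, supporting TTS (text-to-speech) and ASR (speech-to-text). Integrates...	45	Emerging	go-tts-libraries	46	Go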
906	finchvox/finchvox Voice AI Observability, Elevated	45	Emerging	self-hosted-tts-servers	24	Python
907	Oknolaz/vasisualy Vasisualy is a simple Russian-language voice assistant written in Python...	45	Emerging	general-purpose-voice-assistants	68	Python
908	szimek/webrtc-translate Highly experimental (read: "barely working") app that uses WebRTC API and...	45	Emerging	live-meeting-translation	75	JavaScript
909	jaco-bro/diajax Dia-JAX: A JAX port of Dia, the text-to-speech model for generating...	45	Emerging	self-hosted-tts-servers	29	Jupyter Notebook
910	Elleo/pied Pied makes it simple to install and manage text-to-speech Piper voices for...	45	Emerging	piper-tts-ecosystem	258	Dart
911	skit-ai/kaldi-serve Server framework for Kaldi ASR Toolkit	45	Emerging	kaldi-asr-ecosystem	99	C++
912	juliuskunze/speechless Speech-to-text based on wav2letter built for transfer learning	45	Emerging	wav2vec2-asr-models	98	Python
913	HAKORADev/VODER Voice Operation and Design Engine with Reproduction capabilities	45	Emerging	neural-vocoder-implementations	116	Python
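914	pth2000/PowerPointReviewer A PySide6-based tool for reading speech scripts aloud for review. It uses a TTS engine to read the notes in a PPT, helping you refine your talk's content and wording and deliver a polished PPT presentation.	45	Emerging	lightweight-tts-libraries	17	Python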
915	fikrikarim/parlor On-device, real-time multimodal AI. Have natural voice and vision...	45	Emerging	tts	202	HTML
916	Kyubyong/cross_vc Cross-lingual Voice Conversion	45	Emerging	zero-shot-voice-synthesis	97	Python
917	danthelion/doc2audiobook Convert text documents to high fidelity audio(books).	45	Emerging	pdf-to-audio-conversion	204	Python
918	pofice/voice-input-method AI-native cross-platform offline voice input method	45	Emerging	—	26	Python
919	alexiokay/AriLink Modern ARI-STASI server, built on Asterisk ARI with real-time speech-to-text...	45	Emerging	ai-tutoring-platforms	10	TypeScript
920	thewh1teagle/phonikud-tts phonikud-tts - text to speech in Hebrew	45	Emerging	grapheme-to-phoneme-conversion	11	Python
921	inevolin/DiscordSpeechBot A speech-to-text bot for discord with music commands and more using NodeJS....	45	Emerging	discord-tts-bots	20	JavaScript
922	silversparro/wav2letter.pytorch A fully convolutional network for speech-to-text, built on PyTorch.	45	Emerging	wav2vec2-asr-models	126	Python
923	awsaf49/audio_classification_models Tensorflow Audio Classification Models	45	Emerging	keyword-speech-recognition	13	Jupyter Notebook
924	Camb-ai/MARS5-TTS MARS5 speech model (TTS) from CAMB.AI	45	Emerging	voice-cloning-tools	2,814	Jupyter Notebook
925	AIGC-Audio/AudioGPT AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head	45	Emerging	voice-chatgpt-interfaces	10,210	Python
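926	RapidAI/RapidASR 📣 Commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese and English. A Cross-platform implementation of ASR...	45	Emerging	funasr-speech-recognition	602	C++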
927	Picovoice/falcon On-device speaker diarization powered by deep learning	45	Emerging	speaker-diarization-embedding	69	Python
928	Cay-Zhang/SwiftSpeech A speech recognition framework designed for SwiftUI.	45	Emerging	ios-speech-frameworks	527	Swift
929	IBM/speech-to-text-code-pattern WARNING: This repository is no longer maintained	45	Emerging	google-tts-libraries	46	JavaScript
930	spring-media/DeepPhonemizer Grapheme to phoneme conversion with deep learning.	45	Emerging	text-to-speech-frameworks	421	Python
931	voicetestdev/voicetest Test harness for voice agents. Import from Retell, VAPI, Bland, LiveKit. Run...	45	Emerging	voice-agent-applications	10	Python
932	gsssrao/UnityAndroidSpeechRecognition This repository is a Unity plugin for Android Speech Recognition (based on...	45	Emerging	dotnet-tts-libraries	85	Java
933	rhasspy/rhasspy Offline private voice assistant for many human languages	45	Emerging	general-purpose-voice-assistants	2,725	Shell
934	deepgram-starters/django-transcription Get started using Deepgram's Transcription with this Django demo app	45	Emerging	deepgram-starter-projects	7	Python
935	Tinkoff/voicekit-examples Examples on how to use Tinkoff Voicekit	45	Emerging	yandex-speechkit-tools	57	C#
936	huggingface/distil-whisper Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller,...	45	Emerging	whisper-fine-tuning	4,056	Python
937	theblackcat102/edgedict Working online speech recognition based on RNN Transducer. ( Trained model...	45	Emerging	end-to-end-asr-frameworks	292	Python
938	maum-ai/univnet Unofficial PyTorch Implementation of UnivNet Vocoder...	45	Emerging	text-to-speech-frameworks	282	Python
939	synesthesiam/rhasspy Rhasspy voice assistant for offline home automation	45	Emerging	general-purpose-voice-assistants	952	HTML
940	Onuronon-lab/Shrutik Open-source voice data collection platform for building inclusive voice...	45	Emerging	audio-transcription-apps	11	Python
941	ibotplus/kbase-media Video, audio, and image content recognition, speech transcription, and speech synthesis / easy convert video audio image to text, and revert...	45	Emerging	java-tts-libraries	24	Java
942	daanzu/deepspeech-websocket-server Server & client for DeepSpeech using WebSockets for real-time speech...	45	Emerging	parakeet-asr-implementations	103	Python
943	aman179102/podvoice Local-first CLI that turns Markdown scripts into multi-speaker podcast-style...	45	Emerging	gradio-tts-webuis	25	Python
944	arghyasur1991/LiveTalk-Unity LiveTalk is a unified, high-performance talking head generation system that...	45	Emerging	unity-ml-inference	25	C#
945	yukukotani/pi-voice Headless voice interface for the Pi Coding Agent	45	Emerging	voice-controlled-robotics	46	TypeScript
946	zhao-kun/VibeVoiceFusion VibeVoiceFusion is a full-stack, multi-speaker voice generation web system...	45	Emerging	qwen3-tts-applications	453	Python
947	WindQAQ/listen-attend-and-spell Tensorflow implementation of "Listen, Attend and Spell" authored by William...	45	Emerging	conformer-asr-implementations	89	Python
948	cool-japan/voirs VoiRS is a cutting-edge Text-to-Speech (TTS), Voice Recognition, Sound...	45	Emerging	text-to-speech-conversion	23	Rust
949	thewh1teagle/piper-rs Use piper TTS models in Rust	45	Emerging	rust-tts-libraries	49	Rust
950	benjaminwan/ChineseTtsTflite Android Chinese TTS Engine Base On Tensorflow TTS , use for TfLite Models...	45	Emerging	lightweight-tts-runtimes	393	Java
951	simonw/ospeak CLI tool for running text through OpenAI Text to speech	45	Emerging	openai-tts-applications	171	Python
952	NickZaitsev/ru-normalizr ru-normalizr: the best open-source Russian text normalizer. Converts...	45	Emerging	text-normalization-engines	8	Python
953	mush42/sonata-nvda This add-on implements a speech synthesizer driver for NVDA using neural TTS...	45	Emerging	piper-tts-ecosystem	67	Python
954	rishikksh20/Fre-GAN-pytorch Fre-GAN: Adversarial Frequency-consistent Audio Synthesis	45	Emerging	neural-vocoder-implementations	111	Python
955	tarepan/VoiceConversionLab Collect Voice Conversion researches	45	Emerging	voice-cloning-synthesis	96	TypeScript
956	rohit-lakhanpal/ai-hackathon-starter-kit This project has been created to make AI accessible and easy for everyone....	45	Emerging	ai-tutoring-platforms	26	TypeScript
957	keenresearch/KeenASR-Android-PoC A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING:...	45	Emerging	java-tts-libraries	28	Java
958	Amirrezahmi/SelfTalker Engage in conversation with your virtual self using AI techniques like NLP,...	45	Emerging	ai-chatbot-interfaces	85	Jupyter Notebook
959	GuillaumeFalourd/formulas-python Ritchie CLI formulas in Python 🐍	45	Emerging	voice-ai-learning-collections	17	Python
960	dictate-button/dictate-button Customizable Web Component that adds speech-to-text dictation capabilities...	45	Emerging	web-speech-api-libraries	16	TypeScript
961	nl8590687/ASRT_SDK_WinClient A Windows client SDK and demo software for the ASRT speech recognition system....	45	Emerging	java-tts-libraries	71	C#
962	trldvix/youtube-transcript-api Java library which allows you to retrieve subtitles/transcripts for a single...	45	Emerging	video-transcription-extraction	37	Java
963	Edresson/YourTTS YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion...	45	Emerging	zero-shot-voice-synthesis	1,052	Jupyter Notebook
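964	chenyme/Chenyme-AAVT A fully automatic (audio) video translation project. It uses Whisper to recognize speech and a large AI model to translate the subtitles, then merges subtitles and video to produce the translated video.	45	Emerging	video-transcription-extraction	2,973	Python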
965	earlephilhower/BackgroundAudio Arduino library for easy, interrupt driven speech, MP3, AAC, and WAV...	44	Emerging	embedded-tts-systems	119	C
966	FelixWaweru/elevenlabs-node Eleven Labs text to speech package for NodeJS. You can use the official...	44	Emerging	elevenlabs-integrations	181	JavaScript
967	markomijic/TTS-Mod-Vault Cross-platform Tabletop Simulator mod backup & download tool — the modern...	44	Emerging	dotnet-tts-libraries	55	Dart
968	zslrmhb/Omniverse-Virtual-Assisstant Audio2Face Avatar with Riva SDK functionality	44	Emerging	ai-avatar-platforms	75	Python
969	puntorigen/podcast_tts A class for generating realistic audio (TTS) for podcasts and dialogues.	44	Emerging	content-to-podcast-converters	65	Python
970	notAI-tech/IndicASR Speech Recognition for Indic languages.	44	Emerging	wav2vec2-speech-recognition	13	Python
971	blip-radar/vatsim-parser Parser for a variety of VATSIM-related file formats	44	Emerging	rust-tts-libraries	4	Rust
972	HeyHeyChicken/NOVA-NodeJS NOVA is a customizable voice assistant made with Node.js.	44	Emerging	voice-assistant-applications	86	JavaScript
973	zw76859420/ASR_WORD Builds an acoustic model end to end, using characters as the modeling unit and a DCNN-CTC network architecture.	44	Emerging	ctc-asr-implementations	70	Python
974	NotAbhinavGamerz/emotion-aware-automatic-speech-recognition 🎤 Enhance speech recognition by detecting emotions in spoken language,...	44	Emerging	speech-emotion-recognition	4	Python
975	neosun100/cosyvoice-docker 🎙️ CosyVoice All-in-One Docker - Production-ready TTS with Web UI, REST API...	44	Emerging	coqui-tts-applications	38	Python
976	snakers4/open_stt Open STT	44	Emerging	speech-recognition-apis	818	Python
977	Rubiksman78/MonikA.I Submod for MAS with AI based features	44	Emerging	interactive-ai-avatars	149	Python
978	tarun7r/SpeechAlgo A Comprehensive Speech Processing Algorithms Library for research and production use	44	Emerging	automatic-speech-recognition	15	Python
979	aofdev/vue-pwa-speech A Vue2 Performs synchronous speech recognition Speech to text Google Cloud...	44	Emerging	vue-speech-recognition	99	JavaScript
980	coqui-ai/TTS-papers 🐸 collection of TTS papers	44	Emerging	text-to-speech-frameworks	723	—
981	aofdev/vue-speech-streaming A Vue2 Streaming Speech Recognition Speech to text with Google Cloud Speech	44	Emerging	vue-speech-recognition	72	JavaScript
982	gurjar1/OmniDictate Free, open-source, real-time dictation for Windows. Runs locally (no...	44	Emerging	voice-dictation-typing	111	Python
983	inclusionAI/Ming-UniAudio Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing...	44	Emerging	voice-ai-learning-collections	435	Python
984	persephone-tools/persephone A tool for automatic phoneme transcription	44	Emerging	text-to-speech-frameworks	159	Python
985	r9y9/ttslearn ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)	44	Emerging	text-to-speech-frameworks	267	Jupyter Notebook
986	SEPIA-Framework/sepia-stt-server SEPIA server to support open-source speech recognition via WebSocket connection.	44	Emerging	parakeet-asr-implementations	136	Python
987	nodef/extra-googletts Generate speech audio from super long text through machine (via "Google...	44	Emerging	google-tts-libraries	5	JavaScript
988	jishengpeng/WavTokenizer [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second...	44	Emerging	neural-vocoder-implementations	1,279	Python
989	dunky11/voicesmith [WIP] VoiceSmith makes training text to speech models easy.	44	Emerging	coqui-tts-applications	229	Python
990	sq2ips/sr0wx Modernized version of the sr0wx automatic amateur-radio weather station project	44	Emerging	voice-controlled-robotics	8	Python
991	Ashish-Patnaik/kokoclone Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and...	44	Emerging	kokoro-tts-ecosystem	62	Python
992	coqui-ai/TTS-recipes 🐸TTS recipes for different datasets	44	Emerging	voice-cloning-synthesis	86	Shell
993	sljeff/anycast An AI-Powered Podcast App.	44	Emerging	educational-voice-apps	63	Dart
994	LSimon95/megatts2 Unofficial implementation of Megatts2	44	Emerging	voice-cloning-tools	288	Python
995	Pictalk-speech-made-easy/pictalk-frontend Pictalk is an open-source application designed to assist individuals with...	44	Emerging	react-native-voice-libraries	9	Vue
996	mozhou-tech/kim-voice-assistant Kim, your personal voice kit for Home Intelligence.	44	Emerging	general-purpose-voice-assistants	80	Python
997	shijincai/VibeVoice Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup...	44	Emerging	qwen3-tts-applications	27	Python
998	yrom/finetune-index-tts IndexTTS Fine-tuning notebooks	44	Emerging	tts-model-finetuning	136	Jupyter Notebook
999	keonlee9420/Comprehensive-Transformer-TTS A Non-Autoregressive Transformer based Text-to-Speech, supporting a family...	44	Emerging	text-to-speech-frameworks	328	Python
1000	oddlama/whisper-overlay A wayland overlay providing speech-to-text functionality for any application...	44	Emerging	voice-dictation-typing	79	Rust
