All Voice AI Tools
8,165 tools ranked by quality score · Page 10 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 901 | Kaljurand/speechutils<br>Android library for speech-to-text and text-to-speech apps | | Emerging |
| 902 | vineeths96/Spoken-Keyword-Spotting<br>In this repository, we explore using a hybrid system consisting of a... | | Emerging |
| 903 | AlexandaJerry/whisper-vits-japanese<br>VITS Japanese with Whisper as data processor (you can train your VITS even... | | Emerging |
| 904 | baidubce/pie<br>Baidu Cloud streaming speech recognition client SDK | | Emerging |
| 905 | GetcharZp/go-speech<br>go-speech is a lightweight speech library built on Golang + ONNX, supporting TTS (text-to-speech) and ASR (speech-to-text). Already integrates... | | Emerging |
| 906 | finchvox/finchvox<br>Voice AI Observability, Elevated | | Emerging |
| 907 | Oknolaz/vasisualy<br>Vasisualy is a simple Russian-language voice assistant written in Python... | | Emerging |
| 908 | szimek/webrtc-translate<br>Highly experimental (read: "barely working") app that uses WebRTC API and... | | Emerging |
| 909 | jaco-bro/diajax<br>Dia-JAX: A JAX port of Dia, the text-to-speech model for generating... | | Emerging |
| 910 | Elleo/pied<br>Pied makes it simple to install and manage text-to-speech Piper voices for... | | Emerging |
| 911 | skit-ai/kaldi-serve<br>Server framework for Kaldi ASR Toolkit | | Emerging |
| 912 | juliuskunze/speechless<br>Speech-to-text based on wav2letter built for transfer learning | | Emerging |
| 913 | HAKORADev/VODER<br>Voice Operation and Design Engine with Reproduction capabilities | | Emerging |
| 914 | pth2000/PowerPointReviewer<br>A PySide6-based tool for reviewing speech scripts by reading them aloud. It uses a TTS engine to read the notes in a PPT, helping you refine the content and wording of your talk and deliver a polished PPT presentation. | | Emerging |
| 915 | fikrikarim/parlor<br>On-device, real-time multimodal AI. Have natural voice and vision... | | Emerging |
| 916 | Kyubyong/cross_vc<br>Cross-lingual Voice Conversion | | Emerging |
| 917 | danthelion/doc2audiobook<br>Convert text documents to high-fidelity audio(books). | | Emerging |
| 918 | pofice/voice-input-method<br>AI-native, cross-platform offline voice input method | | Emerging |
| 919 | alexiokay/AriLink<br>Modern ARI-STASI server, built on Asterisk ARI with real-time speech-to-text... | | Emerging |
| 920 | thewh1teagle/phonikud-tts<br>phonikud-tts - text to speech in Hebrew | | Emerging |
| 921 | inevolin/DiscordSpeechBot<br>A speech-to-text bot for Discord with music commands and more using NodeJS.... | | Emerging |
| 922 | silversparro/wav2letter.pytorch<br>A fully convolutional network for speech-to-text, built on PyTorch. | | Emerging |
| 923 | awsaf49/audio_classification_models<br>Tensorflow Audio Classification Models | | Emerging |
| 924 | Camb-ai/MARS5-TTS<br>MARS5 speech model (TTS) from CAMB.AI | | Emerging |
| 925 | AIGC-Audio/AudioGPT<br>AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head | | Emerging |
| 926 | RapidAI/RapidASR<br>📣 Commercial-grade open-source automatic speech recognition library that works out of the box, supports all platforms, and recognizes mixed Chinese-English speech. A Cross-platform implementation of ASR... | | Emerging |
| 927 | Picovoice/falcon<br>On-device speaker diarization powered by deep learning | | Emerging |
| 928 | Cay-Zhang/SwiftSpeech<br>A speech recognition framework designed for SwiftUI. | | Emerging |
| 929 | IBM/speech-to-text-code-pattern<br>WARNING: This repository is no longer maintained | | Emerging |
| 930 | spring-media/DeepPhonemizer<br>Grapheme to phoneme conversion with deep learning. | | Emerging |
| 931 | voicetestdev/voicetest<br>Test harness for voice agents. Import from Retell, VAPI, Bland, LiveKit. Run... | | Emerging |
| 932 | gsssrao/UnityAndroidSpeechRecognition<br>This repository is a Unity plugin for Android Speech Recognition (based on... | | Emerging |
| 933 | rhasspy/rhasspy<br>Offline private voice assistant for many human languages | | Emerging |
| 934 | deepgram-starters/django-transcription<br>Get started using Deepgram's Transcription with this Django demo app | | Emerging |
| 935 | Tinkoff/voicekit-examples<br>Examples on how to use Tinkoff Voicekit | | Emerging |
| 936 | huggingface/distil-whisper<br>Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller,... | | Emerging |
| 937 | theblackcat102/edgedict<br>Working online speech recognition based on RNN Transducer. (Trained model... | | Emerging |
| 938 | maum-ai/univnet<br>Unofficial PyTorch Implementation of UnivNet Vocoder... | | Emerging |
| 939 | synesthesiam/rhasspy<br>Rhasspy voice assistant for offline home automation | | Emerging |
| 940 | Onuronon-lab/Shrutik<br>Open-source voice data collection platform for building inclusive voice... | | Emerging |
| 941 | ibotplus/kbase-media<br>Video, audio, and image content recognition, speech transcription, and speech synthesis / easy convert video audio image to text, and revert... | | Emerging |
| 942 | daanzu/deepspeech-websocket-server<br>Server & client for DeepSpeech using WebSockets for real-time speech... | | Emerging |
| 943 | aman179102/podvoice<br>Local-first CLI that turns Markdown scripts into multi-speaker podcast-style... | | Emerging |
| 944 | arghyasur1991/LiveTalk-Unity<br>LiveTalk is a unified, high-performance talking head generation system that... | | Emerging |
| 945 | yukukotani/pi-voice<br>Headless voice interface for the Pi Coding Agent | | Emerging |
| 946 | zhao-kun/VibeVoiceFusion<br>VibeVoiceFusion is a full-stack, multi-speaker voice generation web system... | | Emerging |
| 947 | WindQAQ/listen-attend-and-spell<br>Tensorflow implementation of "Listen, Attend and Spell" authored by William... | | Emerging |
| 948 | cool-japan/voirs<br>VoiRS is a cutting-edge Text-to-Speech (TTS), Voice Recognition, Sound... | | Emerging |
| 949 | thewh1teagle/piper-rs<br>Use piper TTS models in Rust | | Emerging |
| 950 | benjaminwan/ChineseTtsTflite<br>Android Chinese TTS Engine Based On Tensorflow TTS, used for TfLite Models... | | Emerging |
| 951 | simonw/ospeak<br>CLI tool for running text through OpenAI Text to speech | | Emerging |
| 952 | NickZaitsev/ru-normalizr<br>ru-normalizr: the best open-source Russian text normalizer. Converts... | | Emerging |
| 953 | mush42/sonata-nvda<br>This add-on implements a speech synthesizer driver for NVDA using neural TTS... | | Emerging |
| 954 | rishikksh20/Fre-GAN-pytorch<br>Fre-GAN: Adversarial Frequency-consistent Audio Synthesis | | Emerging |
| 955 | tarepan/VoiceConversionLab<br>A collection of voice conversion research | | Emerging |
| 956 | rohit-lakhanpal/ai-hackathon-starter-kit<br>This project has been created to make AI accessible and easy for everyone.... | | Emerging |
| 957 | keenresearch/KeenASR-Android-PoC<br>A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING:... | | Emerging |
| 958 | Amirrezahmi/SelfTalker<br>Engage in conversation with your virtual self using AI techniques like NLP,... | | Emerging |
| 959 | GuillaumeFalourd/formulas-python<br>Ritchie CLI formulas in Python 🐍 | | Emerging |
| 960 | dictate-button/dictate-button<br>Customizable Web Component that adds speech-to-text dictation capabilities... | | Emerging |
| 961 | nl8590687/ASRT_SDK_WinClient<br>A Windows client SDK and demo software for the ASRT speech recognition system.... | | Emerging |
| 962 | trldvix/youtube-transcript-api<br>Java library which allows you to retrieve subtitles/transcripts for a single... | | Emerging |
| 963 | Edresson/YourTTS<br>YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion... | | Emerging |
| 964 | chenyme/Chenyme-AAVT<br>A fully automatic (audio) video translation project. It uses Whisper to recognize speech and large AI models to translate the subtitles, then merges the subtitles with the video to produce the translated video. | | Emerging |
| 965 | earlephilhower/BackgroundAudio<br>Arduino library for easy, interrupt driven speech, MP3, AAC, and WAV... | | Emerging |
| 966 | FelixWaweru/elevenlabs-node<br>Eleven Labs text to speech package for NodeJS. You can use the official... | | Emerging |
| 967 | markomijic/TTS-Mod-Vault<br>Cross-platform Tabletop Simulator mod backup & download tool, the modern... | | Emerging |
| 968 | zslrmhb/Omniverse-Virtual-Assisstant<br>Audio2Face Avatar with Riva SDK functionality | | Emerging |
| 969 | puntorigen/podcast_tts<br>A class for generating realistic audio (TTS) for podcasts and dialogues. | | Emerging |
| 970 | notAI-tech/IndicASR<br>Speech recognition for Indic languages. | | Emerging |
| 971 | blip-radar/vatsim-parser<br>Parser for a variety of VATSIM-related file formats | | Emerging |
| 972 | HeyHeyChicken/NOVA-NodeJS<br>NOVA is a customizable voice assistant made with Node.js. | | Emerging |
| 973 | zw76859420/ASR_WORD<br>Builds the acoustic model end to end, using characters as the modeling unit and a DCNN-CTC network architecture. | | Emerging |
| 974 | NotAbhinavGamerz/emotion-aware-automatic-speech-recognition<br>🎤 Enhance speech recognition by detecting emotions in spoken language,... | | Emerging |
| 975 | neosun100/cosyvoice-docker<br>🎙️ CosyVoice All-in-One Docker - Production-ready TTS with Web UI, REST API... | | Emerging |
| 976 | snakers4/open_stt<br>Open STT | | Emerging |
| 977 | Rubiksman78/MonikA.I<br>Submod for MAS with AI based features | | Emerging |
| 978 | tarun7r/SpeechAlgo<br>A Comprehensive Speech Processing Algorithms Library for research and production use | | Emerging |
| 979 | aofdev/vue-pwa-speech<br>A Vue2 project that performs synchronous speech recognition (speech to text), Google Cloud... | | Emerging |
| 980 | coqui-ai/TTS-papers<br>🐸 collection of TTS papers | | Emerging |
| 981 | aofdev/vue-speech-streaming<br>A Vue2 project for streaming speech recognition (speech to text) with Google Cloud Speech | | Emerging |
| 982 | gurjar1/OmniDictate<br>Free, open-source, real-time dictation for Windows. Runs locally (no... | | Emerging |
| 983 | inclusionAI/Ming-UniAudio<br>Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing... | | Emerging |
| 984 | persephone-tools/persephone<br>A tool for automatic phoneme transcription | | Emerging |
| 985 | r9y9/ttslearn<br>ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python) | | Emerging |
| 986 | SEPIA-Framework/sepia-stt-server<br>SEPIA server to support open-source speech recognition via WebSocket connection. | | Emerging |
| 987 | nodef/extra-googletts<br>Generate speech audio from super long text through machine (via "Google... | | Emerging |
| 988 | jishengpeng/WavTokenizer<br>[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second... | | Emerging |
| 989 | dunky11/voicesmith<br>[WIP] VoiceSmith makes training text to speech models easy. | | Emerging |
| 990 | sq2ips/sr0wx<br>Modernized version of the sr0wx automatic amateur-radio weather station project | | Emerging |
| 991 | Ashish-Patnaik/kokoclone<br>Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and... | | Emerging |
| 992 | coqui-ai/TTS-recipes<br>🐸TTS recipes for different datasets | | Emerging |
| 993 | sljeff/anycast<br>An AI-Powered Podcast App. | | Emerging |
| 994 | LSimon95/megatts2<br>Unofficial implementation of Megatts2 | | Emerging |
| 995 | Pictalk-speech-made-easy/pictalk-frontend<br>Pictalk is an open-source application designed to assist individuals with... | | Emerging |
| 996 | mozhou-tech/kim-voice-assistant<br>Kim, your personal voice kit for Home Intelligence. | | Emerging |
| 997 | shijincai/VibeVoice<br>Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup... | | Emerging |
| 998 | yrom/finetune-index-tts<br>IndexTTS Fine-tuning notebooks | | Emerging |
| 999 | keonlee9420/Comprehensive-Transformer-TTS<br>A Non-Autoregressive Transformer based Text-to-Speech, supporting a family... | | Emerging |
| 1000 | oddlama/whisper-overlay<br>A wayland overlay providing speech-to-text functionality for any application... | | Emerging |