All Voice AI Tools
8,165 tools ranked by quality score · Page 24 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 2301 | xiaominfc/aliyun_nls_c_demo: Alibaba Cloud's real-time speech recognition service (ASR) does not provide a C SDK; a project needed one, so after looking at the implementation of its Java SDK, I made a C demo | | Emerging |
| 2302 | a-n-rose/Python-Sound-Tool: SoundPy (alpha stage) is a research-based python package for speech and... | | Emerging |
| 2303 | renaudjenny/TellTime: iOS application to tell the time in the British way 🇬🇧⏰ | | Emerging |
| 2304 | jreremy/conformer: Pytorch implementation of conformer with training script for end-to-end... | | Emerging |
| 2305 | SohamRatnaparkhi/Voice-Assistant: Voice Assistant coded in Python! | | Emerging |
| 2306 | MingLunHan/CIF-PyTorch: [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech... | | Emerging |
| 2307 | alitahir4024/Text-To-Speach-Javascript: A creative project to give voice to your words. | | Emerging |
| 2308 | huuquyet/PhoWhisper-next: Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js | | Emerging |
| 2309 | PareekshithPalat/AETHER---Personal-Assistant: AETHER is a voice-activated Python personal assistant that responds to... | | Emerging |
| 2310 | holgern/pykokoro: A Python library for Kokoro TTS (Text-to-Speech) using ONNX runtime. | | Emerging |
| 2311 | manhph2211/ML-Deployment: Pushing Deep Learning models into production using torchserve, kubernetes... | | Emerging |
| 2312 | aria-music/zundacord: Japanese Text-to-speech bot for Discord, powered by VOICEVOX | | Emerging |
| 2313 | Aculeasis/rhvoice-proxy: High-level interface for RHVoice library | | Emerging |
| 2314 | tuanio/noisy-student-training-asr: Pytorch implementation of Noisy Student Training for Automatic Speech... | | Emerging |
| 2315 | efeslab/LiteASR: [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with... | | Emerging |
| 2316 | IS2AI/ISSAI_SAIDA_Kazakh_ASR: the first industrial-scale open-source Kazakh speech corpus. KSC2 corpus... | | Emerging |
| 2317 | mathigatti/RealTimeSingingSynthesizer: Live Coding Singing Synthesizer. Python sinsy-NG wrapper. | | Emerging |
| 2318 | chaonan99/ppt_presenter: Convert ppt to video with audio track, using text to speech synthesis | | Emerging |
| 2319 | LlmKira/VitsServer: 🌻 VITS ONNX TTS server designed for fast inference 🔥 | | Emerging |
| 2320 | andriyadi/Maix-SpeechRecognizer: Speech Recognition or Wake Word detection demo, developed using Maixduino... | | Emerging |
| 2321 | rishikksh20/AudioMAE-pytorch: Unofficial PyTorch implementation of Masked Autoencoders that Listen | | Emerging |
| 2322 | thotnd173389/SpeechCommand: The project aims to use keyword spotting streaming in a real-time offline... | | Emerging |
| 2323 | fcakyon/pywhisper: openai/whisper + extra features | | Emerging |
| 2324 | h4rm0n1c/NetTTS: A Retro-modern SAPI 4.0 TTS Client with Network Connectivity and custom... | | Emerging |
| 2325 | kwea123/Unity_live_caption: Use Google Speech-to-Text API to do real-time live stream caption on Unity!... | | Emerging |
| 2326 | 18F/tts-buy-cloudgov-vulnerability-scanner: Solicitation and acquisition documents created for the cloud.gov... | | Emerging |
| 2327 | sinProject-Inc/talk: Listening and Speaking | | Emerging |
| 2328 | Ikaros-521/FunASR_WS: A WebSocket server modified from the official FunASR demo, paired with FastAPI to provide an HTTP service; supports real-time ASR testing in the browser | | Emerging |
| 2329 | Lqm1/openai-workers-ai: A Cloudflare Workers-based, OpenAI-compatible API project that provides... | | Emerging |
| 2330 | ryanlintott/OEVoice: Old English text-to-speech using AVSpeechSynthesis and IPA pronunciations. | | Emerging |
| 2331 | sooftware/End-to-End-Speech-Recognition-Models: PyTorch implementation of automatic speech recognition models. | | Emerging |
| 2332 | jashutch/zeddal: Turn your voice into intelligent, linked notes inside Obsidian | | Emerging |
| 2333 | GravityPoet/ChordVox: Your voice is the fastest keyboard. Local AI voice input — speak, AI polish,... | | Emerging |
| 2334 | litagin02/vits-japros-webui: Gradio WebUI for training and speech synthesis with Japanese TTS (VITS) | | Emerging |
| 2335 | rollingstarky/Python-Voice-Assistant: A Python based Voice Assistant like Siri | | Emerging |
| 2336 | cosmoquester/speech-recognition: Develop speech recognition models with Tensorflow 2 | | Emerging |
| 2337 | pinch-eng/pinch-python-sdk: Real-time voice translation SDK | | Emerging |
| 2338 | tjunttila/pdf2video: A tool for making videos from PDF presentations. | | Emerging |
| 2339 | m15-ai/Local-Voice: A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local... | | Emerging |
| 2340 | simonesiega-academics/culinary-ai-assistant: AI-powered culinary assistant that stores structured data in a tabular... | | Emerging |
| 2341 | emiliioaguirre/youtube-live-tts: Real-time YouTube Live Chat Text-to-Speech (TTS) using ElevenLabs AI voices | | Emerging |
| 2342 | IOriens/whisper-video: Generate subtitles for all the videos in a folder with OpenAI's Whisper... | | Emerging |
| 2343 | jaganadhg/nemoexamples: Experiments with NVIDIA NeMo | | Emerging |
| 2344 | Ananya-0306/Jarvis-desktop-assistant: This is the New Jarvis AI Project it will do some functionality followed by... | | Emerging |
| 2345 | robotology/natural-speech: This repository contains a codebase to build automatic speech recognition... | | Emerging |
| 2346 | LEMAS-Project/LEMAS-TTS: LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10... | | Emerging |
| 2347 | EtienneAb3d/WhisperTimeSync: Synchronize Whisper's timestamps over an existing accurate transcription | | Emerging |
| 2348 | elbruno/ElBruno.QwenTTS: Qwen3-TTS ONNX export pipeline + C# .NET 10 console app for local voice generation | | Emerging |
| 2349 | DivineUX23/Audio-to-Audio-translation: Imagine translating your speech or anybody's speech to any language you want... | | Emerging |
| 2350 | matlab-deep-learning/deepspeech: This repo provides the pretrained DeepSpeech model in MATLAB. The model is... | | Emerging |
| 2351 | cadia-lvl/WebRICE: WebRICE (Web Reader ICE) is an open source web reader in development at... | | Emerging |
| 2352 | Mokkapps/parents-soundboard: A soundboard developed for parents to be able to play often needed phrases like "No" | | Emerging |
| 2353 | jlia0/RealityTalk: RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling | | Emerging |
| 2354 | Sukumar9944/Speech-to-Text-with-ChatGPT: This Python application combines speech recognition with the power of... | | Emerging |
| 2355 | speechly/react-example-repo-filtering: An example app for filtering data with Speechly and React | | Emerging |
| 2356 | hkdb/offline-tts: A Chrome extension that reads web pages and PDFs aloud using Supertonic's... | | Emerging |
| 2357 | Zuellni/LLaSA-WebUI: LLaSA WebUI using ExLlamaV2 and FastAPI. | | Emerging |
| 2358 | xuchennlp/S2T: The project for speech translation | | Emerging |
| 2359 | i-bardinov/Godot-Android-Text-to-Speech: Godot Android Text to Speech plugin for Godot Engine 3.4 or higher | | Emerging |
| 2360 | ARK018/multi-voice-sdk: A universal Text-to-Speech (TTS) SDK. Easily generate and manage audio... | | Emerging |
| 2361 | 18F/tts-buy-code-review: Solicitation documents for the code review procurement being undertaken by TTS. | | Emerging |
| 2362 | WelkinYang/Learn2Sing2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from... | | Emerging |
| 2363 | StanGirard/speechdigest: Audio to summary with openAI Whisper & GPT 3.5/4 using streamlit | | Emerging |
| 2364 | opencog/TinyCog: Small Robot, Toy Robot platform | | Emerging |
| 2365 | nearkyh/AWS-Polly: How to use Amazon Polly TTS(Text To Speech) | | Emerging |
| 2366 | FS-17/SpeechDataBuilder: Browser-based open-source tool for creating high-quality TTS/STT datasets.... | | Emerging |
| 2367 | LianjiaTech/bella-whisper: bella-whisper is a series based on OpenAI... | | Emerging |
| 2368 | seven-io/go-client: Official Go API Client for seven.io | | Emerging |
| 2369 | DarmorGamz/Youtube-Shorts-Generator: Harness OpenAI's power to effortlessly create YouTube Shorts with this... | | Emerging |
| 2370 | alexykn/TorchTS: A modern text to speech frontend for Kokoro-82M | | Emerging |
| 2371 | stensmir/mimir: Offline voice-to-text for macOS. No cloud, no tracking. | | Emerging |
| 2372 | indigane/wyoming-android-tts: Use your Android device's TTS engines in Home Assistant via the Wyoming protocol. | | Emerging |
| 2373 | Garden-Tree/yomi-KAI: yomi-KAI is a bot that reads aloud, in a voice channel, the messages sent to a Discord text channel. | | Emerging |
| 2374 | WindQAQ/tensorflow-wavenet: Implementation of WaveNet network based on Tensorflow. | | Emerging |
| 2375 | SingAvi/SpeechToText: Simple python script to convert live speech or any audio file to text using... | | Emerging |
| 2376 | VirtualZer0/StreamTalkerClient: Cross-platform desktop app that reads Twitch and VK Play chat aloud using AI... | | Emerging |
| 2377 | bobokick/Microsoft-Speech-API_Guide: Usage and API description of Microsoft's speech engine SAPI | | Emerging |
| 2378 | lucidprogrammer/youtube-vision-transcriber: AI-powered pipeline that converts YouTube videos into polished articles... | | Emerging |
| 2379 | ElishaAz/mau_local_stt: A Maubot to transcribe audio messages using local open-source libraries | | Emerging |
| 2380 | yuryleb/garmin-russian-tts-voices: Additions and fixes for the Russian TTS voices from Garmin navigators | | Emerging |
| 2381 | mhshajib/avro-phonetic-go: Avro-style Banglish → বাংলা transliteration engine for Go, using trie-based... | | Emerging |
| 2382 | JohannLai/audio-to-text: Convert audio to text and a summary; just input the audio link. | | Emerging |
| 2383 | leokwsw/OpenAI-TTS-Gradio: Use OpenAI TTS(Text to Speech) API with Gradio | | Emerging |
| 2384 | taeyoun811/Whisfusion: Whisfusion: Parallel ASR Decoding via a Diffusion Transformer | | Emerging |
| 2385 | danielga/gmcl_speech: A module for Garry's Mod that provides speech recognition interfaces to developers. | | Emerging |
| 2386 | voice-cloning-app/Voice-API: API template for deploying tacotron2 voices | | Emerging |
| 2387 | daveshap/keras_asr: ASR experiment using Google's Universal Sentence Encoder | | Emerging |
| 2388 | Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022: Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and... | | Emerging |
| 2389 | tabahi/Mel-Spectrum-Analyzer: Online web based mel-spectrum, power spectrum, FFT analyzer for speech and... | | Emerging |
| 2390 | RoyNkem/SwiftUI-AI-Voice-Assistant: A multi-platform app for voice-based interactions built using SwiftUI with... | | Emerging |
| 2391 | sureshnswamy/tamil-text2voice: Text to speech tool for Tamil language | | Emerging |
| 2392 | hari-huynh/viVQA-voice-assistant: Voice assistant using Multimodal LLMs - LLaVA-NeXT (Mistral 7B) finetuned &... | | Emerging |
| 2393 | msalhab96/SpeeQ: A framework for automatic speech recognition | | Emerging |
| 2394 | GreenSheep01201/claw-voice-chat: Push-to-talk voice chat interface for OpenClaw channels | | Emerging |
| 2395 | uysalemre/Voice-Mail: Python, Django, Text to Speech, Speech to Text, AJAX, Gmail API, Email... | | Emerging |
| 2396 | adasegroup/OSM-one-shot-multispeaker: Framework for one-shot multispeaker system based on Deep Learning | | Emerging |
| 2397 | T-vK/Termux-DeepSpeech: Open source offline speech recognition for Android using Mozilla's... | | Emerging |
| 2398 | ttuleyb/TortoiseTTS-GUI: GradioUI for TortoiseTTS voice generation | | Emerging |
| 2399 | ekleziast/kiwi-voice: Voice interface for OpenClaw with speaker recognition, voice-gated security,... | | Emerging |
| 2400 | rishikksh20/TalkNet2-pytorch: TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for... | | Emerging |