All Voice AI Tools
8,165 tools ranked by quality score · Page 8 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 701 | keonlee9420/Parallel-Tacotron2<br>PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive... | | Emerging |
| 702 | microsoft/UniSpeech<br>UniSpeech - Large Scale Self-Supervised Learning for Speech | | Emerging |
| 703 | mdangschat/ctc-asr<br>End-to-end trained speech recognition system, based on RNNs and the... | | Emerging |
| 704 | open-mmlab/Amphion<br>Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation.... | | Emerging |
| 705 | StephenVinouze/KontinuousSpeechRecognizer<br>A Kotlin Speech Recognizer that runs continuously and is triggered with an... | | Emerging |
| 706 | stts-se/wikispeech-server<br>The main API for Wikispeech | | Emerging |
| 707 | deepgram-starters/flask-text-to-speech<br>Get started using Deepgram's Text-to-Speech with this Flask demo app | | Emerging |
| 708 | bricewalker/Hey-Jetson<br>Deep Learning based Automatic Speech Recognition with attention for the... | | Emerging |
| 709 | GeekyWizKid/video_processing_service<br>Video Processing Service is an automated video processing service that... | | Emerging |
| 710 | yy4382/tts-importer<br>Easily import the Azure TTS speech synthesis service into reading apps. Currently supports 阅读 (legado), 爱阅记, and 源阅读. | | Emerging |
| 711 | Umesh-01/Python-Assistant<br>Python Assistant (PA) is a voice command based assistant service written in... | | Emerging |
| 712 | wildminder/ComfyUI-VoxCPM<br>ComfyUI node for highly expressive speech and realistic zero-shot voice cloning | | Emerging |
| 713 | analyticsinmotion/werx<br>🐍📦 Easy-to-use Python package for lightning-fast Word Error Rate (WER) analysis | | Emerging |
| 714 | exPHAT/SwiftWhisper<br>🎤 The easiest way to transcribe audio in Swift | | Emerging |
| 715 | Kajitsy/Emilia<br>Emilia - Desktop Character.AI Client | | Emerging |
| 716 | Nikorasu/LiveWhisper<br>A nearly-live implementation of OpenAI's Whisper, using sounddevice.... | | Emerging |
| 717 | DrewThomasson/VoxNovel<br>VoxNovel: generate audiobooks giving each character a different voice actor. | | Emerging |
| 718 | saidsef/tika-document-to-text<br>Apache Tika extract text and metadata from any document format with this... | | Emerging |
| 719 | BandarLabs/gitpodcast<br>Convert any git repository into an engaging podcast | | Emerging |
| 720 | rishikksh20/AdaSpeech<br>AdaSpeech: Adaptive Text to Speech for Custom Voice | | Emerging |
| 721 | tiberiu44/TTS-Cube<br>End-2-end speech synthesis with recurrent neural networks | | Emerging |
| 722 | madhavmk/Noise2Noise-audio_denoising_without_clean_training_data<br>Source code for the paper titled "Speech Denoising without Clean Training... | | Emerging |
| 723 | KyungsuKim42/tokensynth<br>The official implementation of TokenSynth (ICASSP 2025) | | Emerging |
| 724 | Jackiexiao/zhtts<br>A demo of a zh/Chinese text-to-speech system that runs on CPU in real time. | | Emerging |
| 725 | longluo/EbookReader<br>The EbookReader Android App. Support file format like epub, pdf, txt, html,... | | Emerging |
| 726 | shhossain/BanglaTTS<br>BanglaTTS is a text-to-speech (TTS) system for Bangla language that works in... | | Emerging |
| 727 | coqui-ai/STT<br>🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying... | | Emerging |
| 728 | ycyy/edge-tts-webui<br>edge-tts webui | | Emerging |
| 729 | voicekit-team/T-one<br>T-one is a high-performance streaming ASR pipeline for Russian, specialized... | | Emerging |
| 730 | atomicoo/tacotron2-mandarin<br>Tensorflow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on... | | Emerging |
| 731 | nari-labs/dia2<br>TTS model capable of streaming conversational audio in realtime. | | Emerging |
| 732 | mlalma/MisakiSwift<br>Swift port of Misaki G2P (grapheme-to-phoneme) library that can be used e.g.... | | Emerging |
| 733 | gabriele-mastrapasqua/qwen3-tts<br>Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch... | | Emerging |
| 734 | BatuhanYilmaz26/Auto-Subtitled-Video-Generator<br>Input a YouTube video link or upload a video file and get a video with subtitles. | | Emerging |
| 735 | IBM/MAX-Speech-to-Text-Converter<br>Converts spoken words into text form. | | Emerging |
| 736 | BogiHsu/Tacotron2-PyTorch<br>Yet another PyTorch implementation of Tacotron 2 with reduction factor and... | | Emerging |
| 737 | 1038lab/ComfyUI-EdgeTTS<br>ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging... | | Emerging |
| 738 | EgorLakomkin/KTSpeechCrawler<br>Automatically constructing corpus for automatic speech recognition from... | | Emerging |
| 739 | liuli-moe/to-the-stars<br>Chinese translation of Puella Magi Madoka Magica: To the Stars | | Emerging |
| 740 | shamspias/vibevoice-studio<br>Beautiful voice app: record or upload to train a voice, generate speech from... | | Emerging |
| 741 | Purple-Horizons/openclaw-voice<br>🦞 Open-source browser-based voice chat for AI assistants. Self-hosted,... | | Emerging |
| 742 | soobinseo/Tacotron-pytorch<br>Pytorch implementation of Tacotron | | Emerging |
| 743 | xue-fei/sherpa-onnx-unity<br>sherpa-onnx in unity | | Emerging |
| 744 | amazon-archives/amazon-polly-sample<br>Sample application for Amazon Polly. Converts any blog into an... | | Emerging |
| 745 | xenova/whisper-web<br>ML-powered speech recognition directly in your browser | | Emerging |
| 746 | VidyasagarMSC/WatBot<br>An Android ChatBot powered by IBM Watson Services (Assistant V1,... | | Emerging |
| 747 | louiskirsch/speechT<br>Open-source speech-to-text software written in TensorFlow | | Emerging |
| 748 | gitmylo/audio-webui<br>A webui for different audio related Neural Networks | | Emerging |
| 749 | n0name45/node-red-contrib-yandex-station-management<br>The node-red-contrib-yandex-station-management module for controlling smart... | | Emerging |
| 750 | themanyone/whisper_dictation<br>Private voice keyboard, AI chat, images, webcam, recordings, voice control... | | Emerging |
| 751 | jimbozhang/kaldi-gop<br>Kaldi-based goodness of pronunciation (GOP) | | Emerging |
| 752 | prateekkalra/Selection-js<br>A lightweight JavaScript library which provides users with a set of options... | | Emerging |
| 753 | ArchishmanSengupta/autovoiceevals<br>A self-improving loop for voice AI agents. Uses karpathy's autoresearch as... | | Emerging |
| 754 | frostming/tetos<br>A unified interface for multiple Text-to-Speech (TTS) providers. | | Emerging |
| 755 | supershaneski/openai-whisper-talk<br>openai-whisper-talk is a sample voice conversation application powered by... | | Emerging |
| 756 | Kyubyong/tacotron_asr<br>Speech Recognition Using Tacotron | | Emerging |
| 757 | embium/solverecaptchas<br>An async Python library to automate solving ReCAPTCHA v2 using Playwright. | | Emerging |
| 758 | linto-ai/linto-studio<br>Transcription and annotation interface for recorded audio or video files | | Emerging |
| 759 | ivcylc/OpenMusic<br>OpenMusic: SOTA Text-to-music (TTM) Generation | | Emerging |
| 760 | modelscope/ClearerVoice-Studio<br>An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained... | | Emerging |
| 761 | joelpurra/talkie<br>Text-to-speech browser extension button. Select text on any web page, and... | | Emerging |
| 762 | coqui-ai/open-speech-corpora<br>💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies | | Emerging |
| 763 | ImNimboss/uberduck<br>A synchronous and asynchronous API wrapper for the UberDuck text-to-speech... | | Emerging |
| 764 | Open-Speech-EkStep/vakyansh-models<br>Open source speech to text models for Indic Languages | | Emerging |
| 765 | goxr3plus/java-google-speech-api<br>🙊 Speech Recognition, Text To Speech, Google Translate | | Emerging |
| 766 | harmlessman/PAFTS<br>PAFTS: a library that preprocesses audio for TTS. | | Emerging |
| 767 | sipeed/Maix-Speech<br>Maix Speech AI lib, a fast and small speech lib running on embedded devices,... | | Emerging |
| 768 | deterministic-algorithms-lab/Cross-Lingual-Voice-Cloning<br>Tacotron 2 - PyTorch implementation with faster-than-realtime inference... | | Emerging |
| 769 | maxwellobi/Android-Speech-Recognition<br>Continuous speech recognition library for Android with options to use... | | Emerging |
| 770 | phatjkk/SpeakIt_Vietnamese_TTS<br>Vietnamese Text-to-Speech on Windows Project (zalo-speech) | | Emerging |
| 771 | mark-rez/TikTok-Voice-TTS<br>Simple Python script to interact with the TikTok TTS Voices. | | Emerging |
| 772 | DePasqualeOrg/mlx-swift-audio<br>Swift tools for text to speech (TTS) and speech to text (STT) powered by MLX | | Emerging |
| 773 | HumeAI/hume-react-sdk<br>Packages for using Hume AI and React | | Emerging |
| 774 | keonlee9420/Expressive-FastSpeech2<br>PyTorch Implementation of Non-autoregressive Expressive (emotional,... | | Emerging |
| 775 | gooofy/py-picotts<br>Python wrappers around SVOX Pico TTS | | Emerging |
| 776 | ORI-Muchim/Efficient-Speech<br>Lightweight Korean TTS Model based on FastSpeech2 | | Emerging |
| 777 | SARIT42/lipsyncr<br>LipSyncr is a lip reading web app based on the LipNet model that can lip... | | Emerging |
| 778 | smeetrs/deep_avsr<br>A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. | | Emerging |
| 779 | themanyone/voice_typing<br>State-of-the-art offline (or networked) voice typing everywhere + text... | | Emerging |
| 780 | cvqluu/simple_diarizer<br>Simplified diarization pipeline using some pretrained models - audio file to... | | Emerging |
| 781 | revdotcom/fstalign<br>An efficient OpenFST-based tool for calculating WER and aligning two... | | Emerging |
| 782 | d4n3436/Fergun<br>A utility Discord bot written in C# using Discord.Net | | Emerging |
| 783 | kkoutini/PaSST<br>Efficient Training of Audio Transformers with Patchout | | Emerging |
| 784 | IhorShevchuk/RHVoice-spm<br>A free and open source speech synthesizer with support for a lot of languages... | | Emerging |
| 785 | eel-brah/kokorodoki<br>Natural-sounding Text-to-Speech App that fits anywhere. Fast, Real-Time and flexible. | | Emerging |
| 786 | alexram1313/text-to-speech-sample<br>Python3 Text to Speech Video Sample | | Emerging |
| 787 | freewym/espresso<br>Espresso: A Fast End-to-End Neural Speech Recognition Toolkit | | Emerging |
| 788 | yl4579/StyleTTS<br>Official Implementation of StyleTTS | | Emerging |
| 789 | microsoft/SpeechT5<br>Unified-Modal Speech-Text Pre-Training for Spoken Language Processing | | Emerging |
| 790 | funcwj/aps<br>A personal toolkit for single/multi-channel speech recognition & enhancement... | | Emerging |
| 791 | acoti/articulate.js<br>A jQuery plugin that lets the browser speak to you. | | Emerging |
| 792 | hans00/phonemize<br>Pure JS fast phonemizer with rule-based G2P prediction | | Emerging |
| 793 | cpfair/quran-align<br>Word-accurate timestamps for Qur'anic audio. | | Emerging |
| 794 | sl5net/SL5-aura-service<br>Your offline, privacy-first voice assistant framework. Transform speech into... | | Emerging |
| 795 | VideotronicMaker/LM-Studio-Voice-Conversation<br>Python app for LM Studio-enhanced voice conversations with local LLMs. Uses... | | Emerging |
| 796 | overcrash66/OpenTranslator<br>Open Translator: Speech To Speech and Speech to text Translator with voice... | | Emerging |
| 797 | TheNewC0der-24/Textonus<br>Voice to Text Online Notepad Professional, Accurate & Free Speech... | | Emerging |
| 798 | arihanv/Shush<br>Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on... | | Emerging |
| 799 | jeroenterheerdt/pycsspeechtts<br>Python (py) library to use Microsoft's Cognitive Services Speech (csspeech)... | | Emerging |
| 800 | FireRedTeam/FireRedTTS<br>An Open-Sourced LLM-empowered Foundation TTS System | | Emerging |