All Voice AI Tools
8,165 tools ranked by quality score · Page 20 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 1901 | habla-liaa/ser-with-w2v2<br>Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from... | | Emerging |
| 1902 | AlexxIT/FasterWhisper<br>Faster Whisper for Home Assistant - custom integration with a local... | | Emerging |
| 1903 | zhihanyang2022/gender-audio-classification<br>A speaker gender classifier. MFC feature engineering and a pre-trained... | | Emerging |
| 1904 | anyvoiceai/Barkify<br>Barkify: an unofficial training implementation of Bark TTS by suno-ai | | Emerging |
| 1905 | Rongjiehuang/Multiband-WaveRNN<br>An unofficial implementation of autoregressive vocoder Multiband-WaveRNN. Audio... | | Emerging |
| 1906 | daswer123/silero-tts-enhanced<br>Silero TTS Enhanced is a Python library that enhances the original Silero... | | Emerging |
| 1907 | rerender2021/echo<br>A simple ASR translator powered by Avernakis React. | | Emerging |
| 1908 | Mateusz-Dera/whisperspeech-webui<br>Simple WhisperSpeech web UI | | Emerging |
| 1909 | zsl24/Tacotron2-Mandarin-HiFiGAN<br>Implementation of TTS with a combination of Tacotron2 and HiFi-GAN | | Emerging |
| 1910 | billiax/voxglide<br>Embeddable voice AI SDK for web pages: speak to fill forms, click buttons,... | | Emerging |
| 1911 | KevKibe/African-Whisper<br>🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual... | | Emerging |
| 1912 | tcsenpai/audiocoqui<br>A multilingual tool to convert PDF ebooks to audiobooks using XTTS v2 TTS... | | Emerging |
| 1913 | e-c-k-e-r/vall-e<br>An unofficial PyTorch implementation of VALL-E | | Emerging |
| 1914 | j3soon/speech-to-windows-input<br>Perform speech-to-text (STT/ASR) with Azure speech service and simulate... | | Emerging |
| 1915 | asaddi/f5-tts-serve<br>A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful... | | Emerging |
| 1916 | msalhab96/MultiSpeech<br>PyTorch implementation for MultiSpeech: Multi-Speaker Text to Speech with... | | Emerging |
| 1917 | erogol/FFTNet<br>FFTNet vocoder implementation | | Emerging |
| 1918 | SILMA-AI/silma-tts<br>SILMA TTS v1 Official Repo: a Lightweight Open Bilingual Text to Speech Model | | Emerging |
| 1919 | Rishav-Agarwal/Translate-Language_Translator<br>An Android app that allows you to translate text and phrases between 90+... | | Emerging |
| 1920 | zabir-nabil/bangla-tts<br>Bangla text to speech, Multilingual (Bangla, English) real-time speech... | | Emerging |
| 1921 | nestyme/Subtitles-generator<br>Generates a transcript for a video from a link | | Emerging |
| 1922 | Alenkar/kairos-asr<br>Adapted ASR pipeline for convenient integration into other applications in... | | Emerging |
| 1923 | deepgram-devs/deepgram-demos-rust<br>Useful demo applications for Deepgram Voice AI APIs, using the Rust language! 🦀 | | Emerging |
| 1924 | daanzu/kaldi_ag_training<br>Docker image and scripts for training finetuned or completely personal Kaldi... | | Emerging |
| 1925 | SpenserCai/cosyvoice3.rs<br>Python bindings for CosyVoice3 TTS using Candle. Has the characteristics of... | | Emerging |
| 1926 | Yeti47/Vosk4Unity<br>Vosk4Unity is a module for the Unity Engine that provides a simple way to... | | Emerging |
| 1927 | ORI-Muchim/PolyLangVITS<br>Multi-speaker Speech Synthesis Using VITS (KO, JA, EN, ZH) | | Emerging |
| 1928 | MartinMashalov/VoiceCloning<br>Generative voice cloning model using TTS synthesis with state-of-the-art... | | Emerging |
| 1929 | opus-arc/Bibo_No-Aozora<br>A memory- and algorithm-driven ear-training software (still under active... | | Emerging |
| 1930 | Jobix-Ai/Iso-Vox<br>STT 90% Solved: Isolate specific speakers from multi-speaker "cocktail... | | Emerging |
| 1931 | hcy71o/MB-iSTFT-VITS-with-AutoVocoder<br>Incorporating AutoVocoder to MB-iSTFT-VITS | | Emerging |
| 1932 | Adibian/ResGrad<br>Unofficial implementation of ResGrad: Residual Denoising Diffusion... | | Emerging |
| 1933 | maketheproduct/flowstay<br>Flowstay is a macOS app that allows instant transcription across all your... | | Emerging |
| 1934 | wongfei/UEHMI<br>Unreal Engine Human Machine Interface | | Emerging |
| 1935 | amscotti/hn-podcaster<br>The HackerNews Podcaster is a JavaScript application that utilizes the power... | | Emerging |
| 1936 | m15-ai/Faster-Local-Voice-AI<br>A real-time, fully local voice AI system optimized for low-resource devices... | | Emerging |
| 1937 | Sundy1219/eesen-for-thchs30<br>ASR for Chinese Mandarin | | Emerging |
| 1938 | hipnologo/EchoForge_Studio<br>Multi-LLM writing and voice production workspace built with Streamlit. | | Emerging |
| 1939 | hypeapps/black-mirror<br>A voice-controlled smart mirror powered by Raspberry Pi 3 and Android Things. | | Emerging |
| 1940 | outspeed-ai/voice-devtools<br>Developer tools to debug and build realtime voice agents. Supports multiple models. | | Emerging |
| 1941 | LetsPlayNow/Speech_AI<br>Speech to speech bot built with Python | | Emerging |
| 1942 | The-Data-Dilemma/ParquetToHuggingFace<br>ParquetToHuggingFace processes raw audio data, converts it into Parquet... | | Emerging |
| 1943 | lucasnewman/descript-mlx<br>Implementation of the Descript Audio Codec in MLX | | Emerging |
| 1944 | sshh12/Recording-Bot<br>A bot built to record and transcribe audio fragments from Discord. | | Emerging |
| 1945 | zolomohan/speech-recognition-in-javascript<br>Final Code for Speech Recognition in JavaScript tutorial. | | Emerging |
| 1946 | aws-samples/amazon-transcribe-email-workflow<br>An Amazon Transcribe demo for "speech-to-text" conversion performed through... | | Emerging |
| 1947 | JSON2Video/json2video-nodejs-sdk<br>Create videos programmatically in the cloud from NodeJS: add watermarks,... | | Emerging |
| 1948 | FomTarro/word-salad<br>Twitch TTS redeem that uses sentence mixing instead of synthesis. | | Emerging |
| 1949 | aabdurakhmanov/uzbekcha-gapir<br>A desktop program that pronounces text in Uzbek \| Text to speech... | | Emerging |
| 1950 | eliangerard/simple-tts-mp3<br>Converts text to MP3 audio using google-tts-api; it has no limit | | Emerging |
| 1951 | twirapp/silero-tts-api-server<br>This is a simple server that uses Silero models to convert text to audio... | | Emerging |
| 1952 | andi611/Conditional-SpecGAN-Tensorflow<br>Text-to-Speech Synthesis by Generating Spectrograms using Generative... | | Emerging |
| 1953 | jesseward/azuretexttospeech<br>A Go library for Azure's Cognitive Services text-to-speech API. | | Emerging |
| 1954 | valeriorlandini/sonus<br>A Max/MSP package for sound experimentation and algorithmic composition | | Emerging |
| 1955 | Vazgen005/discord-virtual-micro<br>Says everything you type in Discord for you using AI (Silero Models) | | Emerging |
| 1956 | xeden3/MSSpeechServer<br>MSSpeechServer is a REST server based on the Microsoft Speech Platform that... | | Emerging |
| 1957 | rshahamiri/SpeechVision<br>Speech Vision (SV) is a Dysarthric Speech Recognition System that adopts a... | | Emerging |
| 1958 | hhguo/SoCodec<br>Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications | | Emerging |
| 1959 | TheCodeTraveler/XamSpeak<br>An iOS and Android app that will dictate text from a photo. XamSpeak... | | Emerging |
| 1960 | DarioFT/ComfyUI-Qwen3-TTS<br>A ComfyUI custom node suite for Qwen3-TTS, supporting 1.7B and 0.6B models,... | | Emerging |
| 1961 | unilight/jatts<br>JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit | | Emerging |
| 1962 | Kalebu/Python-Speech-Recognition-<br>This consists of basic examples of performing Speech Recognition in Python... | | Emerging |
| 1963 | surajondev/text-to-speech<br>Convert text into speech | | Emerging |
| 1964 | tts-hub/monotonic_alignment_search<br>Monotonically align text and speech | | Emerging |
| 1965 | AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish<br>AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech | | Emerging |
| 1966 | koudounasalkis/AI4Voice<br>This repo contains the code for "Voice Disorder Analysis: A... | | Emerging |
| 1967 | fano2458/Zhadiger-Kazakh-Language-AI<br>AI services project "Zhadiger" for Kazakh Language developed using NVIDIA... | | Emerging |
| 1968 | lottev1991/Project-AIdol-Public-English-Dataset<br>Public female English corpus used for Project AI❤dol | | Emerging |
| 1969 | johnGettings/LIHQ<br>Long-Inference, High Quality Synthetic Speaker (AI avatar / AI presenter) | | Emerging |
| 1970 | yousefkotp/Egyptian-Arabic-ASR-and-Diarization<br>The official submission from Speech Squad team for the MTC-AIC 2 competition... | | Emerging |
| 1971 | NONAN23x/WhisperingNova<br>An AI voice changer harnessing the power of OpenAI and VoiceVox for... | | Emerging |
| 1972 | clarinsi/Slovene_ASR_e2e<br>Automatic Speech Recognition tool | | Emerging |
| 1973 | lars76/forced-alignment-chinese<br>Mandarin Chinese audio datasets aligned with Montreal Forced Aligner | | Emerging |
| 1974 | sp-squared/Turkic-Languages-Audio-to-Text-Transcription<br>Open-source Automatic Speech Recognition (ASR) pipeline for Bashkir... | | Emerging |
| 1975 | binzhouchn/masr<br>A Chinese speech recognition series; readers can use it to quickly train their own Chinese speech recognition models, or directly test results with the pretrained models. | | Emerging |
| 1976 | troykelly/live-news-break<br>An advanced tool designed for creating automated news bulletins. It... | | Emerging |
| 1977 | mikopbx/ModuleRHVoice<br>Text to speech voice generator by the RHVoice algorithm | | Emerging |
| 1978 | XilinJia/Podcini<br>Open source podcast instrument for Android supporting contents from YouTube... | | Emerging |
| 1979 | nemoramo/acoustic_model<br>A sub-repository, under construction, for creating an acoustic model in Mandarin... | | Emerging |
| 1980 | revsic/speechset<br>Numpy-librosa implementation of Speech dataset pipeline | | Emerging |
| 1981 | OPEXGroup/ITCC.YandexSpeechKitClient<br>Cross-platform client for Yandex SpeechKit Cloud API | | Emerging |
| 1982 | zassou65535/VITS<br>Text-to-speech reader and voice changer built with VITS | | Emerging |
| 1983 | kcitlyn/PolyScribe_Desktop<br>Fully-offline transcription and translator with speech-to-text and... | | Emerging |
| 1984 | betaoverflow/donna<br>Transform your smart devices to intelligent communicators. | | Emerging |
| 1985 | xnmeet/voi<br>A text-to-speech plugin for [Bob](https://bobtranslate.com/) that uses a locally deployed Kokoro model as the speech synthesis service. | | Emerging |
| 1986 | hamzaehsan97/Speech_Recognition_CNN<br>CNN (Convolutional Neural Networks) Speech Recognition | | Emerging |
| 1987 | vectominist/End-to-end-ASR-Pytorch-DLHLP<br>Joint CTC-Attention End-to-end Speech Recognition - PyTorch Implementation... | | Emerging |
| 1988 | black-roland/homeassistant-salutespeech<br>SaluteSpeech integration for Home Assistant providing speech-to-text and... | | Emerging |
| 1989 | AlimTleuliyev/image-to-audio<br>Image Captioning and Text-to-Speech | | Emerging |
| 1990 | LlmKira/fast-langdetect<br>⚡️ 80x faster Fasttext language detection out of the box \| Split text by language | | Emerging |
| 1991 | mravanelli/pySpeechRev<br>This Python code performs an efficient speech reverberation starting from a... | | Emerging |
| 1992 | reyniel26/bleepy<br>Bleepy is a Python program that can block Tagalog and English profanity in... | | Emerging |
| 1993 | kaloprojects/KALO-ESP32-Voice-Chat-AI-Friends<br>ESP32-based voice device for chatting with multiple custom AI bots.... | | Emerging |
| 1994 | Wendison/FCL-taco2<br>Official implementation of FCL-taco2: Fast, Controllable and Lightweight... | | Emerging |
| 1995 | DeutscheKI/tevr-asr-tool<br>State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines... | | Emerging |
| 1996 | geekgirljoy/PHP<br>Examples of my PHP Code | | Emerging |
| 1997 | tuanh123789/AdaSpeech<br>An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for... | | Emerging |
| 1998 | nikhilunni/demucs-rs<br>Rust powered waveform source separation | | Emerging |
| 1999 | duckysmacky/signsense<br>An Android app for translating dactyl sign language into text | | Emerging |
| 2000 | iGerman00/Pollyduble<br>An experimental proof-of-concept script to automatically dub videos to... | | Emerging |