All Voice AI Tools
8,165 tools ranked by quality score · Page 5 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 401 | Devansh-47/Sign-Language-To-Text-and-Speech-Conversion: This is a Python application which converts American Sign Language into text... | | Established |
| 402 | gokhaneraslan/chatterbox-finetuning: Fine-tuning toolkit for Chatterbox TTS & Chatterbox TURBO models. Supports... | | Established |
| 403 | hirofumi0810/neural_sp: End-to-end ASR/LM implementation with PyTorch | | Established |
| 404 | pnnbao97/sea-g2p: Fast multilingual text-to-phoneme converter for South East Asian languages. | | Established |
| 405 | aws-samples/amazon-transcribe-live-call-analytics: Amazon Transcribe Live Call Analytics (LCA) Sample Solution | | Established |
| 406 | PhamHuynhAnh16/Vietnamese-RVC: A voice conversion tool project for Vietnamese users | | Established |
| 407 | revdotcom/revai-python-sdk: Rev AI Python SDK | | Established |
| 408 | google/voice-builder: An open-source text-to-speech (TTS) voice building tool | | Established |
| 409 | rse/speechflow: Speech Processing Flow Graph | | Established |
| 410 | DemisEom/SpecAugment: An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain | | Established |
| 411 | Saik0s/Whisperboard: The open-source iOS app that's making quality voice transcription more... | | Established |
| 412 | gotev/android-speech: Android speech recognition and text to speech made easy | | Established |
| 413 | mutablelogic/go-whisper: Speech-to-Text in Go | | Established |
| 414 | 233stone/vocotype-cli: VocoType is a privacy-safe voice input tool that runs locally on-device; a hotkey converts speech to text in real time and types it into the current application. Supports speech-to-text MCP, AI... | | Established |
| 415 | sooftware/kospeech: Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition... | | Established |
| 416 | noahchalifour/rnnt-speech-recognition: End-to-end speech recognition using RNN Transducers in TensorFlow 2.0 | | Established |
| 417 | Rayhane-mamah/Tacotron-2: DeepMind's Tacotron-2 TensorFlow implementation | | Established |
| 418 | descriptinc/melgan-neurips: GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis | | Established |
| 419 | Kaljurand/K6nele: An Android app that offers speech-to-text user interfaces to other apps | | Established |
| 420 | googleapis/nodejs-speech: This repository is deprecated. All of its content and history has been moved... | | Established |
| 421 | lkuza2/java-speech-api: The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using... | | Established |
| 422 | ggeop/Python-ai-assistant: Python AI assistant 🧠 | | Established |
| 423 | thinhlpg/vixtts-demo: A Vietnamese Voice Cloning Text-to-Speech Model ✨ | | Established |
| 424 | alumae/kaldi-gstreamer-server: Real-time full-duplex speech recognition server, based on the Kaldi toolkit... | | Established |
| 425 | jcsilva/docker-kaldi-gstreamer-server: Dockerfile for kaldi-gstreamer-server. | | Established |
| 426 | zw76859420/ASR_Theory: Speech recognition theory, papers, and slides (PPT) | | Established |
| 427 | speechio/chinese_text_normalization: Chinese text normalization for speech processing | | Established |
| 428 | daswer123/xtts-webui: Web UI for using XTTS and for fine-tuning it | | Established |
| 429 | cosin2077/easyVoice: Open-source text-to-speech tool with support for very long texts and multi-character voicing | | Established |
| 430 | i4Ds/whisper-finetune: This repository contains code for fine-tuning the Whisper speech-to-text model. | | Established |
| 431 | Kyubyong/tacotron: A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech... | | Established |
| 432 | lifeiteng/vall-e: PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo... | | Established |
| 433 | philipperemy/deep-speaker: Deep Speaker: an End-to-End Neural Speaker Embedding System. | | Established |
| 434 | Azure-Samples/SpeechToText-WebSockets-Javascript: SDK & Sample to do speech recognition using WebSockets in JavaScript | | Established |
| 435 | botbahlul/autosrt: A Python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using... | | Established |
| 436 | jik876/hifi-gan: HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity... | | Established |
| 437 | rxlabz/speech_recognition: A Flutter plugin to use speech recognition on iOS & Android (Swift/Java) | | Established |
| 438 | fatchord/WaveRNN: WaveRNN Vocoder + TTS | | Established |
| 439 | riderodd/react-native-vosk: Speech recognition module for React Native using the Vosk library | | Established |
| 440 | r9y9/deepvoice3_pytorch: PyTorch implementation of convolutional neural networks-based text-to-speech... | | Established |
| 441 | symblai/getting-started-samples: Code samples to get started quickly with Symbl's Voice SDK and APIs:... | | Established |
| 442 | xcmyz/FastSpeech: An implementation of FastSpeech based on PyTorch. | | Established |
| 443 | Amey-Thakur/DEEPFAKE-AUDIO: 🎙️ Deepfake Audio – A neural voice cloning studio powered by SV2TTS technology. | | Established |
| 444 | kadirnar/VoiceHub: VoiceHub: A Unified Inference Interface for TTS Models | | Established |
| 445 | mewmix/nabu: A multi engine TTS & LLM edge computing playground with audio book features... | | Established |
| 446 | lovelyterry/SmartSpeaker: A smart control device based on cloud speech recognition, similar to Tmall Genie and Xiao AI. Uses the stm32f407, wm8978, and esp8266 chips. | | Established |
| 447 | alumae/gst-kaldi-nnet2-online: GStreamer plugin around Kaldi's online neural network decoder | | Established |
| 448 | Jackiexiao/MTTS: A Demo of Mandarin/Chinese TTS frontend | | Established |
| 449 | petermg/Chatterbox-TTS-Extended: Modified version of Chatterbox that accepts text files as input and no... | | Established |
| 450 | jackaduma/CycleGAN-VC2: Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2 | | Established |
| 451 | robinhad/ukrainian-tts: Ukrainian TTS (text-to-speech) using ESPNET | | Established |
| 452 | j3soon/whisper-to-input: An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI... | | Established |
| 453 | antor44/livestream_video: playlist4whisper manages media streams playlists for livestream_video.sh,... | | Established |
| 454 | AlexandaJerry/vits-mandarin-biaobei: Application of VITS to Mandarin TTS | | Established |
| 455 | ZDisket/TensorVox: Desktop application for neural speech synthesis written in C++ | | Established |
| 456 | byjlw/video-analyzer: Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition | | Established |
| 457 | ai-bot-pro/achatbot: An open source chat bot architecture for voice/vision (and multimodal)... | | Established |
| 458 | cboard-org/cboard-api: Cboard API provides backend functionality and persistence to the Cboard application | | Established |
| 459 | aahl/qwen-asr2api: 🎤 Qwen 3 ASR to OpenAI API, a free STT speech recognition model | | Established |
| 460 | AI-Manga-Readers/AI_Manga_Reader: AI Manga Reader is a next-gen manga app powered by the MangaDex API,... | | Established |
| 461 | woheller69/whoBIRD: Identify bird sounds in real time with this Android version of BirdNET. Bird... | | Established |
| 462 | isaiahbjork/orpheus-tts-local: Run Orpheus 3B Locally With LM Studio | | Established |
| 463 | YuanGongND/ast: Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer". | | Established |
| 464 | YuanGongND/whisper-at: Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT:... | | Established |
| 465 | arjo129/uSpeech: Speech recognition toolkit for the Arduino | | Established |
| 466 | Tomiinek/Multilingual_Text_to_Speech: An implementation of Tacotron 2 that supports multilingual experiments with... | | Established |
| 467 | gpustack/vox-box: A text-to-speech and speech-to-text server compatible with the OpenAI API,... | | Established |
| 468 | inevolin/DiscordEarsBot: A speech-to-text framework and bot for Discord. Take control of your Discord... | | Established |
| 469 | FlashLabs-AI-Corp/FlashLabs-Chroma: World's first open-source real-time end-to-end spoken dialogue model with... | | Established |
| 470 | jaywalnut310/vits: VITS: Conditional Variational Autoencoder with Adversarial Learning for... | | Established |
| 471 | dmisol/flexatar-virtual-webcam: Personalized Virtual Webcam for WebRTC | | Established |
| 472 | rendchevi/nix-tts: 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation | | Established |
| 473 | mastashake08/speech-kit: Simplifying the Speech Synthesis and Speech Recognition engines for... | | Established |
| 474 | common-voice/cv-dataset: Metadata and versioning details for the Common Voice dataset | | Established |
| 475 | huihut/Facemoji: 😆 A voice chatbot that can imitate your expression.... | | Established |
| 476 | hirofumi0810/tensorflow_end2end_speech_recognition: End-to-End speech recognition implementation based on TensorFlow (CTC,... | | Established |
| 477 | wildminder/ComfyUI-VibeVoice: ComfyUI custom node for the VibeVoice TTS. Expressive, long-form,... | | Established |
| 478 | pbakondy/cordova-plugin-speechrecognition: 🎤 Cordova Plugin for Speech Recognition | | Established |
| 479 | SforAiDl/Neural-Voice-Cloning-With-Few-Samples: This repository has an implementation of "Neural Voice Cloning With Few Samples" | | Established |
| 480 | palmerabollo/bingspeech-api-client: Microsoft Bing Speech API client in Node.js | | Established |
| 481 | vlomme/Multi-Tacotron-Voice-Cloning: Phoneme multilingual (Russian-English) voice cloning based on | | Established |
| 482 | nari-labs/dia: A TTS model capable of generating ultra-realistic dialogue in one pass. | | Established |
| 483 | oseiskar/autosubsync: Automatically synchronize subtitles with audio using machine learning | | Established |
| 484 | NTT123/vietTTS: Vietnamese Text to Speech library | | Established |
| 485 | iver56/torch-audiomentations: Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful... | | Established |
| 486 | Cvandia/nonebot-plugin-fishspeech-tts: TTS plugin for nonebot2 supporting fish-speech and fish-audio | | Established |
| 487 | ArthurFDLR/whisper-youtube: 🔉 YouTube Videos Transcription with OpenAI's Whisper | | Established |
| 488 | Saganaki22/ComfyUI-OmniVoice-TTS: OmniVoice TTS nodes for ComfyUI - Zero-shot multilingual text-to-speech with... | | Established |
| 489 | WhisperSpeech/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper. | | Established |
| 490 | AntoBrandi/Robotics-and-ROS-2-Learn-by-Doing-Manipulators: About 3D Printed robot arm powered by ROS 2 and Arduino and controlled via... | | Established |
| 491 | cyberofficial/Synthalingua: Synthalingua - Real Time Translation | | Established |
| 492 | israelg99/deepvoice: Deep Voice: Real-time Neural Text-to-Speech | | Established |
| 493 | dmotz/thing-translator: 📷 🗣 Point your camera at things to hear how to say them in a different language | | Established |
| 494 | shangeth/wavencoder: WavEncoder is a Python library for encoding audio signals, transforms for... | | Established |
| 495 | skshadan/TTS-RVC-API: Text to Speech using Coqui TTS + RVC | | Established |
| 496 | hmirin/speechy: A Chrome extension for high-quality Text-to-Speech APIs like Google's... | | Established |
| 497 | sdsds222/Unitale: An AI audiobook production tool based on Indextts and Qwen3TTS. Uses an LLM to automatically break down scripts and recognize emotions, integrating multi-character TTS... | | Established |
| 498 | FL33TW00D/whisper-turbo: Cross-Platform, GPU Accelerated Whisper 🏎️ | | Established |
| 499 | MycroftAI/mimic-recording-studio: Mimic Recording Studio is a Docker-based application you can install to... | | Established |
| 500 | bravekingzhang/text2video: A half-magic tool 👉 one-click text-to-video conversion | | Established |