All Voice AI Tools
8,165 tools ranked by quality score · Page 9 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 801 | pinguy/kokoro-tts-addon<br>Local neural TTS for browsers: fast, expressive, and offline; runs on modest hardware. |  | Emerging |
| 802 | oliverguhr/wav2vec2-live<br>Live speech recognition using Facebook's wav2vec 2.0 model. |  | Emerging |
| 803 | shashikg/WhisperS2T<br>An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting... |  | Emerging |
| 804 | tarun7r/Vocal-Agent<br>Cascading voice assistant combining real-time speech recognition, AI... |  | Emerging |
| 805 | drien/tts-joinery<br>Stitch together text-to-speech over 4096 characters via the OpenAI API |  | Emerging |
| 806 | duncan3dc/speaker<br>A PHP library to convert text to speech using various web services |  | Emerging |
| 807 | ishandutta2007/Awesome-Text-to-Speech<br>🎤 A curated list of the latest and most influential tools, models, and... |  | Emerging |
| 808 | black-roland/homeassistant-yandex-speechkit<br>Yandex SpeechKit integration for Home Assistant providing speech-to-text and... |  | Emerging |
| 809 | Evil0ctal/Fast-Powerful-Whisper-AI-Services-API<br>⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to purchase Whisper... |  | Emerging |
| 810 | HKoon/ChatTTS-OpenVoice<br>Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your... |  | Emerging |
| 811 | leaonline/easy-speech<br>🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no... |  | Emerging |
| 812 | chenkui164/FastASR<br>A C++ ASR inference project with few dependencies, simple installation, and fast inference; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B... |  | Emerging |
| 813 | neosapience/mlp-singer<br>Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing... |  | Emerging |
| 814 | deepgram-starters/node-text-to-speech<br>Get started using Deepgram's Text-to-Speech with this Node demo app |  | Emerging |
| 815 | patrickenfuego/Chapterize-Audiobooks<br>Split a single, monolithic mp3 audiobook file into chapters using Machine... |  | Emerging |
| 816 | joethei/obsidian-tts<br>Text to speech for Obsidian. Hear your notes. |  | Emerging |
| 817 | tabahi/formantfeatures<br>Extract frequency, power, width and dissonance of formants from wav files |  | Emerging |
| 818 | areebbeigh/winspeech<br>Speech recognition and synthesis library for Windows - Python 2 and 3. |  | Emerging |
| 819 | jhuus/HawkEars1<br>⚠️ HawkEars 1.0 (obsolete). See HawkEars 2.0 → https://github.com/jhuus/HawkEars |  | Emerging |
| 820 | morioka/tiny-openai-whisper-api<br>OpenAI Whisper API-style local server, running on FastAPI |  | Emerging |
| 821 | eduardolat/kokoro-web<br>🔊 Kokoro Web: Free AI text-to-speech, online or self-hosted, OpenAI compatible! |  | Emerging |
| 822 | mpaepper/vibevoice<br>Fast local speech-to-text for any app using faster-whisper |  | Emerging |
| 823 | dhruvyad/uttertype<br>Short code for dictation using OpenAI Whisper for transcription. |  | Emerging |
| 824 | Chris10M/Lip2Speech<br>A pipeline to read lips and generate speech for the read content, i.e. Lip to... |  | Emerging |
| 825 | PaddlePaddle/Parakeet<br>PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer... |  | Emerging |
| 826 | sldimitrov/english_learning_system<br>English Learning System I have developed in order to help others in... |  | Emerging |
| 827 | HadrienGardeur/web-speech-recommended-voices<br>A list of recommended voices for the Web Speech API |  | Emerging |
| 828 | pritishyuvraj/Voice-Conversion-GAN<br>Voice Conversion using CycleGANs for Non-Parallel Data |  | Emerging |
| 829 | gtreshchev/RuntimeSpeechRecognizer<br>Cross-platform, real-time, offline speech recognition plugin for Unreal... |  | Emerging |
| 830 | yc9701/pansori<br>Tools for ASR Corpus Generation from Online Video |  | Emerging |
| 831 | KernelInterrupt/whisper4dart<br>whisper4dart is a Dart wrapper for whisper.cpp, designed to offer an... |  | Emerging |
| 832 | pluja/whishper<br>Transcribe any audio to text, translate and edit subtitles 100% locally with... |  | Emerging |
| 833 | BuildWithAIs/voicekey<br>Voice to text, one key to input. |  | Emerging |
| 834 | adi-gov-tw/Taiwan-Tongues-ASR-CE<br>Taiwan Tongues ASR CE is an open-source speech recognition (Automatic Speech Recognition,... |  | Emerging |
| 835 | ide8/tacotron2<br>Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow |  | Emerging |
| 836 | jianchang512/clone-voice<br>A sound cloning tool with a web interface, using your voice or any sound to... |  | Emerging |
| 837 | isomoes/blivedm_rs<br>A powerful Rust WebSocket client library for Bilibili live-room danmaku (bullet comments), supporting real-time danmaku monitoring, text-to-speech (TTS), and browser cookie... |  | Emerging |
| 838 | Quantatirsk/funasr-api<br>Speech recognition API service powered by FunASR and Qwen-ASR, supporting 52... |  | Emerging |
| 839 | voicegain/python-sdk<br>Python SDK for working with Voicegain Speech-to-Text |  | Emerging |
| 840 | slp-rl/aero<br>This repo contains the official PyTorch implementation of "Audio Super... |  | Emerging |
| 841 | roboticslab-uc3m/speech<br>Text To Speech (TTS) and Automatic Speech Recognition (ASR). |  | Emerging |
| 842 | JoelShine/JARVIS-AI-ASSISTANT<br>A true Artificial Intelligent Assistant with ALICE as backend and offline... |  | Emerging |
| 843 | mrf345/django_gtts<br>Django app extension to add gTTS google text-to-speech |  | Emerging |
| 844 | Jaffe2718/Microphone-Text-Input<br>A fabric mod that can recognize speech as text messages and automatically... |  | Emerging |
| 845 | d4n3436/GTranslate<br>A collection of free translation APIs (Google Translate, Bing Translator,... |  | Emerging |
| 846 | fishaudio/docs<br>Official documentation for products, services, and projects by Fish Audio |  | Emerging |
| 847 | AdroitAnandAI/Indian-Accent-Speech-Recognition<br>Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models... |  | Emerging |
| 848 | libdriver/ld3320<br>LD3320 full-featured driver library for general-purpose MCU and Linux. |  | Emerging |
| 849 | halfzm/v2vt<br>video to video translation with voice clone and lip... |  | Emerging |
| 850 | undertheseanlp/automatic_speech_recognition<br>Vietnamese Automatic Speech Recognition |  | Emerging |
| 851 | ai-adv-lab/deepspeech.mxnet<br>A MXNet implementation of Baidu's DeepSpeech architecture |  | Emerging |
| 852 | crlandsc/torch-log-wmse<br>logWMSE, an audio quality metric & loss function with support for digital... |  | Emerging |
| 853 | Aivis-Project/aivmlib-web<br>Aivis Voice Model File (.aivm/.aivmx) Utility Library for Web |  | Emerging |
| 854 | lukeewin/FunASR_API<br>A speech recognition API with speaker diarization, built on FunASR |  | Emerging |
| 855 | Open-Speech-EkStep/vakyansh-wav2vec2-experimentation<br>Repository containing experimentation platform on how to train, infer on... |  | Emerging |
| 856 | eigenpunk/ComfyUI-audio<br>some generative audio tools for ComfyUI |  | Emerging |
| 857 | atomicoo/FCH-TTS<br>A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese,... |  | Emerging |
| 858 | npuichigo/waveglow<br>A PyTorch implementation of the WaveGlow: A Flow-based Generative Network... |  | Emerging |
| 859 | ayutaz/piper-plus<br>Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT) with VITS... |  | Emerging |
| 860 | deepgram-starters/flask-voice-agent<br>Flask WebSocket proxy for Deepgram's Voice Agent API |  | Emerging |
| 861 | aqiu202/aqiu-spring-boot-starter-projects<br>A personal collection of ready-to-use Spring Boot Starter components, simple and practical, to be extended as needs arise. |  | Emerging |
| 862 | Plachtaa/VALL-E-X<br>An open source implementation of Microsoft's VALL-E X zero-shot TTS model.... |  | Emerging |
| 863 | HardCodeDev777/UnityNeuroSpeech<br>The world's first game framework that lets you talk to AI in real time... |  | Emerging |
| 864 | ictnlp/StreamSpeech<br>StreamSpeech is an "All in One" seamless model for offline and simultaneous... |  | Emerging |
| 865 | hasscc/hass-edge-tts<br>🗣️ Microsoft Edge TTS for Home Assistant, no need for app_key |  | Emerging |
| 866 | rioharper/VocalForge<br>Your one-stop solution for voice dataset creation |  | Emerging |
| 867 | CSTR-Edinburgh/magphase<br>MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications. |  | Emerging |
| 868 | rhasspy/piper<br>A fast, local neural text to speech system |  | Emerging |
| 869 | italankin/samplevoicebot<br>TTS Telegram bot |  | Emerging |
| 870 | livingingroups/animal2vec<br>animal2vec: A self-supervised transformer for rare-event raw audio input |  | Emerging |
| 871 | RaduBolbo/F5-TTS-Emotional-CFG<br>Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class... |  | Emerging |
| 872 | baizeteam/baize-toolbox<br>Baize Toolbox, a powerful multimedia tool built on Electron and ffmpeg |  | Emerging |
| 873 | Kaljurand/Inimesed<br>An Android app that lets you search your contacts by voice. Internet not... |  | Emerging |
| 874 | KevinMIN95/StyleSpeech<br>Official implementation of Meta-StyleSpeech and StyleSpeech |  | Emerging |
| 875 | MysteryPancake/Discord-TTS<br>Text to speech Discord bot using FakeYou |  | Emerging |
| 876 | chaiyujin/dctts-pytorch<br>The PyTorch implementation of DC-TTS |  | Emerging |
| 877 | hash2430/pitchtron<br>TTS for pitch-accented language. Korean dialect DB. |  | Emerging |
| 878 | nipponjo/tts-arabic-pytorch<br>🎙️ Arabic TTS models (Tacotron2, FastPitch) |  | Emerging |
| 879 | yaph/tts-samples<br>This repository provides text-to-speech (TTS) audio samples in MP3 format... |  | Emerging |
| 880 | ccoreilly/LocalSTT<br>Android Speech Recognition Service using Vosk/Kaldi and Mozilla DeepSpeech |  | Emerging |
| 881 | NATSpeech/NATSpeech<br>A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official... |  | Emerging |
| 882 | fewieden/MMM-voice<br>Offline Voice Recognition Module for MagicMirror² |  | Emerging |
| 883 | chrisjp/tts<br>A simple tool to demo text-to-speech using various services' voices. HTML5... |  | Emerging |
| 884 | ae9is/subtitle-chan<br>Live speech transcription and translation in your browser |  | Emerging |
| 885 | nnsvs/nnsvs<br>Neural network-based singing voice synthesis library for research |  | Emerging |
| 886 | OAID/cortex-m-kws<br>Cortex M KWS example with Tengine Lite. |  | Emerging |
| 887 | Spac5y/Vocal-Agent<br>A cutting-edge Cascading voice assistant combining real-time speech... |  | Emerging |
| 888 | lucadellalib/focalcodec<br>A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation |  | Emerging |
| 889 | Niger-Volta-LTI/yoruba-text<br>Yorùbá language training text for NLP, ASR and TTS tasks |  | Emerging |
| 890 | Voine/Bert-VITS2-MNN<br>TTS System Bert-VITS2 Android Ver, powered by alibaba-MNN engine. |  | Emerging |
| 891 | ubisoft/ubisoft-laforge-daft-exprt<br>Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis |  | Emerging |
| 892 | daniilrobnikov/vits2<br>VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with... |  | Emerging |
| 893 | aeleraqi/Text-to-Speech-gTTS---Arabic-text<br>Google Text-to-Speech API to convert text input into audio files |  | Emerging |
| 894 | georgesterpu/avsr-tf1<br>Audio-Visual Speech Recognition using Sequence to Sequence Models |  | Emerging |
| 895 | see2023/Bert-VITS2-ext<br>Facial expression and animation testing based on Bert-VITS2. |  | Emerging |
| 896 | atomiechen/FunASR-Client<br>Really easy-to-use Python client for FunASR runtime server. |  | Emerging |
| 897 | by2101/OpenASR<br>A PyTorch-based end-to-end speech recognition system. |  | Emerging |
| 898 | gfdb/wav2aug<br>A general purpose task-agnostic speech augmentation policy |  | Emerging |
| 899 | rishikksh20/vae_tacotron2<br>VAE Tacotron 2, an alternative to GST Tacotron |  | Emerging |
| 900 | sanchit-gandhi/whisper-jax<br>JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. |  | Emerging |