All Voice AI Tools
8,165 tools ranked by quality score · Page 7 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 601 | gooofy/py-espeak-ng<br>Some simple wrappers around eSpeak NG intended to make using this excellent... | | Emerging |
| 602 | myshell-ai/MeloTTS<br>High-quality multi-lingual text-to-speech library by MyShell.ai. Support... | | Emerging |
| 603 | artcore-c/AI-Voice-Clone-with-Coqui-XTTS-v2<br>Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone... | | Emerging |
| 604 | solyarisoftware/voskJs<br>Vosk ASR offline engine API for NodeJs developers. With a simple HTTP ASR server. | | Emerging |
| 605 | gooofy/zerovox<br>zero-shot realtime TTS system, fully offline, free and open source | | Emerging |
| 606 | PraaneshSelvaraj/speech_engine<br>Speech Engine is a Python package that provides a simple interface for... | | Emerging |
| 607 | andresayac/edge-tts<br>Edge TTS is a Node or Bun package that allows access to the online... | | Emerging |
| 608 | lucasjinreal/Kokoros<br>🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast,... | | Emerging |
| 609 | devnen/Dia-TTS-Server<br>Self-host the powerful Dia TTS model. This server offers a user-friendly Web... | | Emerging |
| 610 | hehehai/voxt<br>🎙️ Voice input and translation app for macOS. Press to talk, release to paste. | | Emerging |
| 611 | lucasnewman/best-rq-pytorch<br>Implementation of BEST-RQ - a model for self-supervised learning of speech... | | Emerging |
| 612 | Kaljurand/dictate.js<br>A small Javascript library for browser-based real-time speech recognition,... | | Emerging |
| 613 | filippogiruzzi/voice_activity_detection<br>Voice Activity Detection based on Deep Learning & TensorFlow | | Emerging |
| 614 | KinglittleQ/GST-Tacotron<br>A PyTorch implementation of Style Tokens: Unsupervised Style Modeling,... | | Emerging |
| 615 | abhirooptalasila/AutoSub<br>A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using... | | Emerging |
| 616 | jpuigcerver/Laia<br>Laia: A deep learning toolkit for HTR based on Torch | | Emerging |
| 617 | Mag1cFall/AIStudio2API<br>OpenAI-compatible API proxy for Google AI Studio | | Emerging |
| 618 | feldberlin/timething<br>Timething is a library for aligning text transcripts with their audio recordings. | | Emerging |
| 619 | fulldecent/vowel-practice<br>iOS application for finding formants in spoken sounds | | Emerging |
| 620 | BoltzmannEntropy/xtts2-ui<br>A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech | | Emerging |
| 621 | Atm4x/tts-with-rvc<br>TTS with RVC-module to generate .wav audios | | Emerging |
| 622 | gooofy/zamia-speech<br>Open tools and data for cloudless automatic speech recognition | | Emerging |
| 623 | pulijon/Sttcast<br>Transcription from mp3 files to html with or without embedded player | | Emerging |
| 624 | shashank2122/Local-Voice<br>A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local... | | Emerging |
| 625 | MerlinCN/kinoko7danmaku<br>Uses TTS to announce bullet comments (danmaku), gifts, captain memberships, and other events in Bilibili live streams | | Emerging |
| 626 | yl4579/AuxiliaryASR<br>Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) | | Emerging |
| 627 | blaisewf/rvc-cli<br>🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free! | | Emerging |
| 628 | cboard-org/ccboard<br>Cordova wrapper for the Cboard application | | Emerging |
| 629 | thepirat000/spleeter-api<br>Audio separation API using Spleeter from Deezer | | Emerging |
| 630 | ivanvovk/durian-pytorch<br>Implementation of "Duration Informed Attention Network for Multimodal... | | Emerging |
| 631 | mediatechlab/tts-wrapper<br>TTS-Wrapper makes it easier to use text-to-speech APIs by providing a... | | Emerging |
| 632 | supertone-inc/supertonic-py<br>Lightning-Fast, On-Device TTS, running natively via ONNX. | | Emerging |
| 633 | dngda/bot-whatsapp<br>Unmaintained - Multipurpose WhatsApp Bot 🤖 using open-wa/wa-automate-nodejs... | | Emerging |
| 634 | mozilla/TTS<br>🤖 💬 Deep learning for Text to Speech (Discussion... | | Emerging |
| 635 | HiMeditator/auto-caption<br>A cross-platform real-time subtitle display software. | | Emerging |
| 636 | OvidijusParsiunas/speech-to-element<br>A simple way to add speech to text functionality to your website 🎤 | | Emerging |
| 637 | haolinwang819-boop/ai-video-generation-workflow<br>AI video generation workflow with script, slides, TTS, subtitles, and FFmpeg... | | Emerging |
| 638 | XiaoMi/kaldi-onnx<br>Kaldi model converter to ONNX | | Emerging |
| 639 | jxzhanggg/nonparaSeq2seqVC_code<br>Implementation code of non-parallel sequence-to-sequence VC | | Emerging |
| 640 | SuyashMore/MevonAI-Speech-Emotion-Recognition<br>Identify the emotion of multiple speakers in an Audio Segment | | Emerging |
| 641 | jbelford/Eolian<br>Eolian is a Discord music bot which provide a very powerful API for queuing... | | Emerging |
| 642 | harry0703/AudioNotes<br>Quickly extracts content from audio and video and organizes it into structured Markdown notes | | Emerging |
| 643 | gentaiscool/end2end-asr-pytorch<br>End-to-End Automatic Speech Recognition on PyTorch | | Emerging |
| 644 | haoheliu/voicefixer_main<br>General Speech Restoration | | Emerging |
| 645 | tsurumeso/vocal-remover<br>Vocal Remover using Deep Neural Networks | | Emerging |
| 646 | upskyy/Squeezeformer<br>PyTorch implementation of "Squeezeformer: An Efficient Transformer for... | | Emerging |
| 647 | PlayVoice/vits_chinese<br>Best practice TTS based on BERT and VITS with some Natural Speech Features... | | Emerging |
| 648 | just-ai/aimybox-android-assistant<br>Embeddable custom voice assistant for Android applications | | Emerging |
| 649 | scionoftech/DeepAsr<br>Keras(Tensorflow) implementations of Automatic Speech Recognition | | Emerging |
| 650 | TUD-STKS/VocalTractLabBackend-dev<br>The VocalTractLab backend sources and C/C++ API | | Emerging |
| 651 | AutoArk/GPA<br>[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion... | | Emerging |
| 652 | KKshitiz/J.A.R.V.I.S<br>Iron man inspired Personal virtual assistant | | Emerging |
| 653 | Picovoice/cobra<br>On-device voice activity detection (VAD) powered by deep learning | | Emerging |
| 654 | alesaccoia/VoiceStreamAI<br>Near-Realtime audio transcription using self-hosted Whisper and WebSocket in... | | Emerging |
| 655 | rolczynski/Automatic-Speech-Recognition<br>🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow) | | Emerging |
| 656 | Aivis-Project/AivisSpeech<br>AivisSpeech: AI Voice Imitation System - Text to Speech Software | | Emerging |
| 657 | gexgd0419/NaturalVoiceSAPIAdapter<br>Make Azure natural TTS voices accessible to any SAPI 5-compatible application. | | Emerging |
| 658 | alumae/kaldi-offline-transcriber<br>Offline transcription system for Estonian using Kaldi | | Emerging |
| 659 | gitmylo/bark-voice-cloning-HuBERT-quantizer<br>The code for the bark-voicecloning model. Training and inference. | | Emerging |
| 660 | rishikksh20/FastSpeech2<br>PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End... | | Emerging |
| 661 | hegedustibor/htgo-tts<br>Text to speech package for Golang. | | Emerging |
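| 662 | YaoFANGUK/video-subtitle-extractor<br>Extracts hard-coded subtitles from video and generates SRT files. No third-party API required; text recognition runs locally. A deep-learning-based video subtitle extraction framework covering subtitle region detection and subtitle content extraction. A GUI... | | Emerging |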
| 663 | clovaai/ClovaCall<br>ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020) | | Emerging |
| 664 | supersu-man/pyt2s<br>The Python Text to Speech library you've been looking for. | | Emerging |
| 665 | WanderingAstronomer/Vociferous<br>Vociferous captures audio from your microphone, transcribes it in real-time... | | Emerging |
| 666 | yl4579/PL-BERT<br>Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions | | Emerging |
| 667 | philipperemy/tensorflow-ctc-speech-recognition<br>Application of Connectionist Temporal Classification (CTC) for Speech... | | Emerging |
| 668 | Jaymon/transcribe<br>Convert images or audio files to plain text on the command line | | Emerging |
| 669 | keenresearch/keenasr-ios-poc<br>Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE... | | Emerging |
| 670 | reriiasu/speech-to-text<br>Real-time transcription using faster-whisper | | Emerging |
| 671 | openspeech-team/openspeech<br>Open-Source Toolkit for End-to-End Speech Recognition leveraging... | | Emerging |
| 672 | moshehbenavraham/Voice-Agent-PuPuPlatter<br>Multi-provider voice AI showcase featuring 7 providers (ElevenLabs + Widget,... | | Emerging |
| 673 | EnjiRouz/Voice-Assistant-App<br>Python Voice Assistant project can: recognize and synthesize speech without... | | Emerging |
| 674 | alphacep/vosk-asterisk<br>Speech Recognition in Asterisk with Vosk Server | | Emerging |
| 675 | itsRares/react-native-deepgram<br>Brings Deepgram's capabilities to React Native applications, with a focus on... | | Emerging |
| 676 | LEEYOONHYUNG/BVAE-TTS<br>Official implementation of BVAE-TTS | | Emerging |
| 677 | xkeyC/fl_caption<br>Offline real-time captioning software written in Flutter and Rust, powered... | | Emerging |
| 678 | rishikksh20/VocGAN<br>VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested... | | Emerging |
| 679 | jiaqili3/DualCodec<br>[Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural... | | Emerging |
| 680 | yl4579/PitchExtractor<br>Deep Neural Pitch Extractor for Voice Conversion and TTS Training | | Emerging |
| 681 | lmnt-com/wavegrad<br>A fast, high-quality neural vocoder. | | Emerging |
| 682 | seanghay/speechviewer<br>A quick audio dataset viewer | | Emerging |
| 683 | chenliangrui/EasyMrcp<br>Welcome to EasyMrcp! EasyMrcp is written in Java and currently integrates a variety of ASR and TTS services, making ASR and TTS truly simple to use.... | | Emerging |
| 684 | Deepest-Project/MelNet<br>Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain" | | Emerging |
| 685 | elimu-ai/vitabu<br>📚 Android application for reading storybooks and expanding word vocabulary. | | Emerging |
| 686 | YoavRamon/awesome-kaldi<br>This is a list of features, scripts, blogs and resources for better using... | | Emerging |
| 687 | IS2AI/Kazakh_TTS<br>An expanded version of the previously released Kazakh text-to-speech... | | Emerging |
| 688 | agent87/RW-DEEPSPEECH-API<br>An end to end deep speech REST API containing speech to text and text speech... | | Emerging |
| 689 | alexruperez/SpeechRecognizerButton<br>UIButton subclass with push to talk recording, speech recognition and... | | Emerging |
| 690 | saiteja-talluri/Speech2Face<br>Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face... | | Emerging |
| 691 | modelscope/KAN-TTS<br>KAN-TTS is a speech-synthesis training framework, please try the demos we... | | Emerging |
| 692 | symblai/speech-recognition-evaluation<br>Evaluate results from ASR/Speech-to-Text quickly | | Emerging |
| 693 | Gautham495/react-native-speech-recognition-kit<br>React Native Turbo Module to access Speech Recognition in Android & iOS | | Emerging |
| 694 | AppDevGuy/OSSSpeechKit<br>OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech. | | Emerging |
| 695 | cvqluu/Factorized-TDNN<br>PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal... | | Emerging |
| 696 | bensonruan/Chrome-Web-Speech-API<br>Chrome Web Speech API | | Emerging |
| 697 | dspavankumar/keras-kaldi<br>Keras Interface for Kaldi ASR | | Emerging |
| 698 | travisvn/obsidian-edge-tts<br>Free, high quality text-to-speech for your Obsidian notes, leveraging... | | Emerging |
| 699 | roatienza/efficientspeech<br>PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. | | Emerging |
| 700 | 512z/podlens<br>Free Podwise: AI Podcast & Youtube Transcription & Understanding Agent \|... | | Emerging |