All Voice AI Tools

8,165 tools ranked by quality score · Page 8 of 82

Showing 701–800 of 8,165
# Tool Score Tier
701 keonlee9420/Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive...

47
Emerging
702 microsoft/UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

47
Emerging
703 mdangschat/ctc-asr

End-to-end trained speech recognition system, based on RNNs and the...

47
Emerging
704 open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation....

47
Emerging
705 StephenVinouze/KontinuousSpeechRecognizer

A Kotlin Speech Recognizer that runs continuously and is triggered with an...

47
Emerging
706 stts-se/wikispeech-server

The main API for Wikispeech

47
Emerging
707 deepgram-starters/flask-text-to-speech

Get started using Deepgram's Text-to-Speech with this Flask demo app

47
Emerging
708 bricewalker/Hey-Jetson

Deep Learning based Automatic Speech Recognition with attention for the...

47
Emerging
709 GeekyWizKid/video_processing_service

Video Processing Service is an automated video processing service that...

47
Emerging
710 yy4382/tts-importer

Easily import the Azure TTS speech synthesis service into reading apps. Currently supports 阅读 (legado), 爱阅记, and 源阅读.

47
Emerging
711 Umesh-01/Python-Assistant

Python Assistant (PA) is a voice command based assistant service written in...

47
Emerging
712 wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning

47
Emerging
713 analyticsinmotion/werx

🐍📦 Easy-to-use Python package for lightning-fast Word Error Rate (WER) analysis

47
Emerging
714 exPHAT/SwiftWhisper

🎤 The easiest way to transcribe audio in Swift

47
Emerging
715 Kajitsy/Emilia

Emilia - Desktop Character.AI Client

47
Emerging
716 Nikorasu/LiveWhisper

A nearly-live implementation of OpenAI's Whisper, using sounddevice....

47
Emerging
717 DrewThomasson/VoxNovel

VoxNovel: generate audiobooks giving each character a different voice actor.

47
Emerging
718 saidsef/tika-document-to-text

Apache Tika extract text and metadata from any document format with this...

47
Emerging
719 BandarLabs/gitpodcast

Convert any git repository into an engaging podcast

47
Emerging
720 rishikksh20/AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

47
Emerging
721 tiberiu44/TTS-Cube

End-2-end speech synthesis with recurrent neural networks

47
Emerging
722 madhavmk/Noise2Noise-audio_denoising_without_clean_training_data

Source code for the paper titled "Speech Denoising without Clean Training...

47
Emerging
723 KyungsuKim42/tokensynth

The official implementation of TokenSynth (ICASSP 2025)

47
Emerging
724 Jackiexiao/zhtts

A demo of a zh/Chinese text-to-speech system that runs on CPU in real time.

47
Emerging
725 longluo/EbookReader

The EbookReader Android App. Support file format like epub, pdf, txt, html,...

47
Emerging
726 shhossain/BanglaTTS

BanglaTTS is a text-to-speech (TTS) system for Bangla language that works in...

47
Emerging
727 coqui-ai/STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying...

47
Emerging
728 ycyy/edge-tts-webui

edge-tts webui

47
Emerging
729 voicekit-team/T-one

T-one is a high-performance streaming ASR pipeline for Russian, specialized...

47
Emerging
730 atomicoo/tacotron2-mandarin

TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on...

47
Emerging
731 nari-labs/dia2

TTS model capable of streaming conversational audio in realtime.

47
Emerging
732 mlalma/MisakiSwift

Swift port of Misaki G2P (grapheme-to-phoneme) library that can be used e.g....

47
Emerging
733 gabriele-mastrapasqua/qwen3-tts

Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch...

47
Emerging
734 BatuhanYilmaz26/Auto-Subtitled-Video-Generator

Input a YouTube video link or upload a video file and get a video with subtitles.

47
Emerging
735 IBM/MAX-Speech-to-Text-Converter

Converts spoken words into text form.

47
Emerging
736 BogiHsu/Tacotron2-PyTorch

Yet another PyTorch implementation of Tacotron 2 with reduction factor and...

47
Emerging
737 1038lab/ComfyUI-EdgeTTS

ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging...

47
Emerging
738 EgorLakomkin/KTSpeechCrawler

Automatically constructing corpus for automatic speech recognition from...

47
Emerging
739 liuli-moe/to-the-stars

Chinese translation of Puella Magi Madoka Magica: To the Stars (魔法少女小圆 飞向星空)

47
Emerging
740 shamspias/vibevoice-studio

Beautiful voice app: record or upload to train a voice, generate speech from...

47
Emerging
741 Purple-Horizons/openclaw-voice

🦞 Open-source browser-based voice chat for AI assistants. Self-hosted,...

47
Emerging
742 soobinseo/Tacotron-pytorch

PyTorch implementation of Tacotron

47
Emerging
743 xue-fei/sherpa-onnx-unity

sherpa-onnx in unity

47
Emerging
744 amazon-archives/amazon-polly-sample

Sample application for Amazon Polly. Lets you convert any blog into an...

47
Emerging
745 xenova/whisper-web

ML-powered speech recognition directly in your browser

47
Emerging
746 VidyasagarMSC/WatBot

An Android ChatBot powered by IBM Watson Services (Assistant V1,...

47
Emerging
747 louiskirsch/speechT

Open-source speech-to-text software written in TensorFlow

47
Emerging
748 gitmylo/audio-webui

A webui for different audio related Neural Networks

47
Emerging
749 n0name45/node-red-contrib-yandex-station-management

The node-red-contrib-yandex-station-management module for controlling smart...

47
Emerging
750 themanyone/whisper_dictation

Private voice keyboard, AI chat, images, webcam, recordings, voice control...

47
Emerging
751 jimbozhang/kaldi-gop

Kaldi-based goodness of pronunciation (GOP)

47
Emerging
752 prateekkalra/Selection-js

A lightweight JavaScript library which provides users with a set of options...

47
Emerging
753 ArchishmanSengupta/autovoiceevals

A self-improving loop for voice AI agents. Uses karpathy's autoresearch as...

47
Emerging
754 frostming/tetos

A unified interface for multiple Text-to-Speech (TTS) providers.

47
Emerging
755 supershaneski/openai-whisper-talk

openai-whisper-talk is a sample voice conversation application powered by...

47
Emerging
756 Kyubyong/tacotron_asr

Speech Recognition Using Tacotron

47
Emerging
757 embium/solverecaptchas

An async Python library to automate solving ReCAPTCHA v2 using Playwright.

47
Emerging
758 linto-ai/linto-studio

Transcription and annotation interface for recorded audio or video files

47
Emerging
759 ivcylc/OpenMusic

OpenMusic: SOTA Text-to-music (TTM) Generation

47
Emerging
760 modelscope/ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained...

47
Emerging
761 joelpurra/talkie

Text-to-speech browser extension button. Select text on any web page, and...

46
Emerging
762 coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

46
Emerging
763 ImNimboss/uberduck

A synchronous and asynchronous API wrapper for the UberDuck text-to-speech...

46
Emerging
764 Open-Speech-EkStep/vakyansh-models

Open source speech to text models for Indic Languages

46
Emerging
765 goxr3plus/java-google-speech-api

🙊 Speech Recognition, Text To Speech, Google Translate

46
Emerging
766 harmlessman/PAFTS

PAFTS: a library that preprocesses audio for TTS.

46
Emerging
767 sipeed/Maix-Speech

Maix Speech AI lib, a fast and small speech lib running on embedded devices,...

46
Emerging
768 deterministic-algorithms-lab/Cross-Lingual-Voice-Cloning

Tacotron 2 - PyTorch implementation with faster-than-realtime inference...

46
Emerging
769 maxwellobi/Android-Speech-Recognition

Continuous speech recognition library for Android with options to use...

46
Emerging
770 phatjkk/SpeakIt_Vietnamese_TTS

Vietnamese Text-to-Speech on Windows Project (zalo-speech)

46
Emerging
771 mark-rez/TikTok-Voice-TTS

Simple Python script to interact with the TikTok TTS Voices.

46
Emerging
772 DePasqualeOrg/mlx-swift-audio

Swift tools for text to speech (TTS) and speech to text (STT) powered by MLX

46
Emerging
773 HumeAI/hume-react-sdk

Packages for using Hume AI and React

46
Emerging
774 keonlee9420/Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional,...

46
Emerging
775 gooofy/py-picotts

Python wrappers around SVOX Pico TTS

46
Emerging
776 ORI-Muchim/Efficient-Speech

Lightweight Korean TTS Model based on FastSpeech2

46
Emerging
777 SARIT42/lipsyncr

LipSyncr is a lip reading web app based on the LipNet model that can lip...

46
Emerging
778 smeetrs/deep_avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

46
Emerging
779 themanyone/voice_typing

State-of-the-art offline (or networked) voice typing everywhere + text...

46
Emerging
780 cvqluu/simple_diarizer

Simplified diarization pipeline using some pretrained models - audio file to...

46
Emerging
781 revdotcom/fstalign

An efficient OpenFST-based tool for calculating WER and aligning two...

46
Emerging
782 d4n3436/Fergun

A utility Discord bot written in C# using Discord.Net

46
Emerging
783 kkoutini/PaSST

Efficient Training of Audio Transformers with Patchout

46
Emerging
784 IhorShevchuk/RHVoice-spm

A free and open source speech synthesizer with support for a lot of languages...

46
Emerging
785 eel-brah/kokorodoki

Natural-sounding Text-to-Speech App that fits anywhere. Fast, Real-Time and flexible.

46
Emerging
786 alexram1313/text-to-speech-sample

Python3 Text to Speech Video Sample

46
Emerging
787 freewym/espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

46
Emerging
788 yl4579/StyleTTS

Official Implementation of StyleTTS

46
Emerging
789 microsoft/SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

46
Emerging
790 funcwj/aps

A personal toolkit for single/multi-channel speech recognition & enhancement...

46
Emerging
791 acoti/articulate.js

A jQuery plugin that lets the browser speak to you.

46
Emerging
792 hans00/phonemize

Pure JS fast phonemizer with rule-based G2P prediction

46
Emerging
793 cpfair/quran-align

Word-accurate timestamps for Qur'anic audio.

46
Emerging
794 sl5net/SL5-aura-service

Your offline, privacy-first voice assistant framework. Transform speech into...

46
Emerging
795 VideotronicMaker/LM-Studio-Voice-Conversation

Python app for LM Studio-enhanced voice conversations with local LLMs. Uses...

46
Emerging
796 overcrash66/OpenTranslator

Open Translator: Speech To Speech and Speech to text Translator with voice...

46
Emerging
797 TheNewC0der-24/Textonus

Voice to Text Online Notepad Professional, Accurate & Free Speech...

46
Emerging
798 arihanv/Shush

Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on...

46
Emerging
799 jeroenterheerdt/pycsspeechtts

Python (py) library to use Microsoft's Cognitive Services Speech (csspeech)...

46
Emerging
800 FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System

46
Emerging