All Voice AI Tools

8,165 tools ranked by quality score · Page 10 of 82

Showing 901–1000 of 8,165
# Tool Score Tier
901 Kaljurand/speechutils

Android library for speech-to-text and text-to-speech apps

45
Emerging
902 vineeths96/Spoken-Keyword-Spotting

In this repository, we explore using a hybrid system consisting of a...

45
Emerging
903 AlexandaJerry/whisper-vits-japanese

Vits Japanese with Whisper as data processor (you can train your VITS even...

45
Emerging
904 baidubce/pie

Baidu Cloud streaming speech recognition client SDK

45
Emerging
905 GetcharZp/go-speech

go-speech: a lightweight speech library built on Golang + ONNX, supporting TTS (text-to-speech) and ASR (speech-to-text). Already integrates...

45
Emerging
906 finchvox/finchvox

Voice AI Observability, Elevated

45
Emerging
907 Oknolaz/vasisualy

Vasisualy is a simple Russian-language voice assistant written in Python...

45
Emerging
908 szimek/webrtc-translate

Highly experimental (read: "barely working") app that uses WebRTC API and...

45
Emerging
909 jaco-bro/diajax

Dia-JAX: A JAX port of Dia, the text-to-speech model for generating...

45
Emerging
910 Elleo/pied

Pied makes it simple to install and manage text-to-speech Piper voices for...

45
Emerging
911 skit-ai/kaldi-serve

Server framework for Kaldi ASR Toolkit

45
Emerging
912 juliuskunze/speechless

Speech-to-text based on wav2letter built for transfer learning

45
Emerging
913 HAKORADev/VODER

Voice Operation and Design Engine with Reproduction capabilities

45
Emerging
914 pth2000/PowerPointReviewer

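A speech-script read-aloud and review tool built with PySide6. It uses a TTS engine to read the notes section of a PPT aloud, helping you refine the content and wording of your talk and deliver a polished PowerPoint presentation.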

45
Emerging
915 fikrikarim/parlor

On-device, real-time multimodal AI. Have natural voice and vision...

45
Emerging
916 Kyubyong/cross_vc

Cross-lingual Voice Conversion

45
Emerging
917 danthelion/doc2audiobook

Convert text documents to high fidelity audio(books).

45
Emerging
918 pofice/voice-input-method

AI-native, cross-platform, offline voice input method

45
Emerging
919 alexiokay/AriLink

Modern ARI-STASI server, built on Asterisk ARI with real-time speech-to-text...

45
Emerging
920 thewh1teagle/phonikud-tts

phonikud-tts - text to speech in Hebrew

45
Emerging
921 inevolin/DiscordSpeechBot

A speech-to-text bot for discord with music commands and more using NodeJS....

45
Emerging
922 silversparro/wav2letter.pytorch

A fully convolutional network for speech-to-text, built on PyTorch.

45
Emerging
923 awsaf49/audio_classification_models

Tensorflow Audio Classification Models

45
Emerging
924 Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

45
Emerging
925 AIGC-Audio/AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

45
Emerging
926 RapidAI/RapidASR

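📣 Commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese-English speech. A cross-platform implementation of ASR...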

45
Emerging
927 Picovoice/falcon

On-device speaker diarization powered by deep learning

45
Emerging
928 Cay-Zhang/SwiftSpeech

A speech recognition framework designed for SwiftUI.

45
Emerging
929 IBM/speech-to-text-code-pattern

WARNING: This repository is no longer maintained

45
Emerging
930 spring-media/DeepPhonemizer

Grapheme to phoneme conversion with deep learning.

45
Emerging
931 voicetestdev/voicetest

Test harness for voice agents. Import from Retell, VAPI, Bland, LiveKit. Run...

45
Emerging
932 gsssrao/UnityAndroidSpeechRecognition

This repository is a Unity plugin for Android Speech Recognition (based on...

45
Emerging
933 rhasspy/rhasspy

Offline private voice assistant for many human languages

45
Emerging
934 deepgram-starters/django-transcription

Get started using Deepgram's Transcription with this Django demo app

45
Emerging
935 Tinkoff/voicekit-examples

Examples on how to use Tinkoff Voicekit

45
Emerging
936 huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller,...

45
Emerging
937 theblackcat102/edgedict

Working online speech recognition based on RNN Transducer. ( Trained model...

45
Emerging
938 maum-ai/univnet

Unofficial PyTorch Implementation of UnivNet Vocoder...

45
Emerging
939 synesthesiam/rhasspy

Rhasspy voice assistant for offline home automation

45
Emerging
940 Onuronon-lab/Shrutik

Open-source voice data collection platform for building inclusive voice...

45
Emerging
941 ibotplus/kbase-media

Video, audio, and image content recognition, speech transcription, and speech synthesis / easily convert video, audio, and images to text, and revert...

45
Emerging
942 daanzu/deepspeech-websocket-server

Server & client for DeepSpeech using WebSockets for real-time speech...

45
Emerging
943 aman179102/podvoice

Local-first CLI that turns Markdown scripts into multi-speaker podcast-style...

45
Emerging
944 arghyasur1991/LiveTalk-Unity

LiveTalk is a unified, high-performance talking head generation system that...

45
Emerging
945 yukukotani/pi-voice

Headless voice interface for the Pi Coding Agent

45
Emerging
946 zhao-kun/VibeVoiceFusion

VibeVoiceFusion is a full-stack, multi-speaker voice generation web system...

45
Emerging
947 WindQAQ/listen-attend-and-spell

Tensorflow implementation of "Listen, Attend and Spell" authored by William...

45
Emerging
948 cool-japan/voirs

VoiRS is a cutting-edge Text-to-Speech (TTS), Voice Recognition, Sound...

45
Emerging
949 thewh1teagle/piper-rs

Use piper TTS models in Rust

45
Emerging
950 benjaminwan/ChineseTtsTflite

Android Chinese TTS engine based on TensorFlow TTS, for use with TfLite models...

45
Emerging
951 simonw/ospeak

CLI tool for running text through OpenAI Text to speech

45
Emerging
952 NickZaitsev/ru-normalizr

ru-normalizr: the best open-source Russian text normalizer. Converts...

45
Emerging
953 mush42/sonata-nvda

This add-on implements a speech synthesizer driver for NVDA using neural TTS...

45
Emerging
954 rishikksh20/Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

45
Emerging
955 tarepan/VoiceConversionLab

A collection of voice conversion research

45
Emerging
956 rohit-lakhanpal/ai-hackathon-starter-kit

This project has been created to make AI accessible and easy for everyone....

45
Emerging
957 keenresearch/KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING:...

45
Emerging
958 Amirrezahmi/SelfTalker

Engage in conversation with your virtual self using AI techniques like NLP,...

45
Emerging
959 GuillaumeFalourd/formulas-python

Ritchie CLI formulas in Python 🐍

45
Emerging
960 dictate-button/dictate-button

Customizable Web Component that adds speech-to-text dictation capabilities...

45
Emerging
961 nl8590687/ASRT_SDK_WinClient

A Windows client SDK and demo software for the ASRT speech recognition system....

45
Emerging
962 trldvix/youtube-transcript-api

Java library which allows you to retrieve subtitles/transcripts for a single...

45
Emerging
963 Edresson/YourTTS

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion...

45
Emerging
964 chenyme/Chenyme-AAVT

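A fully automatic (audio) video translation project. It uses Whisper to recognize speech, an AI large language model to translate the subtitles, and then merges the subtitles into the video to produce the translated video.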

45
Emerging
965 earlephilhower/BackgroundAudio

Arduino library for easy, interrupt driven speech, MP3, AAC, and WAV...

44
Emerging
966 FelixWaweru/elevenlabs-node

Eleven Labs text to speech package for NodeJS. You can use the official...

44
Emerging
967 markomijic/TTS-Mod-Vault

Cross-platform Tabletop Simulator mod backup & download tool — the modern...

44
Emerging
968 zslrmhb/Omniverse-Virtual-Assisstant

Audio2Face Avatar with Riva SDK functionality

44
Emerging
969 puntorigen/podcast_tts

A class for generating realistic audio (TTS) for podcasts and dialogues.

44
Emerging
970 notAI-tech/IndicASR

Speech recognition for Indic languages.

44
Emerging
971 blip-radar/vatsim-parser

Parser for a variety of VATSIM-related file formats

44
Emerging
972 HeyHeyChicken/NOVA-NodeJS

NOVA is a customizable voice assistant made with Node.js.

44
Emerging
973 zw76859420/ASR_WORD

Builds an acoustic model with an end-to-end approach, using characters as the modeling unit and a DCNN-CTC network architecture.

44
Emerging
974 NotAbhinavGamerz/emotion-aware-automatic-speech-recognition

🎤 Enhance speech recognition by detecting emotions in spoken language,...

44
Emerging
975 neosun100/cosyvoice-docker

🎙️ CosyVoice All-in-One Docker - Production-ready TTS with Web UI, REST API...

44
Emerging
976 snakers4/open_stt

Open STT

44
Emerging
977 Rubiksman78/MonikA.I

Submod for MAS with AI based features

44
Emerging
978 tarun7r/SpeechAlgo

A Comprehensive Speech Processing Algorithms Library for research and production use

44
Emerging
979 aofdev/vue-pwa-speech

Vue 2 synchronous speech recognition (speech-to-text) with Google Cloud...

44
Emerging
980 coqui-ai/TTS-papers

🐸 collection of TTS papers

44
Emerging
981 aofdev/vue-speech-streaming

Vue 2 streaming speech recognition (speech-to-text) with Google Cloud Speech

44
Emerging
982 gurjar1/OmniDictate

Free, open-source, real-time dictation for Windows. Runs locally (no...

44
Emerging
983 inclusionAI/Ming-UniAudio

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing...

44
Emerging
984 persephone-tools/persephone

A tool for automatic phoneme transcription

44
Emerging
985 r9y9/ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

44
Emerging
986 SEPIA-Framework/sepia-stt-server

SEPIA server to support open-source speech recognition via WebSocket connection.

44
Emerging
987 nodef/extra-googletts

Generate speech audio from super long text through machine (via "Google...

44
Emerging
988 jishengpeng/WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second...

44
Emerging
989 dunky11/voicesmith

[WIP] VoiceSmith makes training text to speech models easy.

44
Emerging
990 sq2ips/sr0wx

Modernized version of the sr0wx automatic amateur-radio weather station project

44
Emerging
991 Ashish-Patnaik/kokoclone

Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and...

44
Emerging
992 coqui-ai/TTS-recipes

🐸TTS recipes for different datasets

44
Emerging
993 sljeff/anycast

An AI-Powered Podcast App.

44
Emerging
994 LSimon95/megatts2

Unofficial implementation of Megatts2

44
Emerging
995 Pictalk-speech-made-easy/pictalk-frontend

Pictalk is an open-source application designed to assist individuals with...

44
Emerging
996 mozhou-tech/kim-voice-assistant

Kim, your personal voice kit for Home Intelligence.

44
Emerging
997 shijincai/VibeVoice

Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup...

44
Emerging
998 yrom/finetune-index-tts

IndexTTS Fine-tuning notebooks

44
Emerging
999 keonlee9420/Comprehensive-Transformer-TTS

A Non-Autoregressive Transformer based Text-to-Speech, supporting a family...

44
Emerging
1000 oddlama/whisper-overlay

A wayland overlay providing speech-to-text functionality for any application...

44
Emerging