All Voice AI Tools

8,165 tools ranked by quality score · Page 5 of 82

Showing 401–500 of 8,165
# Tool Score Tier
401 Devansh-47/Sign-Language-To-Text-and-Speech-Conversion

A Python application that converts American Sign Language into text...

51
Established
402 gokhaneraslan/chatterbox-finetuning

Fine-tuning toolkit for Chatterbox TTS & Chatterbox TURBO models. Supports...

51
Established
403 hirofumi0810/neural_sp

End-to-end ASR/LM implementation with PyTorch

51
Established
404 pnnbao97/sea-g2p

Fast multilingual text-to-phoneme converter for South East Asian languages.

51
Established
405 aws-samples/amazon-transcribe-live-call-analytics

Amazon Transcribe Live Call Analytics (LCA) Sample Solution

51
Established
406 PhamHuynhAnh16/Vietnamese-RVC

A voice conversion tool project for Vietnamese users

51
Established
407 revdotcom/revai-python-sdk

Rev AI Python SDK

51
Established
408 google/voice-builder

An open-source text-to-speech (TTS) voice building tool

51
Established
409 rse/speechflow

Speech Processing Flow Graph

51
Established
410 DemisEom/SpecAugment

An implementation of SpecAugment, introduced by Google Brain, in TensorFlow & PyTorch

51
Established
411 Saik0s/Whisperboard

The open-source iOS app that's making quality voice transcription more...

51
Established
412 gotev/android-speech

Android speech recognition and text to speech made easy

51
Established
413 mutablelogic/go-whisper

Speech-to-Text in golang

51
Established
414 233stone/vocotype-cli

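VocoType is a privacy-safe voice input tool that runs locally on-device; a hotkey converts speech to text in real time and types it into the current application. Supports speech-to-text MCP, AI...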

51
Established
415 sooftware/kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition...

51
Established
416 noahchalifour/rnnt-speech-recognition

End-to-end speech recognition using RNN Transducers in TensorFlow 2.0

51
Established
417 Rayhane-mamah/Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

51
Established
418 descriptinc/melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

51
Established
419 Kaljurand/K6nele

An Android app that offers speech-to-text user interfaces to other apps

51
Established
420 googleapis/nodejs-speech

This repository is deprecated. All of its content and history has been moved...

51
Established
421 lkuza2/java-speech-api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using...

51
Established
422 ggeop/Python-ai-assistant

Python AI assistant 🧠

51
Established
423 thinhlpg/vixtts-demo

A Vietnamese Voice Cloning Text-to-Speech Model ✨

51
Established
424 alumae/kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit...

51
Established
425 jcsilva/docker-kaldi-gstreamer-server

Dockerfile for kaldi-gstreamer-server.

51
Established
426 zw76859420/ASR_Theory

Speech recognition theory, papers, and slides (PPT)

51
Established
427 speechio/chinese_text_normalization

Chinese text normalization for speech processing

51
Established
428 daswer123/xtts-webui

Webui for using XTTS and for finetuning it

51
Established
429 cosin2077/easyVoice

Open-source text-to-speech tool supporting very long texts and multi-character voice-over

51
Established
430 i4Ds/whisper-finetune

This repository contains code for fine-tuning the Whisper speech-to-text model.

51
Established
431 Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech...

51
Established
432 lifeiteng/vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo...

51
Established
433 philipperemy/deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.

51
Established
434 Azure-Samples/SpeechToText-WebSockets-Javascript

SDK & sample for speech recognition using WebSockets in JavaScript

51
Established
435 botbahlul/autosrt

A Python command-line utility to auto-generate subtitle files (using...

51
Established
436 jik876/hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity...

51
Established
437 rxlabz/speech_recognition

A Flutter plugin to use speech recognition on iOS & Android (Swift/Java)

51
Established
438 fatchord/WaveRNN

WaveRNN Vocoder + TTS

51
Established
439 riderodd/react-native-vosk

Speech recognition module for react native using Vosk library

51
Established
440 r9y9/deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech...

51
Established
441 symblai/getting-started-samples

Code samples to get started quickly with Symbl's Voice SDK and APIs:...

51
Established
442 xcmyz/FastSpeech

An implementation of FastSpeech based on PyTorch.

51
Established
443 Amey-Thakur/DEEPFAKE-AUDIO

🎙️ Deepfake Audio – A neural voice cloning studio powered by SV2TTS technology.

51
Established
444 kadirnar/VoiceHub

VoiceHub: A Unified Inference Interface for TTS Models

51
Established
445 mewmix/nabu

A multi engine TTS & LLM edge computing playground with audio book features...

50
Established
446 lovelyterry/SmartSpeaker

A smart control device based on cloud speech recognition, similar to Tmall Genie and XiaoAI. Uses the stm32f407, wm8978, and esp8266 chips.

50
Established
447 alumae/gst-kaldi-nnet2-online

GStreamer plugin around Kaldi's online neural network decoder

50
Established
448 Jackiexiao/MTTS

A Demo of Mandarin/Chinese TTS frontend

50
Established
449 petermg/Chatterbox-TTS-Extended

Modified version of Chatterbox that accepts text files as input and no...

50
Established
450 jackaduma/CycleGAN-VC2

Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2

50
Established
451 robinhad/ukrainian-tts

Ukrainian TTS (text-to-speech) using ESPNET

50
Established
452 j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI...

50
Established
453 antor44/livestream_video

playlist4whisper manages media streams playlists for livestream_video.sh,...

50
Established
454 AlexandaJerry/vits-mandarin-biaobei

Application of VITS to Mandarin TTS

50
Established
455 ZDisket/TensorVox

Desktop application for neural speech synthesis written in C++

50
Established
456 byjlw/video-analyzer

Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition

50
Established
457 ai-bot-pro/achatbot

An open source chat bot architecture for voice/vision (and multimodal)...

50
Established
458 cboard-org/cboard-api

Cboard API provides backend functionality and persistence to the Cboard application

50
Established
459 aahl/qwen-asr2api

🎤 Qwen 3 ASR to OpenAI API, a free STT speech recognition model

50
Established
460 AI-Manga-Readers/AI_Manga_Reader

AI Manga Reader is a next-gen manga app powered by the MangaDex API,...

50
Established
461 woheller69/whoBIRD

Identify bird sounds in real time with this Android version of BirdNET. Bird...

50
Established
462 isaiahbjork/orpheus-tts-local

Run Orpheus 3B Locally With LM Studio

50
Established
463 YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

50
Established
464 YuanGongND/whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT:...

50
Established
465 arjo129/uSpeech

Speech recognition toolkit for the Arduino

50
Established
466 Tomiinek/Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with...

50
Established
467 gpustack/vox-box

A text-to-speech and speech-to-text server compatible with the OpenAI API,...

50
Established
468 inevolin/DiscordEarsBot

A speech-to-text framework and bot for Discord. Take control of your Discord...

50
Established
469 FlashLabs-AI-Corp/FlashLabs-Chroma

World's first open-source real-time end-to-end spoken dialogue model with...

50
Established
470 jaywalnut310/vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for...

50
Established
471 dmisol/flexatar-virtual-webcam

Personalized Virtual Webcam for WebRTC

50
Established
472 rendchevi/nix-tts

🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation

50
Established
473 mastashake08/speech-kit

Simplifying the Speech Synthesis and Speech Recognition engines for...

50
Established
474 common-voice/cv-dataset

Metadata and versioning details for the Common Voice dataset

50
Established
475 huihut/Facemoji

😆 A voice chatbot that can imitate your expression....

50
Established
476 hirofumi0810/tensorflow_end2end_speech_recognition

End-to-end speech recognition implementation based on TensorFlow (CTC,...

50
Established
477 wildminder/ComfyUI-VibeVoice

ComfyUI custom node for the VibeVoice TTS. Expressive, long-form,...

50
Established
478 pbakondy/cordova-plugin-speechrecognition

🎤 Cordova Plugin for Speech Recognition

50
Established
479 SforAiDl/Neural-Voice-Cloning-With-Few-Samples

This repository has an implementation of "Neural Voice Cloning With Few Samples"

50
Established
480 palmerabollo/bingspeech-api-client

Microsoft Bing Speech API client in node.js

50
Established
481 vlomme/Multi-Tacotron-Voice-Cloning

Phoneme multilingual (Russian-English) voice cloning based on

50
Established
482 nari-labs/dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

50
Established
483 oseiskar/autosubsync

Automatically synchronize subtitles with audio using machine learning

50
Established
484 NTT123/vietTTS

Vietnamese Text to Speech library

50
Established
485 iver56/torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful...

50
Established
486 Cvandia/nonebot-plugin-fishspeech-tts

A TTS plugin for nonebot2 using fish-speech and fish-audio

50
Established
487 ArthurFDLR/whisper-youtube

🔉 YouTube video transcription with OpenAI's Whisper

50
Established
488 Saganaki22/ComfyUI-OmniVoice-TTS

OmniVoice TTS nodes for ComfyUI - Zero-shot multilingual text-to-speech with...

50
Established
489 WhisperSpeech/WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

50
Established
490 AntoBrandi/Robotics-and-ROS-2-Learn-by-Doing-Manipulators

3D-printed robot arm powered by ROS 2 and Arduino and controlled via...

50
Established
491 cyberofficial/Synthalingua

Synthalingua - Real Time Translation

50
Established
492 israelg99/deepvoice

Deep Voice: Real-time Neural Text-to-Speech

50
Established
493 dmotz/thing-translator

📷 🗣 Point your camera at things to hear how to say them in a different language

50
Established
494 shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for...

50
Established
495 skshadan/TTS-RVC-API

Text to Speech using Coqui TTS + RVC

50
Established
496 hmirin/speechy

A Chrome extension for high-quality Text-to-Speech APIs like Google's...

50
Established
497 sdsds222/Unitale

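An AI audiobook production tool based on Indextts and Qwen3TTS. Uses an LLM to automatically break down scripts and recognize emotions, integrating multi-character TTS...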

50
Established
498 FL33TW00D/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

50
Established
499 MycroftAI/mimic-recording-studio

Mimic Recording Studio is a Docker-based application you can install to...

50
Established
500 bravekingzhang/text2video

Half a miracle tool 👉 a one-click text-to-video tool

50
Established