All Voice AI Tools

8,165 tools ranked by quality score · Page 20 of 82

Showing 1901–2000 of 8,165
# Tool Score Tier
1901 habla-liaa/ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from...

37
Emerging
1902 AlexxIT/FasterWhisper

Faster Whisper for Home Assistant - custom integration with a local...

37
Emerging
1903 zhihanyang2022/gender-audio-classification

A speaker gender classifier. MFC feature engineering and a pre-trained...

37
Emerging
1904 anyvoiceai/Barkify

Barkify: an unofficial training implementation of Bark TTS by suno-ai

37
Emerging
1905 Rongjiehuang/Multiband-WaveRNN

An unofficial implementation of the autoregressive vocoder Multiband-WaveRNN. Audio...

37
Emerging
1906 daswer123/silero-tts-enhanced

Silero TTS Enhanced is a Python library that enhances the original Silero...

37
Emerging
1907 rerender2021/echo

A simple ASR translator powered by Avernakis React.

37
Emerging
1908 Mateusz-Dera/whisperspeech-webui

Simple WhisperSpeech web UI

37
Emerging
1909 zsl24/Tacotron2-Mandarin-HiFiGAN

Implementation of TTS with combination of Tacotron2 and HiFi-GAN

37
Emerging
1910 billiax/voxglide

Embeddable voice AI SDK for web pages — speak to fill forms, click buttons,...

37
Emerging
1911 KevKibe/African-Whisper

🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual...

37
Emerging
1912 tcsenpai/audiocoqui

A multilingual tool to convert PDF ebooks to audiobooks using XTTS v2 TTS...

37
Emerging
1913 e-c-k-e-r/vall-e

An unofficial PyTorch implementation of VALL-E

37
Emerging
1914 j3soon/speech-to-windows-input

Perform speech-to-text (STT/ASR) with Azure speech service and simulate...

37
Emerging
1915 asaddi/f5-tts-serve

A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful...

37
Emerging
1916 msalhab96/MultiSpeech

PyTorch implementation for MultiSpeech: Multi-Speaker Text to Speech with...

37
Emerging
1917 erogol/FFTNet

FFTNet vocoder implementation

37
Emerging
1918 SILMA-AI/silma-tts

SILMA TTS v1 Official Repo — a Lightweight Open Bilingual Text to Speech Model

37
Emerging
1919 Rishav-Agarwal/Translate-Language_Translator

An Android app that allows you to translate text and phrases between 90+...

37
Emerging
1920 zabir-nabil/bangla-tts

Bangla text to speech, Multilingual (Bangla, English) real-time speech...

37
Emerging
1921 nestyme/Subtitles-generator

Generates a transcript for a video from a link

37
Emerging
1922 Alenkar/kairos-asr

An adapted ASR pipeline for convenient integration into other applications in...

37
Emerging
1923 deepgram-devs/deepgram-demos-rust

Useful demo applications for Deepgram Voice AI APIs, using the Rust language! 🦀

37
Emerging
1924 daanzu/kaldi_ag_training

Docker image and scripts for training finetuned or completely personal Kaldi...

37
Emerging
1925 SpenserCai/cosyvoice3.rs

Python bindings for CosyVoice3 TTS using Candle. Has the characteristics of...

37
Emerging
1926 Yeti47/Vosk4Unity

Vosk4Unity is a module for the Unity Engine that provides a simple way to...

37
Emerging
1927 ORI-Muchim/PolyLangVITS

Multi-speaker Speech Synthesis Using VITS (KO, JA, EN, ZH)

37
Emerging
1928 MartinMashalov/VoiceCloning

Generative voice cloning model using TTS synthesis with state-of-the-art...

37
Emerging
1929 opus-arc/Bibo_No-Aozora

A memory- and algorithm-driven ear-training software (still under active...

37
Emerging
1930 Jobix-Ai/Iso-Vox

STT 90% Solved — Isolate specific speakers from multi-speaker "cocktail...

37
Emerging
1931 hcy71o/MB-iSTFT-VITS-with-AutoVocoder

Incorporating AutoVocoder to MB-iSTFT-VITS

37
Emerging
1932 Adibian/ResGrad

Unofficial implementation of ResGrad: Residual Denoising Diffusion...

37
Emerging
1933 maketheproduct/flowstay

Flowstay is a macOS app that allows instant transcription across all your...

37
Emerging
1934 wongfei/UEHMI

Unreal Engine Human Machine Interface

37
Emerging
1935 amscotti/hn-podcaster

The HackerNews Podcaster is a JavaScript application that utilizes the power...

37
Emerging
1936 m15-ai/Faster-Local-Voice-AI

A real-time, fully local voice AI system optimized for low-resource devices...

37
Emerging
1937 Sundy1219/eesen-for-thchs30

ASR for Chinese Mandarin

37
Emerging
1938 hipnologo/EchoForge_Studio

Multi-LLM writing and voice production workspace built with Streamlit.

37
Emerging
1939 hypeapps/black-mirror

A voice-controlled smart mirror powered by Raspberry Pi 3 and Android Things.

37
Emerging
1940 outspeed-ai/voice-devtools

Developer tools to debug and build realtime voice agents. Supports multiple models.

37
Emerging
1941 LetsPlayNow/Speech_AI

Speech to speech bot built with Python

37
Emerging
1942 The-Data-Dilemma/ParquetToHuggingFace

ParquetToHuggingFace processes raw audio data, converts it into Parquet...

37
Emerging
1943 lucasnewman/descript-mlx

Implementation of the Descript Audio Codec in MLX

37
Emerging
1944 sshh12/Recording-Bot

A bot built to record and transcribe audio fragments from Discord.

37
Emerging
1945 zolomohan/speech-recognition-in-javascript

Final Code for Speech Recognition in JavaScript tutorial.

37
Emerging
1946 aws-samples/amazon-transcribe-email-workflow

An Amazon Transcribe demo for "speech-to-text" conversion performed through...

37
Emerging
1947 JSON2Video/json2video-nodejs-sdk

Create videos programmatically in the cloud from NodeJS: add watermarks,...

37
Emerging
1948 FomTarro/word-salad

Twitch TTS redeem that uses sentence mixing instead of synthesis.

37
Emerging
1949 aabdurakhmanov/uzbekcha-gapir

A desktop program that pronounces text in Uzbek | Text to speech...

37
Emerging
1950 eliangerard/simple-tts-mp3

Converts text to MP3 audio using google-tts-api, with no limit

37
Emerging
1951 twirapp/silero-tts-api-server

This is a simple server that uses Silero models to convert text to audio...

37
Emerging
1952 andi611/Conditional-SpecGAN-Tensorflow

Text-to-Speech Synthesis by Generating Spectrograms using Generative...

37
Emerging
1953 jesseward/azuretexttospeech

A Go library for Azure's Cognitive Services text-to-speech API.

37
Emerging
1954 valeriorlandini/sonus

A Max/MSP package for sound experimentation and algorithmic composition

37
Emerging
1955 Vazgen005/discord-virtual-micro

Says everything you type in Discord for you using AI (Silero Models)

37
Emerging
1956 xeden3/MSSpeechServer

MSSpeechServer is a REST server based on the Microsoft Speech Platform that...

37
Emerging
1957 rshahamiri/SpeechVision

Speech Vision (SV) is a Dysarthric Speech Recognition System that adopts a...

37
Emerging
1958 hhguo/SoCodec

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

37
Emerging
1959 TheCodeTraveler/XamSpeak

An iOS and Android app that will dictate text from a photo. XamSpeak...

37
Emerging
1960 DarioFT/ComfyUI-Qwen3-TTS

A ComfyUI custom node suite for Qwen3-TTS, supporting 1.7B and 0.6B models,...

37
Emerging
1961 unilight/jatts

JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit

37
Emerging
1962 Kalebu/Python-Speech-Recognition-

This consists of basic examples of performing Speech Recognition in Python...

37
Emerging
1963 surajondev/text-to-speech

Convert text into speech

37
Emerging
1964 tts-hub/monotonic_alignment_search

Monotonically align text and speech

37
Emerging
1965 AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish

AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech

37
Emerging
1966 koudounasalkis/AI4Voice

This repo contains the code for "Voice Disorder Analysis: A...

37
Emerging
1967 fano2458/Zhadiger-Kazakh-Language-AI

AI services project "Zhadiger" for Kazakh Language developed using NVIDIA...

37
Emerging
1968 lottev1991/Project-AIdol-Public-English-Dataset

Public female English corpus used for Project AI❤dol

37
Emerging
1969 johnGettings/LIHQ

Long-Inference, High Quality Synthetic Speaker (AI avatar/ AI presenter)

37
Emerging
1970 yousefkotp/Egyptian-Arabic-ASR-and-Diarization

The official submission from Speech Squad team for the MTC-AIC 2 competition...

37
Emerging
1971 NONAN23x/WhisperingNova

An AI voice changer harnessing the power of OpenAI and VoiceVox for...

37
Emerging
1972 clarinsi/Slovene_ASR_e2e

Automatic Speech Recognition tool

37
Emerging
1973 lars76/forced-alignment-chinese

Mandarin Chinese audio datasets aligned with Montreal Forced Aligner

37
Emerging
1974 sp-squared/Turkic-Languages-Audio-to-Text-Transcription

Open-source Automatic Speech Recognition (ASR) pipeline for Bashkir...

37
Emerging
1975 binzhouchn/masr

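A Chinese speech recognition series; readers can use it to quickly train their own Chinese speech recognition models, or use the pre-trained models directly to test the results.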

37
Emerging
1976 troykelly/live-news-break

An advanced tool designed for creating automated news bulletins. It...

37
Emerging
1977 mikopbx/ModuleRHVoice

Text to speech voice generator by the RHVoice algorithm

37
Emerging
1978 XilinJia/Podcini

Open source podcast instrument for Android supporting contents from YouTube...

37
Emerging
1979 nemoramo/acoustic_model

This is a sub-repository, still being built, for creating an acoustic model in Mandarin...

37
Emerging
1980 revsic/speechset

Numpy-librosa implementation of Speech dataset pipeline

37
Emerging
1981 OPEXGroup/ITCC.YandexSpeechKitClient

Cross-platform client for Yandex SpeechKit Cloud API

37
Emerging
1982 zassou65535/VITS

Text-to-speech reader and voice changer based on VITS

37
Emerging
1983 kcitlyn/PolyScribe_Desktop

Fully-offline transcription and translator w/ speech-to-text and...

37
Emerging
1984 betaoverflow/donna

Transform your smart devices to intelligent communicators.

37
Emerging
1985 xnmeet/voi

A text-to-speech plugin for [Bob](https://bobtranslate.com/) that uses a locally deployed Kokoro model as the speech synthesis service.

37
Emerging
1986 hamzaehsan97/Speech_Recognition_CNN

CNN (Convolutional Neural Networks) Speech Recognition

37
Emerging
1987 vectominist/End-to-end-ASR-Pytorch-DLHLP

Joint CTC-Attention End-to-end Speech Recognition - PyTorch Implementation...

37
Emerging
1988 black-roland/homeassistant-salutespeech

SaluteSpeech integration for Home Assistant providing speech-to-text and...

37
Emerging
1989 AlimTleuliyev/image-to-audio

Image Captioning and Text-to-Speech

37
Emerging
1990 LlmKira/fast-langdetect

⚡️ 80x faster Fasttext language detection out of the box | Split text by language

37
Emerging
1991 mravanelli/pySpeechRev

This Python code performs an efficient speech reverberation starting from a...

37
Emerging
1992 reyniel26/bleepy

Bleepy is a Python program that can block Tagalog and English profanity in...

37
Emerging
1993 kaloprojects/KALO-ESP32-Voice-Chat-AI-Friends

ESP32-based voice device for chatting with multiple custom AI bots....

37
Emerging
1994 Wendison/FCL-taco2

Official implementation of FCL-taco2: Fast, Controllable and Lightweight...

37
Emerging
1995 DeutscheKI/tevr-asr-tool

State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines...

37
Emerging
1996 geekgirljoy/PHP

Examples of my PHP Code

37
Emerging
1997 tuanh123789/AdaSpeech

An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for...

37
Emerging
1998 nikhilunni/demucs-rs

Rust powered waveform source separation

37
Emerging
1999 duckysmacky/signsense

An Android app for translating dactyl sign language into text

37
Emerging
2000 iGerman00/Pollyduble

An experimental proof-of-concept script to automatically dub videos to...

37
Emerging