All Voice AI Tools

8,165 tools ranked by quality score · Page 9 of 82

Showing 801–900 of 8,165
# Tool Score Tier
801 pinguy/kokoro-tts-addon

Local neural TTS for Browsers: fast, expressive, and offline—runs on modest hardware.

46
Emerging
802 oliverguhr/wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

46
Emerging
803 shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting...

46
Emerging
804 tarun7r/Vocal-Agent

Cascading voice assistant combining real-time speech recognition, AI...

46
Emerging
805 drien/tts-joinery

Stitch together text-to-speech over 4096 characters via the OpenAI API

46
Emerging
806 duncan3dc/speaker

A PHP library to convert text to speech using various web services

46
Emerging
807 ishandutta2007/Awesome-Text-to-Speech

🎤 A curated list of the latest and most influential tools, models, and...

46
Emerging
808 black-roland/homeassistant-yandex-speechkit

Yandex SpeechKit integration for Home Assistant providing speech-to-text and...

46
Emerging
809 Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to purchase Whisper...

46
Emerging
810 HKoon/ChatTTS-OpenVoice

Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your...

46
Emerging
811 leaonline/easy-speech

🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no...

46
Emerging
812 chenkui164/FastASR

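A C++ implementation of ASR inference with few dependencies, simple installation, and fast inference; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B...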

46
Emerging
813 neosapience/mlp-singer

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing...

46
Emerging
814 deepgram-starters/node-text-to-speech

Get started using Deepgram's Text-to-Speech with this Node demo app

46
Emerging
815 patrickenfuego/Chapterize-Audiobooks

Split a single, monolithic mp3 audiobook file into chapters using Machine...

46
Emerging
816 joethei/obsidian-tts

Text to speech for Obsidian. Hear your notes.

46
Emerging
817 tabahi/formantfeatures

Extract frequency, power, width and dissonance of formants from wav files

46
Emerging
818 areebbeigh/winspeech

Speech recognition and synthesis library for Windows - Python 2 and 3.

46
Emerging
819 jhuus/HawkEars1

⚠️ HawkEars 1.0 (obsolete). See HawkEars 2.0 → https://github.com/jhuus/HawkEars

46
Emerging
820 morioka/tiny-openai-whisper-api

OpenAI Whisper API-style local server, running on FastAPI

46
Emerging
821 eduardolat/kokoro-web

🔊 Kokoro Web: Free AI text-to-speech, online or self-hosted, OpenAI compatible!

46
Emerging
822 mpaepper/vibevoice

Fast local speech-to-text for any app using faster-whisper

46
Emerging
823 dhruvyad/uttertype

Short code for dictation using OpenAI Whisper for transcription.

46
Emerging
824 Chris10M/Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e., Lip to...

46
Emerging
825 PaddlePaddle/Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer...

46
Emerging
826 sldimitrov/english_learning_system

English Learning System I have developed in order to help others in...

46
Emerging
827 HadrienGardeur/web-speech-recommended-voices

A list of recommended voices for the Web Speech API

46
Emerging
828 pritishyuvraj/Voice-Conversion-GAN

Voice Conversion using CycleGANs for Non-Parallel Data

46
Emerging
829 gtreshchev/RuntimeSpeechRecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal...

46
Emerging
830 yc9701/pansori

Tools for ASR Corpus Generation from Online Video

46
Emerging
831 KernelInterrupt/whisper4dart

whisper4dart is a dart wrapper for whisper.cpp, designed to offer an...

46
Emerging
832 pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with...

46
Emerging
833 BuildWithAIs/voicekey

Voice to text, one key to input.

46
Emerging
834 adi-gov-tw/Taiwan-Tongues-ASR-CE

Taiwan Tongues ASR CE is an open-source speech recognition (Automatic Speech Recognition,...

46
Emerging
835 ide8/tacotron2

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

46
Emerging
836 jianchang512/clone-voice

A sound cloning tool with a web interface, using your voice or any sound to...

46
Emerging
837 isomoes/blivedm_rs

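A powerful Rust library providing a WebSocket client for Bilibili live-room danmaku (bullet comments), supporting real-time danmaku monitoring, text-to-speech (TTS), and browser cookie...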

46
Emerging
838 Quantatirsk/funasr-api

Speech recognition API service powered by FunASR and Qwen-ASR, supporting 52...

46
Emerging
839 voicegain/python-sdk

Python SDK for working with Voicegain Speech-to-Text

46
Emerging
840 slp-rl/aero

This repo contains the official PyTorch implementation of "Audio Super...

46
Emerging
841 roboticslab-uc3m/speech

Text To Speech (TTS) and Automatic Speech Recognition (ASR).

46
Emerging
842 JoelShine/JARVIS-AI-ASSISTANT

A true Artificial Intelligence assistant with ALICE as the backend and offline...

46
Emerging
843 mrf345/django_gtts

Django app extension to add gTTS google text-to-speech

46
Emerging
844 Jaffe2718/Microphone-Text-Input

A fabric mod that can recognize speech as text messages and automatically...

46
Emerging
845 d4n3436/GTranslate

A collection of free translation APIs (Google Translate, Bing Translator,...

46
Emerging
846 fishaudio/docs

Official documentation for products, services, and projects by Fish Audio

46
Emerging
847 AdroitAnandAI/Indian-Accent-Speech-Recognition

Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models...

46
Emerging
848 libdriver/ld3320

LD3320 full-featured driver library for general-purpose MCU and Linux.

46
Emerging
849 halfzm/v2vt

Video-to-video translation with voice cloning and lip...

46
Emerging
850 undertheseanlp/automatic_speech_recognition

Vietnamese Automatic Speech Recognition

46
Emerging
851 ai-adv-lab/deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

46
Emerging
852 crlandsc/torch-log-wmse

logWMSE, an audio quality metric & loss function with support for digital...

46
Emerging
853 Aivis-Project/aivmlib-web

Aivis Voice Model File (.aivm/.aivmx) Utility Library for Web

46
Emerging
854 lukeewin/FunASR_API

A speech recognition API with speaker diarization, built on FunASR...

46
Emerging
855 Open-Speech-EkStep/vakyansh-wav2vec2-experimentation

Repository containing experimentation platform on how to train, infer on...

46
Emerging
856 eigenpunk/ComfyUI-audio

Some generative audio tools for ComfyUI

46
Emerging
857 atomicoo/FCH-TTS

A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese,...

46
Emerging
858 npuichigo/waveglow

A PyTorch implementation of WaveGlow: A Flow-based Generative Network...

46
Emerging
859 ayutaz/piper-plus

Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT) with VITS...

46
Emerging
860 deepgram-starters/flask-voice-agent

Flask WebSocket proxy for Deepgram's Voice Agent API

46
Emerging
861 aqiu202/aqiu-spring-boot-starter-projects

A personal collection of ready-to-use Spring Boot Starter components, simple and practical; it will continue to be extended as needs arise.

46
Emerging
862 Plachtaa/VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model....

46
Emerging
863 HardCodeDev777/UnityNeuroSpeech

The world’s first game framework that lets you talk to AI in real time —...

46
Emerging
864 ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous...

46
Emerging
865 hasscc/hass-edge-tts

🗣️ Microsoft Edge TTS for Home Assistant, no need for app_key

46
Emerging
866 rioharper/VocalForge

Your one-stop solution for voice dataset creation

45
Emerging
867 CSTR-Edinburgh/magphase

MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.

45
Emerging
868 rhasspy/piper

A fast, local neural text to speech system

45
Emerging
869 italankin/samplevoicebot

TTS Telegram bot

45
Emerging
870 livingingroups/animal2vec

animal2vec: A self-supervised transformer for rare-event raw audio input

45
Emerging
871 RaduBolbo/F5-TTS-Emotional-CFG

Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class...

45
Emerging
872 baizeteam/baize-toolbox

Baize Toolbox, a powerful multimedia tool built on Electron and FFmpeg

45
Emerging
873 Kaljurand/Inimesed

An Android app that lets you search your contacts by voice. Internet not...

45
Emerging
874 KevinMIN95/StyleSpeech

Official implementation of Meta-StyleSpeech and StyleSpeech

45
Emerging
875 MysteryPancake/Discord-TTS

Text to speech Discord bot using FakeYou

45
Emerging
876 chaiyujin/dctts-pytorch

The PyTorch implementation of DC-TTS

45
Emerging
877 hash2430/pitchtron

TTS for pitch-accented language. Korean dialect DB.

45
Emerging
878 nipponjo/tts-arabic-pytorch

🎙️ Arabic TTS models (Tacotron2, FastPitch)

45
Emerging
879 yaph/tts-samples

This repository provides text-to-speech (TTS) audio samples in MP3 format...

45
Emerging
880 ccoreilly/LocalSTT

Android Speech Recognition Service using Vosk/Kaldi and Mozilla DeepSpeech

45
Emerging
881 NATSpeech/NATSpeech

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official...

45
Emerging
882 fewieden/MMM-voice

Offline Voice Recognition Module for MagicMirror²

45
Emerging
883 chrisjp/tts

A simple tool to demo text-to-speech using various services' voices. HTML5...

45
Emerging
884 ae9is/subtitle-chan

Live speech transcription and translation in your browser

45
Emerging
885 nnsvs/nnsvs

Neural network-based singing voice synthesis library for research

45
Emerging
886 OAID/cortex-m-kws

Cortex M KWS example with Tengine Lite.

45
Emerging
887 Spac5y/Vocal-Agent

A cutting-edge Cascading voice assistant combining real-time speech...

45
Emerging
888 lucadellalib/focalcodec

A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation

45
Emerging
889 Niger-Volta-LTI/yoruba-text

Yorùbá language training text for NLP, ASR and TTS tasks

45
Emerging
890 Voine/Bert-VITS2-MNN

TTS System Bert-VITS2 Android Ver, powered by alibaba-MNN engine.

45
Emerging
891 ubisoft/ubisoft-laforge-daft-exprt

Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

45
Emerging
892 daniilrobnikov/vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with...

45
Emerging
893 aeleraqi/Text-to-Speech-gTTS---Arabic-text

Google Text-to-Speech API to convert text input into audio files

45
Emerging
894 georgesterpu/avsr-tf1

Audio-Visual Speech Recognition using Sequence to Sequence Models

45
Emerging
895 see2023/Bert-VITS2-ext

Facial expression and animation testing based on Bert-VITS2.

45
Emerging
896 atomiechen/FunASR-Client

Really easy-to-use Python client for FunASR runtime server.

45
Emerging
897 by2101/OpenASR

A PyTorch-based end-to-end speech recognition system.

45
Emerging
898 gfdb/wav2aug

A general purpose task-agnostic speech augmentation policy

45
Emerging
899 rishikksh20/vae_tacotron2

VAE Tacotron 2, an alternative to GST Tacotron

45
Emerging
900 sanchit-gandhi/whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

45
Emerging