All Voice AI Tools

8,165 tools ranked by quality score · Page 24 of 82

Showing 2301–2400 of 8,165
# Tool Score Tier
2301 xiaominfc/aliyun_nls_c_demo

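Alibaba Cloud's real-time speech recognition (ASR) service does not provide a C SDK; needing one for a project, the author studied the Java SDK implementation and built this C demo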

35
Emerging
2302 a-n-rose/Python-Sound-Tool

SoundPy (alpha stage) is a research-based python package for speech and...

35
Emerging
2303 renaudjenny/TellTime

iOS application to tell the time in the British way 🇬🇧⏰

35
Emerging
2304 jreremy/conformer

PyTorch implementation of Conformer with training script for end-to-end...

35
Emerging
2305 SohamRatnaparkhi/Voice-Assistant

Voice Assistant coded in Python!

35
Emerging
2306 MingLunHan/CIF-PyTorch

[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech...

35
Emerging
2307 alitahir4024/Text-To-Speach-Javascript

A creative project to give voice to your words.

35
Emerging
2308 huuquyet/PhoWhisper-next

Demo using PhoWhisper models of VinAI built with Transformers.js + Next.js

35
Emerging
2309 PareekshithPalat/AETHER---Personal-Assistant

AETHER is a voice-activated Python personal assistant that responds to...

35
Emerging
2310 holgern/pykokoro

A Python library for Kokoro TTS (Text-to-Speech) using ONNX runtime.

35
Emerging
2311 manhph2211/ML-Deployment

Pushing Deep Learning models into production using torchserve, kubernetes...

35
Emerging
2312 aria-music/zundacord

Japanese Text-to-speech bot for Discord, powered by VOICEVOX

35
Emerging
2313 Aculeasis/rhvoice-proxy

High-level interface for RHVoice library

35
Emerging
2314 tuanio/noisy-student-training-asr

PyTorch implementation of Noisy Student Training for Automatic Speech...

35
Emerging
2315 efeslab/LiteASR

[EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with...

35
Emerging
2316 IS2AI/ISSAI_SAIDA_Kazakh_ASR

The first industrial-scale open-source Kazakh speech corpus. KSC2 corpus...

35
Emerging
2317 mathigatti/RealTimeSingingSynthesizer

Live Coding Singing Synthesizer. Python sinsy-NG wrapper.

35
Emerging
2318 chaonan99/ppt_presenter

Convert ppt to video with audio track, using text to speech synthesis

35
Emerging
2319 LlmKira/VitsServer

🌻 VITS ONNX TTS server designed for fast inference 🔥

35
Emerging
2320 andriyadi/Maix-SpeechRecognizer

Speech Recognition or Wake Word detection demo, developed using Maixduino...

35
Emerging
2321 rishikksh20/AudioMAE-pytorch

Unofficial PyTorch implementation of Masked Autoencoders that Listen

35
Emerging
2322 thotnd173389/SpeechCommand

The project aims to use keyword spotting streaming in a real-time offline...

35
Emerging
2323 fcakyon/pywhisper

openai/whisper + extra features

35
Emerging
2324 h4rm0n1c/NetTTS

A Retro-modern SAPI 4.0 TTS Client with Network Connectivity and custom...

35
Emerging
2325 kwea123/Unity_live_caption

Use Google Speech-to-Text API to do real-time live stream caption on Unity!...

35
Emerging
2326 18F/tts-buy-cloudgov-vulnerability-scanner

Solicitation and acquisition documents created for the cloud.gov...

35
Emerging
2327 sinProject-Inc/talk

Listening and Speaking

35
Emerging
2328 Ikaros-521/FunASR_WS

A WebSocket server modified from the official FunASR demo; combined with FastAPI to provide an HTTP service, it allows real-time ASR testing in the browser

35
Emerging
2329 Lqm1/openai-workers-ai

A Cloudflare Workers-based, OpenAI-compatible API project that provides...

35
Emerging
2330 ryanlintott/OEVoice

Old English text-to-speech using AVSpeechSynthesis and IPA pronunciations.

35
Emerging
2331 sooftware/End-to-End-Speech-Recognition-Models

PyTorch implementation of automatic speech recognition models.

35
Emerging
2332 jashutch/zeddal

Turn your voice into intelligent, linked notes inside Obsidian

35
Emerging
2333 GravityPoet/ChordVox

Your voice is the fastest keyboard. Local AI voice input — speak, AI polish,...

35
Emerging
2334 litagin02/vits-japros-webui

Gradio WebUI for training and speech synthesis with Japanese TTS (VITS)

35
Emerging
2335 rollingstarky/Python-Voice-Assistant

A Python based Voice Assistant like Siri

35
Emerging
2336 cosmoquester/speech-recognition

Develop speech recognition models with TensorFlow 2

35
Emerging
2337 pinch-eng/pinch-python-sdk

Real-time voice translation SDK

35
Emerging
2338 tjunttila/pdf2video

A tool for making videos from PDF presentations.

35
Emerging
2339 m15-ai/Local-Voice

A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local...

35
Emerging
2340 simonesiega-academics/culinary-ai-assistant

AI-powered culinary assistant that stores structured data in a tabular...

35
Emerging
2341 emiliioaguirre/youtube-live-tts

Real-time YouTube Live Chat Text-to-Speech (TTS) using ElevenLabs AI voices

35
Emerging
2342 IOriens/whisper-video

Generate subtitles for all the videos in a folder with OpenAI's Whisper...

35
Emerging
2343 jaganadhg/nemoexamples

Experiments with NVIDIA NeMo

35
Emerging
2344 Ananya-0306/Jarvis-desktop-assistant

This is the new Jarvis AI project; it will do some functionality followed by...

35
Emerging
2345 robotology/natural-speech

This repository contains a codebase to build automatic speech recognition...

35
Emerging
2346 LEMAS-Project/LEMAS-TTS

LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10...

35
Emerging
2347 EtienneAb3d/WhisperTimeSync

Synchronize Whisper's timestamps over an existing accurate transcription

35
Emerging
2348 elbruno/ElBruno.QwenTTS

Qwen3-TTS ONNX export pipeline + C# .NET 10 console app for local voice generation

35
Emerging
2349 DivineUX23/Audio-to-Audio-translation

Imagine translating your speech or anybody's speech to any language you want...

35
Emerging
2350 matlab-deep-learning/deepspeech

This repo provides the pretrained DeepSpeech model in MATLAB. The model is...

35
Emerging
2351 cadia-lvl/WebRICE

WebRICE (Web Reader ICE) is an open source web reader in development at...

35
Emerging
2352 Mokkapps/parents-soundboard

A soundboard developed for parents to be able to play often needed phrases like "No"

35
Emerging
2353 jlia0/RealityTalk

RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling

35
Emerging
2354 Sukumar9944/Speech-to-Text-with-ChatGPT

This Python application combines speech recognition with the power of...

35
Emerging
2355 speechly/react-example-repo-filtering

An example app for filtering data with Speechly and React

35
Emerging
2356 hkdb/offline-tts

A Chrome extension that reads web pages and PDFs aloud using Supertonic's...

35
Emerging
2357 Zuellni/LLaSA-WebUI

LLaSA WebUI using ExLlamaV2 and FastAPI.

35
Emerging
2358 xuchennlp/S2T

The project for speech translation

35
Emerging
2359 i-bardinov/Godot-Android-Text-to-Speech

Godot Android Text to Speech plugin for Godot Engine 3.4 or higher

35
Emerging
2360 ARK018/multi-voice-sdk

A universal Text-to-Speech (TTS) SDK. Easily generate and manage audio...

35
Emerging
2361 18F/tts-buy-code-review

Solicitation documents for the code review procurement being undertaken by TTS.

35
Emerging
2362 WelkinYang/Learn2Sing2.0

Diffusion and Mutual Information-Based Target Speaker SVS by Learning from...

35
Emerging
2363 StanGirard/speechdigest

Audio to summary with OpenAI Whisper & GPT 3.5/4 using Streamlit

35
Emerging
2364 opencog/TinyCog

Small Robot, Toy Robot platform

35
Emerging
2365 nearkyh/AWS-Polly

How to use Amazon Polly TTS (Text To Speech)

35
Emerging
2366 FS-17/SpeechDataBuilder

Browser-based open-source tool for creating high-quality TTS/STT datasets....

35
Emerging
2367 LianjiaTech/bella-whisper

bella-whisper is a series of OpenAI-based...

35
Emerging
2368 seven-io/go-client

Official Go API Client for seven.io

35
Emerging
2369 DarmorGamz/Youtube-Shorts-Generator

Harness OpenAI's power to effortlessly create YouTube Shorts with this...

35
Emerging
2370 alexykn/TorchTS

A modern text to speech frontend for Kokoro-82M

35
Emerging
2371 stensmir/mimir

Offline voice-to-text for macOS. No cloud, no tracking.

35
Emerging
2372 indigane/wyoming-android-tts

Use your Android device's TTS engines in Home Assistant via the Wyoming protocol.

35
Emerging
2373 Garden-Tree/yomi-KAI

yomi-KAI is a bot that reads aloud, in a voice channel, the messages sent to a Discord text channel.

35
Emerging
2374 WindQAQ/tensorflow-wavenet

Implementation of WaveNet network based on TensorFlow.

35
Emerging
2375 SingAvi/SpeechToText

Simple python script to convert live speech or any audio file to text using...

35
Emerging
2376 VirtualZer0/StreamTalkerClient

Cross-platform desktop app that reads Twitch and VK Play chat aloud using AI...

35
Emerging
2377 bobokick/Microsoft-Speech-API_Guide

Usage guide and API description for Microsoft's speech engine SAPI

34
Emerging
2378 lucidprogrammer/youtube-vision-transcriber

AI-powered pipeline that converts YouTube videos into polished articles...

34
Emerging
2379 ElishaAz/mau_local_stt

A Maubot to transcribe audio messages using local open-source libraries

34
Emerging
2380 yuryleb/garmin-russian-tts-voices

Additions and fixes for the Russian TTS voices from Garmin navigators

34
Emerging
2381 mhshajib/avro-phonetic-go

Avro-style Banglish → বাংলা transliteration engine for Go, using trie-based...

34
Emerging
2382 JohannLai/audio-to-text

Convert audio to text and a summary; just input the audio link.

34
Emerging
2383 leokwsw/OpenAI-TTS-Gradio

Use OpenAI TTS (Text to Speech) API with Gradio

34
Emerging
2384 taeyoun811/Whisfusion

Whisfusion: Parallel ASR Decoding via a Diffusion Transformer

34
Emerging
2385 danielga/gmcl_speech

A module for Garry's Mod that provides speech recognition interfaces to developers.

34
Emerging
2386 voice-cloning-app/Voice-API

API template for deploying tacotron2 voices

34
Emerging
2387 daveshap/keras_asr

ASR experiment using Google's Universal Sentence Encoder

34
Emerging
2388 Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022

Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and...

34
Emerging
2389 tabahi/Mel-Spectrum-Analyzer

Online web based mel-spectrum, power spectrum, FFT analyzer for speech and...

34
Emerging
2390 RoyNkem/SwiftUI-AI-Voice-Assistant

A multi-platform app for voice-based interactions built using SwiftUI with...

34
Emerging
2391 sureshnswamy/tamil-text2voice

Text to speech tool for Tamil language

34
Emerging
2392 hari-huynh/viVQA-voice-assistant

Voice assistant using Multimodal LLMs - LLaVA-NeXT (Mistral 7B) finetuned &...

34
Emerging
2393 msalhab96/SpeeQ

A framework for automatic speech recognition

34
Emerging
2394 GreenSheep01201/claw-voice-chat

Push-to-talk voice chat interface for OpenClaw channels

34
Emerging
2395 uysalemre/Voice-Mail

Python, Django, Text to Speech, Speech to Text, AJAX, Gmail API, Email...

34
Emerging
2396 adasegroup/OSM-one-shot-multispeaker

Framework for one-shot multispeaker system based on Deep Learning

34
Emerging
2397 T-vK/Termux-DeepSpeech

Open source offline speech recognition for Android using Mozilla's...

34
Emerging
2398 ttuleyb/TortoiseTTS-GUI

GradioUI for TortoiseTTS voice generation

34
Emerging
2399 ekleziast/kiwi-voice

Voice interface for OpenClaw with speaker recognition, voice-gated security,...

34
Emerging
2400 rishikksh20/TalkNet2-pytorch

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for...

34
Emerging