All Voice AI Tools
8,165 tools ranked by quality score · Page 18 of 82
| # | Tool | Score | Tier |
|---|---|---|---|
| 1701 | dennisbergevin/cypress-voice-plugin: Cypress plugin to announce spec result and time in Cypress Test Runner | | Emerging |
| 1702 | spokestack/spokestack-python: Spokestack is a library that allows a user to easily incorporate a voice... | | Emerging |
| 1703 | supikiti/PNCC: An implementation of Power Normalized Cepstral Coefficients (PNCC) | | Emerging |
| 1704 | Speaker-Identification/You-Only-Speak-Once: Deep Learning - one shot learning for speaker recognition using Filter Banks | | Emerging |
| 1705 | DrewThomasson/ebook2audiobookSTYLETTS2: This simple program makes use of Calibre to convert an ebook into chapters... | | Emerging |
| 1706 | atpritam/Free-Scribe: ML-integrated transcription & translation react app · Next.js 15 + React 19... | | Emerging |
| 1707 | wahyd4/say-it: TTS in command line -- Pronounce the Chinese and English words you typed in. | | Emerging |
| 1708 | stefantaubert/pronunciation-dictionary-utils: Utils to modify pronunciation dictionaries. | | Emerging |
| 1709 | Harium/espeak-java: espeak java wrapper | | Emerging |
| 1710 | abdozmantar/ComfyUI-DeepExtractV2: DeepExtractV2 – lightning-fast, high-quality audio separator. Instantly... | | Emerging |
| 1711 | cameronking4/openai-realtime-blocks: Voice AI components using OpenAI Realtime API to copy and paste into your... | | Emerging |
| 1712 | ziligy/watson-text-talker: Simple python Text-to-Speech Interface using IBM's Watson TTS | | Emerging |
| 1713 | lang-uk/ukrainian-tts-preprocessing: Tools and models for Ukrainian phonemization and lexical stress prediction | | Emerging |
| 1714 | deepgram-starters/csharp-voice-agent: Get started using Deepgram's Voice Agent with this C# demo app | | Emerging |
| 1715 | srinivr/kaldi-long-audio-alignment: Long audio alignment using Kaldi | | Emerging |
| 1716 | sovse/Rus-SpeechRecognition-LSTM-CTC-VoxForge: Russian speech recognition using TensorFlow, trained on the VoxForge dataset | | Emerging |
| 1717 | LuckyHookin/edge-TTS-record: A Windows tool that records the Microsoft Edge browser's text-to-speech (TTS) voice output and saves it as .wav audio. | | Emerging |
| 1718 | GmEsoft/SP0256_CTS256A-AL2: G.I./Microchip SP0256 Speech Processor and CTS256A-AL2 Text-To-Speech... | | Emerging |
| 1719 | balisujohn/tortoise.cpp: A ggml (C++) re-implementation of tortoise-tts | | Emerging |
| 1720 | elbruno/ElBruno.Realtime: Pluggable real-time audio conversation framework for .NET. Local VAD, STT,... | | Emerging |
| 1721 | tariqjamel/Flutter-Chat-Bot: A Flutter-based AI chatbot that allows interaction through text, voice, and... | | Emerging |
| 1722 | JollyToday/GhostCut-auto_video_translation: auto video translation-video translator can auto translate video hard... | | Emerging |
| 1723 | keonlee9420/DailyTalk: Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational... | | Emerging |
| 1724 | playht/text-to-speech-api: Play.ht's Text to Speech API | | Emerging |
| 1725 | nishanth-kj/VoxLabs: Text to Speech | | Emerging |
| 1726 | jhubbardsf/svelte-speech-recognition: Speech recognition library for Svelte | | Emerging |
| 1727 | nature-heart-software/izabela: Your speech assistant. Communicate with text-to-speech in games, on voice... | | Emerging |
| 1728 | AkshathRaghav/tinyspeech: Code release for "TinySpeech: Attention Condensers for Deep Speech... | | Emerging |
| 1729 | HarunoriKawano/Wav2vec2.0: Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised... | | Emerging |
| 1730 | AudioLLMs/AudioBench: AudioBench: A Universal Benchmark for Audio Large Language Models | | Emerging |
| 1731 | muhammadGagah/native-speech-generation: NVDA add-on for converting text into natural-sounding speech with Google Gemini AI. | | Emerging |
| 1732 | jamditis/audiobash: Voice-controlled terminal for developers. Speak commands, execute instantly. | | Emerging |
| 1733 | john-carroll-sw/coffee-chat-voice-assistant: Coffee Chat Voice Assistant is a voice-driven ordering system powered by... | | Emerging |
| 1734 | tugstugi/pytorch-speech-commands: Speech commands recognition with PyTorch \| Kaggle 10th place solution in... | | Emerging |
| 1735 | chrisurf/obsidian-voice: 🔊 The Obsidian Voice plugin lets you listen to your written content being... | | Emerging |
| 1736 | MiniMax-AI/MiniMax-AI.github.io: The official GitHub Page for MiniMax | | Emerging |
| 1737 | USStateDept/State-TalentMAP-API: Source Code - https://github.com/USStateDept/State-TalentMAP | | Emerging |
| 1738 | zthxxx/python-Speech_Recognition: A simple example of using the Baidu speech recognition API with Python. | | Emerging |
| 1739 | samuelbradshaw/text-to-timestamps: Python and command-line utility for aligning audio to a transcript. | | Emerging |
| 1740 | mayeaux/generate-subtitles: Generate transcripts for audio and video content with a user friendly UI,... | | Emerging |
| 1741 | nihui/ncnn-android-piper: ncnn android piper the fast and local neural text-to-speech engine | | Emerging |
| 1742 | analyticsinmotion/decibri-web: Cross-browser microphone capture for the web. Zero dependencies. | | Emerging |
| 1743 | phineas-pta/fine-tune-whisper-vi: jupyter notebooks to fine tune whisper models on Vietnamese using Colab... | | Emerging |
| 1744 | alessandroragano/scoreq: SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024) | | Emerging |
| 1745 | TranscribeJs/transcribe.js: Monorepo for Transcribe.js | | Emerging |
| 1746 | google-research-datasets/TextNormalizationCoveringGrammars: Covering grammars for English and Russian text normalization | | Emerging |
| 1747 | zakuro-ai/asr: ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in... | | Emerging |
| 1748 | jcrodriguez1989/heyshiny: Package providing a new `shiny` input that translates audio to text | | Emerging |
| 1749 | appurist/say2file: This utility uses either ElevenLabs or IBM's Watson AI text-to-speech API to... | | Emerging |
| 1750 | georgesterpu/Taris: Transformer-based online speech recognition system with TensorFlow 2 | | Emerging |
| 1751 | cvqluu/TDNN: Time delay neural network (TDNN) implementation in PyTorch using unfold method | | Emerging |
| 1752 | xenova/kokoro-web: ML-powered speech synthesis directly in your browser | | Emerging |
| 1753 | susilnem/American-sign-Language: A CNN based human computer interface for American Sign Language recognition... | | Emerging |
| 1754 | sglkc/tts-api: Free, minimal, unlimited*, CORS-friendly Google Translate Text to Speech API... | | Emerging |
| 1755 | p-groarke/wsay: Windows "say" | | Emerging |
| 1756 | saky-semicolon/Emotion-Aware-AI-Support-System: A smart AI-powered platform that detects emotions from student voice input,... | | Emerging |
| 1757 | MarkParker5/STARK-PLACE: S.T.A.R.K. Platform Library and Community Extensions | | Emerging |
| 1758 | OlivierMary/MySuperWhisper: A global voice dictation tool for Linux using local OpenAI Whisper. Fast,... | | Emerging |
| 1759 | AshutoshDongare/convo: Open source voice bot for Humanoid Robots and virtual digital humans | | Emerging |
| 1760 | jopedroliveira/speech_recog_uc: Speech processing ROS-package. Performs speech recognition and estimates the... | | Emerging |
| 1761 | IBM/mic-sts-nlu-weather-tone-analyzer: WARNING: This repository is no longer maintained. This... | | Emerging |
| 1762 | HawkAaron/E2E-ASR: PyTorch Implementations for End-to-End Automatic Speech Recognition | | Emerging |
| 1763 | IPS-LMU/transcription-portal: A portal that offers a transcription chain for multi upload and processing... | | Emerging |
| 1764 | chenwr727/Stock-Insight-AI: One-click generation of stock and futures analysis videos | | Emerging |
| 1765 | jhermann/kopfkino: Syntactic sugar sprinkled on top of MoviePy and AI components to allow... | | Emerging |
| 1766 | MarkParker5/STARK: S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit | | Emerging |
| 1767 | lmangani/docker-rtpengine-speech: OpenSIPS + RTPEngine Recording + Speech Recognition in HEP | | Emerging |
| 1768 | Troyanovsky/awesome-TTS-Colab: Collection of awesome TTS and voice cloning models to run with Google Colab | | Emerging |
| 1769 | gokhaneraslan/tts-dataset-generator: With this tool you can create custom TTS dataset from video or audio. | | Emerging |
| 1770 | oren-cohen/whatsmybitrate: Whatsmybitrate analyzes audio files for quality metrics such as bit rate,... | | Emerging |
| 1771 | Sciss/SpeechRecognitionHMM: Exported from... | | Emerging |
| 1772 | khanld/ASR-Wav2vec-Finetune: ⚡ Finetune Wav2vec 2.0 For Speech Recognition | | Emerging |
| 1773 | led-mirage/VoivoClip: An app that uses VOICEVOX to read aloud text pasted to the clipboard. | | Emerging |
| 1774 | Aditya-ds-1806/dictpress-tts: TTS plugin for dictpress | | Emerging |
| 1775 | MichalKacprzak99/jarvis: Jarvis is a personal voice assistant inspired by the Marvel movie series | | Emerging |
| 1776 | solyarisoftware/CoquiSTTJs: Coqui STT offline engine API for NodeJs developers. With a simple HTTP ASR server. | | Emerging |
| 1777 | openconcerto/MisterWhisper: Push to talk voice recognition using Whisper | | Emerging |
| 1778 | hkilang/TTS: Text-to-speech reader for Hong Kong Weitou dialect and Hakka | | Emerging |
| 1779 | tristan-mcinnis/Simultaneous-Interpretation: Simultaneous-Interpretation is an advanced tool for real-time simultaneous... | | Emerging |
| 1780 | naplab/AAD-MovingSpeakers: End-to-end system that leverages brain signals to control a binaural speech... | | Emerging |
| 1781 | nssharmaofficial/reddit-hole: Automated reddit scraper and video creator | | Emerging |
| 1782 | opsdroid/opsdroid-audio: 🗣 A companion application for opsdroid which adds hotwords, speech... | | Emerging |
| 1783 | KBM415/expo-speech-transcriber: 🔊 Enable on-device speech transcription for Expo apps with real-time... | | Emerging |
| 1784 | wxkingstar/TransEcho: Real-time simultaneous interpretation for macOS. Captures system audio and provides real-time translated subtitles plus spoken interpretation | | Emerging |
| 1785 | d-j-e/SNPPar: Parallel/Homoplasic SNP Finder | | Emerging |
| 1786 | gorkemkaramolla/whisper-run: Faster Whisper with Speaker Diarization | | Emerging |
| 1787 | mrf345/flask_gtts: A Flask extension to add gTTS Google text to speech | | Emerging |
| 1788 | botbahlul/crx-live-translate: Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video... | | Emerging |
| 1789 | boudhayan-dev/Blind-Reader-project: A low-cost reading device for blind people. | | Emerging |
| 1790 | xhuvom/omnilingual-ASR-Web-Dashboard: Meta Omnilingual ASR web based dashboard for testing and API based... | | Emerging |
| 1791 | wattyven/Live-Stream-TL: A real-time translation application that uses Vosk and the OpenAI API, with... | | Emerging |
| 1792 | wspr-ncsu/robocall-audio-dataset: A dataset of real-world robocall audio recordings | | Emerging |
| 1793 | rishikksh20/UnivNet-pytorch: UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators... | | Emerging |
| 1794 | bensonruan/Speech-Command: Speech Command Recognizer using tensorflowjs | | Emerging |
| 1795 | shreyanspagariya/sankshep: Video Summarization - Summarized a video lecture and converted it to a... | | Emerging |
| 1796 | jianchang512/realtime-stt: A minimalist local, offline, real-time speech-to-text tool | | Emerging |
| 1797 | jingangdidi/voice_clone: An OpenVoice-based voice cloning tool, single executable file (~14M),... | | Emerging |
| 1798 | hug33k/PyTalk-R2D2: Python script for R2D2 text-to-speech | | Emerging |
| 1799 | umutciftci/mp3totext: Convert audio file to text | | Emerging |
| 1800 | overcrash66/Audio-File-Translator---S2ST: Audio file translator is a multilingual speech to speech and speech to text... | | Emerging |