All Voice AI Tools

8,165 tools ranked by quality score · Page 18 of 82

Showing 1701–1800 of 8,165
# Tool Score Tier
1701 dennisbergevin/cypress-voice-plugin

Cypress plugin to announce spec result and time in Cypress Test Runner

39
Emerging
1702 spokestack/spokestack-python

Spokestack is a library that allows a user to easily incorporate a voice...

39
Emerging
1703 supikiti/PNCC

An implementation of Power Normalized Cepstral Coefficients (PNCC)

39
Emerging
1704 Speaker-Identification/You-Only-Speak-Once

Deep Learning - one shot learning for speaker recognition using Filter Banks

39
Emerging
1705 DrewThomasson/ebook2audiobookSTYLETTS2

This simple program makes use of Calibre to convert an ebook into chapters...

39
Emerging
1706 atpritam/Free-Scribe

ML-integrated transcription & translation react app · Next.js 15 + React 19...

39
Emerging
1707 wahyd4/say-it

TTS in command line -- Pronounce the Chinese and English words you typed in.

39
Emerging
1708 stefantaubert/pronunciation-dictionary-utils

Utils to modify pronunciation dictionaries.

39
Emerging
1709 Harium/espeak-java

espeak java wrapper

39
Emerging
1710 abdozmantar/ComfyUI-DeepExtractV2

DeepExtractV2 – lightning-fast, high-quality audio separator. Instantly...

39
Emerging
1711 cameronking4/openai-realtime-blocks

Voice AI components using OpenAI Realtime API to copy and paste into your...

39
Emerging
1712 ziligy/watson-text-talker

Simple python Text-to-Speech Interface using IBM's Watson TTS

39
Emerging
1713 lang-uk/ukrainian-tts-preprocessing

Tools and models for Ukrainian phonemization and lexical stress prediction

39
Emerging
1714 deepgram-starters/csharp-voice-agent

Get started using Deepgram's Voice Agent with this C# demo app

39
Emerging
1715 srinivr/kaldi-long-audio-alignment

Long audio alignment using Kaldi

39
Emerging
1716 sovse/Rus-SpeechRecognition-LSTM-CTC-VoxForge

Russian speech recognition using TensorFlow, trained on the VoxForge dataset

39
Emerging
1717 LuckyHookin/edge-TTS-record

A Windows tool that records the text-to-speech (TTS) output of the Microsoft Edge browser and saves it as .wav audio.

39
Emerging
1718 GmEsoft/SP0256_CTS256A-AL2

G.I./Microchip SP0256 Speech Processor and CTS256A-AL2 Text-To-Speech...

39
Emerging
1719 balisujohn/tortoise.cpp

A ggml (C++) re-implementation of tortoise-tts

39
Emerging
1720 elbruno/ElBruno.Realtime

Pluggable real-time audio conversation framework for .NET. Local VAD, STT,...

39
Emerging
1721 tariqjamel/Flutter-Chat-Bot

A Flutter-based AI chatbot that allows interaction through text, voice, and...

39
Emerging
1722 JollyToday/GhostCut-auto_video_translation

auto video translation-video translator can auto translate video hard...

39
Emerging
1723 keonlee9420/DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational...

39
Emerging
1724 playht/text-to-speech-api

Play.ht's Text to Speech API

39
Emerging
1725 nishanth-kj/VoxLabs

Text to Speech

39
Emerging
1726 jhubbardsf/svelte-speech-recognition

Speech recognition library for Svelte

39
Emerging
1727 nature-heart-software/izabela

Your speech assistant. Communicate with text-to-speech in games, on voice...

39
Emerging
1728 AkshathRaghav/tinyspeech

Code release for "TinySpeech: Attention Condensers for Deep Speech...

39
Emerging
1729 HarunoriKawano/Wav2vec2.0

Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised...

39
Emerging
1730 AudioLLMs/AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

39
Emerging
1731 muhammadGagah/native-speech-generation

NVDA add-on that converts text into natural-sounding speech with Google Gemini AI.

39
Emerging
1732 jamditis/audiobash

Voice-controlled terminal for developers. Speak commands, execute instantly.

39
Emerging
1733 john-carroll-sw/coffee-chat-voice-assistant

Coffee Chat Voice Assistant is a voice-driven ordering system powered by...

39
Emerging
1734 tugstugi/pytorch-speech-commands

Speech commands recognition with PyTorch | Kaggle 10th place solution in...

39
Emerging
1735 chrisurf/obsidian-voice

🔊 The Obsidian Voice plugin lets you listen to your written content being...

39
Emerging
1736 MiniMax-AI/MiniMax-AI.github.io

The official GitHub Page for MiniMax

39
Emerging
1737 USStateDept/State-TalentMAP-API

Source Code - https://github.com/USStateDept/State-TalentMAP

39
Emerging
1738 zthxxx/python-Speech_Recognition

A simple example of using the Baidu speech recognition API with Python.

39
Emerging
1739 samuelbradshaw/text-to-timestamps

Python and command-line utility for aligning audio to a transcript.

39
Emerging
1740 mayeaux/generate-subtitles

Generate transcripts for audio and video content with a user friendly UI,...

39
Emerging
1741 nihui/ncnn-android-piper

Piper, the fast and local neural text-to-speech engine, running on Android with ncnn

39
Emerging
1742 analyticsinmotion/decibri-web

Cross-browser microphone capture for the web. Zero dependencies.

39
Emerging
1743 phineas-pta/fine-tune-whisper-vi

jupyter notebooks to fine tune whisper models on Vietnamese using Colab...

39
Emerging
1744 alessandroragano/scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

39
Emerging
1745 TranscribeJs/transcribe.js

Monorepo for Transcribe.js

39
Emerging
1746 google-research-datasets/TextNormalizationCoveringGrammars

Covering grammars for English and Russian text normalization

39
Emerging
1747 zakuro-ai/asr

ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in...

39
Emerging
1748 jcrodriguez1989/heyshiny

Package: New `shiny` input that translates audio to text

39
Emerging
1749 appurist/say2file

This utility uses either ElevenLabs or IBM's Watson AI text-to-speech API to...

39
Emerging
1750 georgesterpu/Taris

Transformer-based online speech recognition system with TensorFlow 2

39
Emerging
1751 cvqluu/TDNN

Time delay neural network (TDNN) implementation in PyTorch using the unfold method

39
Emerging
1752 xenova/kokoro-web

ML-powered speech synthesis directly in your browser

39
Emerging
1753 susilnem/American-sign-Language

A CNN based human computer interface for American Sign Language recognition...

39
Emerging
1754 sglkc/tts-api

Free, minimal, unlimited*, CORS-friendly Google Translate Text to Speech API...

38
Emerging
1755 p-groarke/wsay

Windows "say"

38
Emerging
1756 saky-semicolon/Emotion-Aware-AI-Support-System

A smart AI-powered platform that detects emotions from student voice input,...

38
Emerging
1757 MarkParker5/STARK-PLACE

S.T.A.R.K. Platform Library and Community Extensions

38
Emerging
1758 OlivierMary/MySuperWhisper

A global voice dictation tool for Linux using local OpenAI Whisper. Fast,...

38
Emerging
1759 AshutoshDongare/convo

Open source voice bot for Humanoid Robots and virtual digital humans

38
Emerging
1760 jopedroliveira/speech_recog_uc

Speech processing ROS-package. Performs speech recognition and estimates the...

38
Emerging
1761 IBM/mic-sts-nlu-weather-tone-analyzer

WARNING: This repository is no longer maintained. This...

38
Emerging
1762 HawkAaron/E2E-ASR

PyTorch Implementations for End-to-End Automatic Speech Recognition

38
Emerging
1763 IPS-LMU/transcription-portal

A portal that offers a transcription chain for multi upload and processing...

38
Emerging
1764 chenwr727/Stock-Insight-AI

Stock-Insight-AI: one-click generation of stock and futures analysis videos

38
Emerging
1765 jhermann/kopfkino

Syntactic sugar sprinkled on top of MoviePy and AI components to allow...

38
Emerging
1766 MarkParker5/STARK

S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit

38
Emerging
1767 lmangani/docker-rtpengine-speech

OpenSIPS + RTPEngine Recording + Speech Recognition in HEP

38
Emerging
1768 Troyanovsky/awesome-TTS-Colab

Collection of awesome TTS and voice cloning models to run with Google Colab

38
Emerging
1769 gokhaneraslan/tts-dataset-generator

With this tool you can create custom TTS dataset from video or audio.

38
Emerging
1770 oren-cohen/whatsmybitrate

Whatsmybitrate analyzes audio files for quality metrics such as bit rate,...

38
Emerging
1771 Sciss/SpeechRecognitionHMM

Exported from...

38
Emerging
1772 khanld/ASR-Wav2vec-Finetune

⚡ Fine-tune wav2vec 2.0 for speech recognition

38
Emerging
1773 led-mirage/VoivoClip

An app that uses VOICEVOX to read aloud text copied to the clipboard.

38
Emerging
1774 Aditya-ds-1806/dictpress-tts

TTS plugin for dictpress

38
Emerging
1775 MichalKacprzak99/jarvis

Jarvis is a personal voice assistant inspired by the Marvel movie series

38
Emerging
1776 solyarisoftware/CoquiSTTJs

Coqui STT offline engine API for NodeJs developers. With a simple HTTP ASR server.

38
Emerging
1777 openconcerto/MisterWhisper

Push to talk voice recognition using Whisper

38
Emerging
1778 hkilang/TTS

Text-to-speech reader for Hong Kong Waitau and Hakka

38
Emerging
1779 tristan-mcinnis/Simultaneous-Interpretation

Simultaneous-Interpretation is an advanced tool for real-time simultaneous...

38
Emerging
1780 naplab/AAD-MovingSpeakers

End-to-end system that leverages brain signals to control a binaural speech...

38
Emerging
1781 nssharmaofficial/reddit-hole

Automated reddit scraper and video creator

38
Emerging
1782 opsdroid/opsdroid-audio

🗣 A companion application for opsdroid which adds hotwords, speech...

38
Emerging
1783 KBM415/expo-speech-transcriber

🔊 Enable on-device speech transcription for Expo apps with real-time...

38
Emerging
1784 wxkingstar/TransEcho

Real-time simultaneous interpretation for macOS: captures system audio and provides live translated subtitles plus spoken interpretation

38
Emerging
1785 d-j-e/SNPPar

Parallel/Homoplasic SNP Finder

38
Emerging
1786 gorkemkaramolla/whisper-run

Faster Whisper with Speaker Diarization

38
Emerging
1787 mrf345/flask_gtts

A Flask extension to add gTTS Google text to speech

38
Emerging
1788 botbahlul/crx-live-translate

Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video...

38
Emerging
1789 boudhayan-dev/Blind-Reader-project

A low cost reading device for blind people.

38
Emerging
1790 xhuvom/omnilingual-ASR-Web-Dashboard

Meta Omnilingual ASR web based dashboard for testing and API based...

38
Emerging
1791 wattyven/Live-Stream-TL

A real-time translation application that uses Vosk and the OpenAI API, with...

38
Emerging
1792 wspr-ncsu/robocall-audio-dataset

A dataset of real-world robocall audio recordings

38
Emerging
1793 rishikksh20/UnivNet-pytorch

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators...

38
Emerging
1794 bensonruan/Speech-Command

Speech Command Recognizer using TensorFlow.js

38
Emerging
1795 shreyanspagariya/sankshep

Video Summarization - Summarized a video lecture and converted it to a...

38
Emerging
1796 jianchang512/realtime-stt

A minimalist local, offline, real-time speech-to-text tool

38
Emerging
1797 jingangdidi/voice_clone

An OpenVoice-based voice cloning tool, single executable file (~14M),...

38
Emerging
1798 hug33k/PyTalk-R2D2

Python script for R2D2 text-to-speech

38
Emerging
1799 umutciftci/mp3totext

Convert audio file to text

38
Emerging
1800 overcrash66/Audio-File-Translator---S2ST

Audio file translator is a multilingual speech to speech and speech to text...

38
Emerging