All Voice AI Tools

8,165 tools ranked by quality score · Page 7 of 82

Showing 601–700 of 8,165
# Tool Score Tier
601 gooofy/py-espeak-ng

Some simple wrappers around eSpeak NG intended to make using this excellent...

48
Emerging
602 myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support...

48
Emerging
603 artcore-c/AI-Voice-Clone-with-Coqui-XTTS-v2

Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone...

48
Emerging
604 solyarisoftware/voskJs

Vosk offline ASR engine API for Node.js developers, with a simple HTTP ASR server.

48
Emerging
605 gooofy/zerovox

Zero-shot real-time TTS system, fully offline, free and open source

48
Emerging
606 PraaneshSelvaraj/speech_engine

Speech Engine is a Python package that provides a simple interface for...

48
Emerging
607 andresayac/edge-tts

Edge TTS is a Node or Bun package that allows access to the online...

48
Emerging
608 lucasjinreal/Kokoros

🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast,...

48
Emerging
609 devnen/Dia-TTS-Server

Self-host the powerful Dia TTS model. This server offers a user-friendly Web...

48
Emerging
610 hehehai/voxt

🎙️Voice input and translation app for macOS. Press to talk, release to paste.

48
Emerging
611 lucasnewman/best-rq-pytorch

Implementation of BEST-RQ - a model for self-supervised learning of speech...

48
Emerging
612 Kaljurand/dictate.js

A small JavaScript library for browser-based real-time speech recognition,...

48
Emerging
613 filippogiruzzi/voice_activity_detection

Voice Activity Detection based on Deep Learning & TensorFlow

48
Emerging
614 KinglittleQ/GST-Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling,...

48
Emerging
615 abhirooptalasila/AutoSub

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using...

48
Emerging
616 jpuigcerver/Laia

Laia: A deep learning toolkit for HTR based on Torch

48
Emerging
617 Mag1cFall/AIStudio2API

Reverse-proxies Google AI Studio as an OpenAI-compatible API

48
Emerging
618 feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

48
Emerging
619 fulldecent/vowel-practice

iOS application for finding formants in spoken sounds

48
Emerging
620 BoltzmannEntropy/xtts2-ui

A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech

48
Emerging
621 Atm4x/tts-with-rvc

TTS with RVC-module to generate .wav audios

48
Emerging
622 gooofy/zamia-speech

Open tools and data for cloudless automatic speech recognition

48
Emerging
623 pulijon/Sttcast

Transcription from mp3 files to html with or without embedded player

48
Emerging
624 shashank2122/Local-Voice

A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local...

48
Emerging
625 MerlinCN/kinoko7danmaku

Uses TTS to read aloud danmaku (bullet comments), gifts, captain subscriptions, and more from Bilibili live streams

48
Emerging
626 yl4579/AuxiliaryASR

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)

48
Emerging
627 blaisewf/rvc-cli

🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free!

48
Emerging
628 cboard-org/ccboard

Cordova wrapper for the Cboard application

48
Emerging
629 thepirat000/spleeter-api

Audio separation API using Spleeter from Deezer

48
Emerging
630 ivanvovk/durian-pytorch

Implementation of "Duration Informed Attention Network for Multimodal...

48
Emerging
631 mediatechlab/tts-wrapper

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a...

48
Emerging
632 supertone-inc/supertonic-py

Lightning-Fast, On-Device TTS — running natively via ONNX.

48
Emerging
633 dngda/bot-whatsapp

Unmaintained - Multipurpose WhatsApp Bot 🤖 using open-wa/wa-automate-nodejs...

48
Emerging
634 mozilla/TTS

🤖 💬 Deep learning for Text to Speech (Discussion...

48
Emerging
635 HiMeditator/auto-caption

Cross-platform real-time subtitle display software.

48
Emerging
636 OvidijusParsiunas/speech-to-element

A simple way to add speech-to-text functionality to your website 🎤

48
Emerging
637 haolinwang819-boop/ai-video-generation-workflow

AI video generation workflow with script, slides, TTS, subtitles, and FFmpeg...

48
Emerging
638 XiaoMi/kaldi-onnx

Kaldi model converter to ONNX

48
Emerging
639 jxzhanggg/nonparaSeq2seqVC_code

Implementation code of non-parallel sequence-to-sequence VC

48
Emerging
640 SuyashMore/MevonAI-Speech-Emotion-Recognition

Identify the emotion of multiple speakers in an Audio Segment

48
Emerging
641 jbelford/Eolian

Eolian is a Discord music bot that provides a very powerful API for queuing...

48
Emerging
642 harry0703/AudioNotes

Quickly extracts content from audio and video and organizes it into structured Markdown notes

48
Emerging
643 gentaiscool/end2end-asr-pytorch

End-to-End Automatic Speech Recognition on PyTorch

48
Emerging
644 haoheliu/voicefixer_main

General Speech Restoration

48
Emerging
645 tsurumeso/vocal-remover

Vocal Remover using Deep Neural Networks

48
Emerging
646 upskyy/Squeezeformer

PyTorch implementation of "Squeezeformer: An Efficient Transformer for...

48
Emerging
647 PlayVoice/vits_chinese

Best practice TTS based on BERT and VITS with some Natural Speech Features...

48
Emerging
648 just-ai/aimybox-android-assistant

Embeddable custom voice assistant for Android applications

48
Emerging
649 scionoftech/DeepAsr

Keras (TensorFlow) implementations of Automatic Speech Recognition

48
Emerging
650 TUD-STKS/VocalTractLabBackend-dev

The VocalTractLab backend sources and C/C++ API

48
Emerging
651 AutoArk/GPA

[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion...

48
Emerging
652 KKshitiz/J.A.R.V.I.S

Iron man inspired Personal virtual assistant

48
Emerging
653 Picovoice/cobra

On-device voice activity detection (VAD) powered by deep learning

48
Emerging
654 alesaccoia/VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in...

48
Emerging
655 rolczynski/Automatic-Speech-Recognition

🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)

48
Emerging
656 Aivis-Project/AivisSpeech

AivisSpeech: AI Voice Imitation System - Text to Speech Software

48
Emerging
657 gexgd0419/NaturalVoiceSAPIAdapter

Make Azure natural TTS voices accessible to any SAPI 5-compatible application.

48
Emerging
658 alumae/kaldi-offline-transcriber

Offline transcription system for Estonian using Kaldi

48
Emerging
659 gitmylo/bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.

48
Emerging
660 rishikksh20/FastSpeech2

PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End...

48
Emerging
661 hegedustibor/htgo-tts

Text to speech package for Golang.

48
Emerging
662 YaoFANGUK/video-subtitle-extractor

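Extracts hard-coded subtitles from video and generates SRT files. No third-party API required; text recognition runs locally. A deep-learning-based video subtitle extraction framework covering subtitle region detection and subtitle content extraction. A GUI...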

48
Emerging
663 clovaai/ClovaCall

ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)

48
Emerging
664 supersu-man/pyt2s

The Python Text to Speech library you've been looking for.

48
Emerging
665 WanderingAstronomer/Vociferous

Vociferous captures audio from your microphone, transcribes it in real-time...

48
Emerging
666 yl4579/PL-BERT

Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

48
Emerging
667 philipperemy/tensorflow-ctc-speech-recognition

Application of Connectionist Temporal Classification (CTC) for Speech...

48
Emerging
668 Jaymon/transcribe

Convert images or audio files to plain text on the command line

48
Emerging
669 keenresearch/keenasr-ios-poc

Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE...

48
Emerging
670 reriiasu/speech-to-text

Real-time transcription using faster-whisper

48
Emerging
671 openspeech-team/openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging...

48
Emerging
672 moshehbenavraham/Voice-Agent-PuPuPlatter

Multi-provider voice AI showcase featuring 7 providers (ElevenLabs + Widget,...

48
Emerging
673 EnjiRouz/Voice-Assistant-App

Python Voice Assistant project can: recognize and synthesize speech without...

48
Emerging
674 alphacep/vosk-asterisk

Speech Recognition in Asterisk with Vosk Server

47
Emerging
675 itsRares/react-native-deepgram

Brings Deepgram's capabilities to React Native applications, with a focus on...

47
Emerging
676 LEEYOONHYUNG/BVAE-TTS

Official implementation of BVAE-TTS

47
Emerging
677 xkeyC/fl_caption

Offline real-time captioning software written in Flutter and Rust, powered...

47
Emerging
678 rishikksh20/VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested...

47
Emerging
679 jiaqili3/DualCodec

[Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural...

47
Emerging
680 yl4579/PitchExtractor

Deep Neural Pitch Extractor for Voice Conversion and TTS Training

47
Emerging
681 lmnt-com/wavegrad

A fast, high-quality neural vocoder.

47
Emerging
682 seanghay/speechviewer

A quick audio dataset viewer

47
Emerging
683 chenliangrui/EasyMrcp

Welcome to EasyMrcp! EasyMrcp is written in Java and currently provides integrations with a variety of ASR and TTS services, making ASR and TTS truly simple to use...

47
Emerging
684 Deepest-Project/MelNet

Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain"

47
Emerging
685 elimu-ai/vitabu

📚 Android application for reading storybooks and expanding word vocabulary.

47
Emerging
686 YoavRamon/awesome-kaldi

This is a list of features, scripts, blogs and resources for better using...

47
Emerging
687 IS2AI/Kazakh_TTS

An expanded version of the previously released Kazakh text-to-speech...

47
Emerging
688 agent87/RW-DEEPSPEECH-API

An end to end deep speech REST API containing speech to text and text speech...

47
Emerging
689 alexruperez/SpeechRecognizerButton

UIButton subclass with push to talk recording, speech recognition and...

47
Emerging
690 saiteja-talluri/Speech2Face

Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face...

47
Emerging
691 modelscope/KAN-TTS

KAN-TTS is a speech-synthesis training framework, please try the demos we...

47
Emerging
692 symblai/speech-recognition-evaluation

Evaluate results from ASR/Speech-to-Text quickly

47
Emerging
693 Gautham495/react-native-speech-recognition-kit

React Native Turbo Module to access Speech Recognition in Android & iOS

47
Emerging
694 AppDevGuy/OSSSpeechKit

OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.

47
Emerging
695 cvqluu/Factorized-TDNN

PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal...

47
Emerging
696 bensonruan/Chrome-Web-Speech-API

Chrome Web Speech API

47
Emerging
697 dspavankumar/keras-kaldi

Keras Interface for Kaldi ASR

47
Emerging
698 travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging...

47
Emerging
699 roatienza/efficientspeech

PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.

47
Emerging
700 512z/podlens

Free Podwise: AI Podcast & YouTube Transcription & Understanding Agent |...

47
Emerging