PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
This toolkit helps you work with spoken language, allowing you to convert audio into written text, translate spoken English to Chinese, and generate natural-sounding speech from written text. It takes audio files or text as input and produces transcribed text, translated text, or synthetic speech. Anyone who needs to process or create speech, such as content creators, linguists, or call center managers, would find this useful.
12,556 stars. Actively maintained with 3 commits in the last 30 days. Available on PyPI.
Use this if you need to quickly transcribe audio, translate spoken content, or create realistic voiceovers from text.
Not ideal if your primary need is advanced audio editing, music production, or highly specialized sound analysis beyond speech processing.
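As a rough illustration of the transcription and voiceover use cases above, here is a minimal Python sketch. It assumes the package is installed from PyPI as `paddlespeech` and uses the `ASRExecutor` and `TTSExecutor` classes shown in the project's README. The wrapper names `transcribe` and `synthesize` are hypothetical, and the default models are believed to target Mandarin, so other languages may need extra arguments that are not shown here.

```python
def transcribe(audio_path):
    """Return the recognized text for a speech audio file (sketch)."""
    # Imported lazily so this module loads even without paddlespeech installed.
    from paddlespeech.cli.asr.infer import ASRExecutor

    asr = ASRExecutor()
    # The first call may download a pretrained model.
    return asr(audio_file=audio_path)


def synthesize(text, output_path="output.wav"):
    """Write synthetic speech for `text` to `output_path` and return the path (sketch)."""
    from paddlespeech.cli.tts.infer import TTSExecutor

    tts = TTSExecutor()
    tts(text=text, output=output_path)
    return output_path
```

Calling `transcribe("zh.wav")` on a short Mandarin recording should return its text, and `synthesize("今天天气十分不错。")` should write `output.wav`. Both calls depend on pretrained models being available locally or downloadable.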
Stars: 12,556
Forks: 1,956
Language: Python
License: Apache-2.0
Last pushed: Mar 16, 2026
Commits (30d): 3
Dependencies: 50
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/PaddlePaddle/PaddleSpeech"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
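The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the `voice-ai` path segment is a category slug that can vary per project. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and passing an API key is not shown because the page does not say how keys are sent.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="voice-ai"):
    """Build the endpoint URL for one repository, escaping each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="voice-ai", timeout=10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_quality("PaddlePaddle", "PaddleSpeech")` issues the same GET request as the curl command above. With the keyless limit of 100 requests/day, caching the result locally is sensible when polling more than a few repositories.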
Related tools
k2-fsa/sherpa
Speech-to-text server framework with next-gen Kaldi
Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
yeyupiaoling/YeAudio
Audio tools for Python
zaigie/FunSpeech
Out-of-the-box speech service for local, private deployment; quickly set up FunASR and CosyVoice2/3 backends
manyeyes/ManySpeech
AI Speech Solutions for Tasks such as ASR, Vocal Extraction, Accompaniment Extraction, Audio...