PaddlePaddle/PaddleSpeech

Easy-to-use speech toolkit including self-supervised learning models, SOTA/streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. Won the NAACL 2022 Best Demo Award.

This toolkit helps you work with spoken language, allowing you to convert audio into written text, translate spoken English to Chinese, and generate natural-sounding speech from written text. It takes audio files or text as input and produces transcribed text, translated text, or synthetic speech. Anyone who needs to process or create speech, such as content creators, linguists, or call center managers, would find this useful.

12,556 stars. Actively maintained with 3 commits in the last 30 days. Available on PyPI.

Use this if you need to quickly transcribe audio, translate spoken content, or create realistic voiceovers from text.

Not ideal if your primary need is advanced audio editing, music production, or highly specialized sound analysis beyond speech processing.

speech-to-text text-to-speech audio-translation voice-generation language-processing

Maintenance 16 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 23 / 25

Stars: 12,556

Forks: 1,956

Language: Python

License: Apache-2.0

Recent Releases

r1.5.0 (05 Mar 2025)
r1.4.2 (27 Jun 2024)
r1.4.1 (14 Apr 2023)
r1.4.0 (15 Mar 2023)
r1.3.0 (14 Dec 2022)

Related tools

k2-fsa/sherpa

Speech-to-text server framework with next-gen Kaldi

Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

yeyupiaoling/YeAudio

Audio tools for Python

zaigie/FunSpeech

Out-of-the-box speech service for local, private deployment; quickly set up FunASR and CosyVoice2/3 backends

manyeyes/ManySpeech

AI Speech Solutions for Tasks such as ASR, Vocal Extraction, Accompaniment Extraction, Audio...
