huawei-noah/Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

/ 100

Emerging

This project offers advanced speech technology models, including Grad-TTS for creating natural-sounding speech from text, SPIRAL for self-supervised speech representation learning, and DiffVC for converting one person's voice to another while preserving content. It's designed for researchers and engineers working on speech synthesis and voice manipulation.

602 stars. No commits in the last 6 months.

Use this if you are a speech researcher or engineer developing new text-to-speech systems or voice conversion applications.

Not ideal if you need a ready-to-use speech application or an SDK for general audio processing tasks.

speech-synthesis voice-conversion audio-generation machine-learning-research speech-technology

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 25 / 25

Stars: 602

Forks: 130

Language: Jupyter Notebook

License: none

Featured in

Things AI Won't Tell You About Building a Voice App

Choosing a Voice AI Library in 2026: What's Actually Worth Building On

Higher-rated alternatives

Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

cmusphinx/pocketsphinx

A small speech recognizer

tensorflow/lingvo

Lingvo

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...

PyThaiNLP/pythaiasr

Python Thai Automatic Speech Recognition
