huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
This project provides three speech models: Grad-TTS for generating natural-sounding speech from text, SPIRAL for self-supervised speech representation learning, and DiffVC for converting one speaker's voice to another's while preserving the spoken content. It is aimed at researchers and engineers working on speech synthesis and voice conversion.
602 stars. No commits in the last 6 months.
Use this if you are a speech researcher or engineer developing new text-to-speech systems or voice conversion applications.
Not ideal if you need a ready-to-use speech application or an SDK for general audio processing tasks.
Stars: 602
Forks: 130
Language: Jupyter Notebook
License: none listed
Category:
Last pushed: Sep 18, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/huawei-noah/Speech-Backbones"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
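The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the path segments after `/quality/` are a category, a repository owner, and a repository name (inferred from the curl example above), and that the response body is JSON, which this page does not confirm. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the curl example.

    Assumption: the three path segments are category/owner/repo.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and parse the body.

    Assumption: the endpoint returns JSON; adjust the parsing if it does not.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("voice-ai", "huawei-noah", "Speech-Backbones"))
```

Keyless access is limited to 100 requests/day, so it is worth caching the result rather than calling `fetch_quality` in a loop.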
Higher-rated alternatives
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
cmusphinx/pocketsphinx
A small speech recognizer
tensorflow/lingvo
Lingvo
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...
PyThaiNLP/pythaiasr
Python Thai Automatic Speech Recognition