kaituoxu/Speech-Transformer
A PyTorch implementation of Speech Transformer, an end-to-end ASR model built on a Transformer network, for Mandarin Chinese.
This tool helps developers working with Mandarin Chinese speech convert audio recordings directly into written text using a Transformer-based neural network. It takes acoustic features derived from speech audio as input and outputs the corresponding character sequence. It is aimed primarily at machine learning engineers and researchers who develop and evaluate automatic speech recognition (ASR) systems.
809 stars. No commits in the last 6 months.
Use this if you are a developer researching or building an end-to-end Automatic Speech Recognition (ASR) system for Mandarin Chinese and want to use a Transformer-based architecture.
Not ideal if you are looking for a ready-to-use application to transcribe Mandarin Chinese audio without needing to dive into machine learning model training and development.
Stars: 809
Forks: 196
Language: Python
License: none listed
Category:
Last pushed: Apr 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/kaituoxu/Speech-Transformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
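For readers who would rather call the endpoint from Python than from curl, the request above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single curl example, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL shown on the page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the body is JSON; if it is not, json.loads raises an error.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make a real request.
    print(quality_url("voice-ai", "kaituoxu", "Speech-Transformer"))
```

Keeping URL construction separate from the network call makes it easy to check the request shape without spending any of the daily request quota.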
Higher-rated alternatives
TensorSpeech/TensorFlowASR
TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....
dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
srvk/eesen
The official repository of the Eesen project