kaituoxu/Speech-Transformer

A PyTorch implementation of Speech-Transformer, an end-to-end automatic speech recognition (ASR) model based on the Transformer network, for Mandarin Chinese.

/ 100

Emerging

This tool transcribes Mandarin Chinese speech into written text using a Transformer-based neural network. It takes acoustic features derived from speech audio as input and outputs the corresponding character sequence. It is aimed primarily at machine learning engineers and researchers who develop and evaluate automatic speech recognition (ASR) systems.
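The features-in, characters-out flow described above can be illustrated with a minimal sketch. This is not the repository's decoding code: the vocabulary, the `greedy_decode` and `toy_step_fn` names, the 80-dimensional feature size, and the use of greedy search are all assumptions made for illustration, and the "model" is a stub that emits a fixed sequence.

```python
from typing import Callable, List, Sequence

# Hypothetical toy vocabulary; a real system would build it from training transcripts.
VOCAB = ["<sos>", "<eos>", "你", "好", "世", "界"]
SOS_ID, EOS_ID = 0, 1

Frame = Sequence[float]
# A step function scores every vocabulary entry given the features and the prefix so far.
StepFn = Callable[[Sequence[Frame], List[int]], List[float]]


def greedy_decode(features: Sequence[Frame], step_fn: StepFn, max_len: int = 20) -> str:
    """Autoregressively pick the best-scoring character until <eos> or max_len."""
    prefix = [SOS_ID]
    for _ in range(max_len):
        scores = step_fn(features, prefix)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == EOS_ID:
            break
        prefix.append(next_id)
    # Drop the leading <sos> and map ids back to characters.
    return "".join(VOCAB[i] for i in prefix[1:])


def toy_step_fn(features: Sequence[Frame], prefix: List[int]) -> List[float]:
    """Stand-in for a trained model: ignores the features and emits a fixed sequence."""
    target = [2, 3, 4, 5, EOS_ID]
    position = len(prefix) - 1  # number of characters emitted so far
    scores = [0.0] * len(VOCAB)
    scores[target[min(position, len(target) - 1)]] = 1.0
    return scores


# Assumed input shape: 100 frames of 80-dimensional acoustic features.
features = [[0.0] * 80 for _ in range(100)]
text = greedy_decode(features, toy_step_fn)
print(text)
```

In a real system the step function would be the trained encoder-decoder, and beam search is a common replacement for the greedy choice shown here.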

809 stars. No commits in the last 6 months.

Use this if you are a developer researching or building an end-to-end Automatic Speech Recognition (ASR) system for Mandarin Chinese and want to use a Transformer-based architecture.

Not ideal if you are looking for a ready-to-use application to transcribe Mandarin Chinese audio without needing to dive into machine learning model training and development.

automatic-speech-recognition mandarin-chinese natural-language-processing deep-learning-research audio-transcription

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 25 / 25


Stars: 809

Forks: 196

Language: Python

License: none

Higher-rated alternatives

TensorSpeech/TensorFlowASR

:zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

srvk/eesen

The official repository of the Eesen project
