Audio-WestlakeU/UMA-ASR

This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).

Emerging

This project helps speech recognition engineers convert spoken audio into written text more accurately and efficiently. It takes in audio recordings or live speech streams and outputs transcribed text. This tool is designed for researchers and engineers working on improving automatic speech recognition (ASR) systems.

No commits in the last 6 months.

Use this if you are developing or evaluating new automatic speech recognition systems, particularly for Chinese datasets, and want to improve the accuracy and speed of transcriptions.

Not ideal if you are looking for an off-the-shelf ASR application for general use or if you are not familiar with customizing speech recognition frameworks like ESPnet.

speech-to-text audio-transcription voice-recognition ASR-development natural-language-processing

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 15 / 25

Language: Shell

License: none

Featured in

Things AI Won't Tell You About Building a Voice App

Choosing a Voice AI Library in 2026: What's Actually Worth Building On

Higher-rated alternatives

Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

cmusphinx/pocketsphinx

A small speech recognizer

tensorflow/lingvo

Lingvo

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...

PyThaiNLP/pythaiasr

Python Thai Automatic Speech Recognition
