Audio-WestlakeU/UMA-ASR
This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).
This project helps speech recognition engineers convert spoken audio into written text more accurately and efficiently. It takes in audio recordings or live speech streams and outputs transcribed text. This tool is designed for researchers and engineers working on improving automatic speech recognition (ASR) systems.
No commits in the last 6 months.
Use this if you are developing or evaluating new automatic speech recognition systems, particularly for Chinese datasets, and want to improve the accuracy and speed of transcriptions.
Not ideal if you are looking for an off-the-shelf ASR application for general use or if you are not familiar with customizing speech recognition frameworks like ESPnet.
Stars: 35
Forks: 6
Language: Shell
License: not listed
Category: not listed
Last pushed: Dec 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Audio-WestlakeU/UMA-ASR"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
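The curl request above can be wrapped in a small shell helper so the same call works for other repositories. This is a minimal sketch, not an official client: the function names `quality_url` and `fetch_quality` are hypothetical, and treating the `voice-ai` path segment as a category slug is an assumption drawn only from the shape of the URL shown here. The response format is not documented on this page, so the helper prints the raw body without parsing it.

```shell
#!/bin/sh
# Hypothetical helpers; the URL layout is inferred from the single example above.

# Build the endpoint URL from a category slug (assumed), repo owner, and repo name.
quality_url() {
  printf 'https://pt-edge.onrender.com/api/v1/quality/%s/%s/%s\n' "$1" "$2" "$3"
}

# Fetch the raw response body.
# --fail makes curl return a non-zero status on HTTP errors (for example,
# when the request is rejected), --silent/--show-error hide the progress
# meter but keep error messages, and --max-time bounds the wait.
fetch_quality() {
  curl --fail --silent --show-error --max-time 10 "$(quality_url "$1" "$2" "$3")"
}

# Usage (performs a network request, so it is left commented out):
# fetch_quality voice-ai Audio-WestlakeU UMA-ASR
```

Because keyless access is limited to 100 requests/day, checking the exit status of `fetch_quality` before using its output is a reasonable precaution.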
Higher-rated alternatives
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
cmusphinx/pocketsphinx
A small speech recognizer
tensorflow/lingvo
Lingvo
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...
PyThaiNLP/pythaiasr
Python Thai Automatic Speech Recognition