Audio-WestlakeU/UMA-ASR

This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).

Score: 30 / 100 (Emerging)

This project helps speech recognition engineers convert spoken audio into written text more accurately and efficiently. It takes in audio recordings or live speech streams and outputs transcribed text. This tool is designed for researchers and engineers working on improving automatic speech recognition (ASR) systems.

No commits in the last 6 months.

Use this if you are developing or evaluating new automatic speech recognition systems, particularly for Chinese datasets, and want to improve the accuracy and speed of transcriptions.

Not ideal if you are looking for an off-the-shelf ASR application for general use or if you are not familiar with customizing speech recognition frameworks like ESPnet.

speech-to-text audio-transcription voice-recognition ASR-development natural-language-processing
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 15 / 25


Stars: 35
Forks: 6
Language: Shell
License: None
Last pushed: Dec 17, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Audio-WestlakeU/UMA-ASR"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
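The request above can be wrapped in a small helper for reuse across repositories. This is a minimal sketch, not an official client. The names `API_BASE`, `quality_url`, and `fetch_quality` are hypothetical, and the `<category>/<owner>/<repo>` path layout is an assumption inferred from the single example URL on this page. The response format is not documented here, so the sketch passes the body through unmodified.

```shell
#!/usr/bin/env bash
# Base URL copied from the curl example on this page.
API_BASE="https://pt-edge.onrender.com/api/v1/quality"

# Build the endpoint URL for one repository.
# Assumption: the path is <category>/<owner>/<repo>, as in the example URL.
quality_url() {
  printf '%s/%s/%s/%s\n' "$API_BASE" "$1" "$2" "$3"
}

# Fetch the record and print the raw response body.
# -f: exit non-zero on HTTP error statuses (400 and above)
# -s: hide the progress meter; -S: still print error messages
fetch_quality() {
  curl -fsS "$(quality_url "$1" "$2" "$3")"
}
```

For example, `fetch_quality voice-ai Audio-WestlakeU UMA-ASR` issues the same request as the curl command above. Because of the `-f` flag, a script can check the exit status to detect failed requests instead of parsing an error body.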