RF5/transfusion-asr

Transcribing Speech with Multinomial Diffusion, training code and models.

/ 100

Emerging

This project provides training code and models for converting spoken audio into written text using multinomial diffusion. You provide audio files (such as recordings of meetings or interviews), and it outputs a text transcription. It is aimed at researchers, data scientists, and others who work with large volumes of speech data and need to generate transcripts automatically.

No commits in the last 6 months.

Use this if you need to transcribe speech from audio files into text for research or data analysis and are prepared to set up the models yourself.

Not ideal if you want an off-the-shelf transcription service that requires no technical setup, or if your main goal is real-time transcription.

speech-to-text audio-transcription natural-language-processing data-labeling computational-linguistics

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 8 / 25


Stars

Forks

Language

Python

License

—

Higher-rated alternatives

TensorSpeech/TensorFlowASR

:zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

srvk/eesen

The official repository of the Eesen project
