khanld/chunkformer

ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

Established

This project helps audio engineers, researchers, and transcription service providers convert very long audio recordings into text. It takes audio files up to 16 hours long and outputs accurate written transcripts with timestamps. It is designed for users who need to process extensive speech data without high-end graphics cards.

Available on PyPI.

Use this if you need to transcribe extremely long audio files (up to 16 hours) on GPUs with limited memory, or process large batches of audio efficiently.
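
As a rough illustration of why chunked processing keeps memory bounded on long recordings, the sketch below splits a long frame sequence into fixed-size chunks, each with a little left and right context. This is not ChunkFormer's API or its masking scheme; the names `Chunk` and `iter_chunks` and all parameter values are hypothetical and chosen only for the example.

```python
from typing import Iterator, NamedTuple


class Chunk(NamedTuple):
    start: int      # first frame of the chunk proper
    end: int        # one past the last frame of the chunk proper
    ctx_start: int  # first frame once left context is included
    ctx_end: int    # one past the last frame once right context is included


def iter_chunks(total_frames: int, chunk_size: int,
                left_context: int, right_context: int) -> Iterator[Chunk]:
    """Yield fixed-size chunks, each with context clipped to the sequence bounds."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_frames, chunk_size):
        end = min(start + chunk_size, total_frames)
        yield Chunk(
            start=start,
            end=end,
            ctx_start=max(0, start - left_context),
            ctx_end=min(total_frames, end + right_context),
        )


# Hypothetical sizes: 1000 frames, chunks of 256 with 64 frames of context per side.
chunks = list(iter_chunks(total_frames=1000, chunk_size=256,
                          left_context=64, right_context=64))

# The largest window a model would see at once is bounded by
# chunk_size + left_context + right_context, regardless of total length.
largest_window = max(c.ctx_end - c.ctx_start for c in chunks)
```

Because each window is capped in size, peak memory depends on the chunk and context sizes rather than on the length of the recording, which is the general property that makes very long inputs feasible on smaller GPUs.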

Not ideal if your primary need is real-time, ultra-low-latency transcription for short audio clips or if you specifically require a web-based, no-code solution.

speech-to-text audio-transcription long-form-audio voice-processing linguistics-research

Maintenance 10 / 25

Adoption 9 / 25

Maturity 25 / 25

Community 19 / 25

Language

Python

License

—

Related tools

sooftware/conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech...

upskyy/Squeezeformer

PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech...

WindQAQ/listen-attend-and-spell

Tensorflow implementation of "Listen, Attend and Spell" authored by William Chan. This project...

jackaduma/LAS_Mandarin_PyTorch

Listen, attend and spell Model and a Chinese Mandarin Pretrained model (中文-普通话 ASR模型)

kaituoxu/Listen-Attend-Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.
