khanld/chunkformer
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
This project helps audio engineers, researchers, and transcription service providers convert very long audio recordings into text. It takes audio files up to 16 hours long and outputs written transcripts, with timestamps if needed. It is designed for users who need to process extensive speech data without high-end graphics cards.
Available on PyPI.
Use this if you need to transcribe extremely long audio files (up to 16 hours) on GPUs with limited memory, or process large batches of audio efficiently.
Not ideal if your primary need is real-time, ultra-low-latency transcription for short audio clips or if you specifically require a web-based, no-code solution.
Stars: 78
Forks: 21
Language: Python
License: (not listed)
Category: (not listed)
Last pushed: Feb 13, 2026
Commits (30d): 0
Dependencies: 18
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/khanld/chunkformer"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
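The same request can be made from Python, the project's own language. This is a minimal sketch using only the standard library. It assumes that the endpoint returns JSON and that the URL follows a `category/owner/repo` pattern, which is inferred from the single curl example above rather than from any API documentation. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, generalized from the
    one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    json.JSONDecodeError (a ValueError subclass).
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "khanld", "chunkformer")` performs the same GET request as the curl command. Without a key, each call counts toward the 100 requests/day limit, so caching the result locally is worth considering for repeated lookups.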
Related tools
sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech...
upskyy/Squeezeformer
PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech...
WindQAQ/listen-attend-and-spell
Tensorflow implementation of "Listen, Attend and Spell" authored by William Chan. This project...
jackaduma/LAS_Mandarin_PyTorch
Listen, attend and spell Model and a Chinese Mandarin Pretrained model (中文-普通话 ASR模型)
kaituoxu/Listen-Attend-Spell
A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.