khanld/chunkformer

ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

Score 63 / 100 (Established)

This project helps audio engineers, researchers, and transcription service providers convert very long audio recordings into text. It takes audio files up to 16 hours long and outputs written transcripts with timestamps. It is designed for users who need to process large amounts of speech data without high-end graphics cards.

Available on PyPI.

Use this if you need to transcribe extremely long audio files (up to 16 hours) on GPUs with limited memory, or process large batches of audio efficiently.

Not ideal if your primary need is real-time, ultra-low-latency transcription for short audio clips or if you specifically require a web-based, no-code solution.

speech-to-text audio-transcription long-form-audio voice-processing linguistics-research
Maintenance 10 / 25
Adoption 9 / 25
Maturity 25 / 25
Community 19 / 25
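
The four category scores above add up to the overall score. A minimal sketch of that arithmetic, assuming the overall figure is a plain unweighted sum of the categories (the listing does not state the formula; the names `category_scores`, `overall`, and `maximum` are illustrative only):

```python
# Category scores as listed above: (points earned, points possible).
category_scores = {
    "Maintenance": (10, 25),
    "Adoption": (9, 25),
    "Maturity": (25, 25),
    "Community": (19, 25),
}

# Assumption: the overall score is the unweighted sum of the categories.
overall = sum(earned for earned, _ in category_scores.values())
maximum = sum(possible for _, possible in category_scores.values())

print(f"{overall} / {maximum}")  # prints "63 / 100"
```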


Stars 78
Forks 21
Language Python
License (not listed)
Last pushed Feb 13, 2026
Commits (30d) 0
Dependencies 18

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/khanld/chunkformer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
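
The same request can be made from Python. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repository path layout is inferred from the single URL shown above, and the body is returned as raw text because the listing does not document the response format:

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an 'owner/name' repository."""
    # Assumption: the path is <category>/<owner>/<name>, as in the curl example.
    # The slash in 'owner/name' is kept unescaped so it stays a path separator.
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Send a GET request and return the response body as text."""
    with urlopen(quality_url(category, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "khanld/chunkformer")` issues the same GET request as the curl command and counts against the same daily request limit.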