hitz-zentroa/whisper-lm-transformers
Add n-gram and LLM language model support to HF Transformers Whisper models.
This project improves the accuracy of speech-to-text transcription, especially for languages with fewer data resources. Given an audio file, a pre-trained Whisper model, and a custom language model (either an n-gram model or a large language model), it produces more accurate text transcripts. Researchers, linguists, and engineers working on automatic speech recognition, particularly for less common languages, will find this tool useful.
No commits in the last 6 months. Available on PyPI.
Use this if you need to enhance the transcription quality of an existing Whisper speech-to-text model, especially for specific languages or domains where standard models might struggle.
Not ideal if you simply need basic speech-to-text transcription without needing to fine-tune or improve accuracy with custom language models.
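The core idea, combining an acoustic model's hypothesis scores with an external language model's scores (often called shallow fusion), can be sketched with a self-contained toy example. Everything below is illustrative: the bigram LM stands in for the KenLM or LLM scorers the project plugs in, and the function names, `alpha`/`beta` weights, and Basque sample sentences are assumptions, not this library's actual API.

```python
import math

def train_bigram(corpus):
    """Count bigrams in a tiny corpus (stand-in for a real n-gram LM)."""
    counts, context = {}, {}
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
            context[a] = context.get(a, 0) + 1
    return counts, context

def lm_logprob(counts, context, sentence, vocab_size=100):
    """Log-probability of a sentence under the bigram LM, add-one smoothed."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    logp = 0.0
    for a, b in zip(tokens, tokens[1:]):
        logp += math.log((counts.get((a, b), 0) + 1) /
                         (context.get(a, 0) + vocab_size))
    return logp

def rescore(nbest, counts, context, alpha=0.5, beta=0.1):
    """Shallow fusion: total = acoustic + alpha * LM + beta * word count."""
    def total(hyp):
        text, am_logprob = hyp
        return (am_logprob
                + alpha * lm_logprob(counts, context, text)
                + beta * len(text.split()))
    return max(nbest, key=total)

# Tiny training corpus for the toy LM (Basque sample, purely illustrative).
corpus = ["kaixo mundua", "kaixo lagunak", "agur mundua"]
counts, context = train_bigram(corpus)

# An n-best list as Whisper might produce it: (hypothesis, acoustic log-prob).
# The acoustically top-ranked hypothesis contains a misspelling; the LM
# pushes the correct spelling to the top.
nbest = [("kaixo mondua", -4.0), ("kaixo mundua", -4.2)]
best = rescore(nbest, counts, context)
print(best[0])  # → kaixo mundua
```

A real deployment scores full Whisper beam-search hypotheses with a trained KenLM or causal LLM instead of this toy bigram model, and tunes `alpha` and `beta` on a development set, but the fusion arithmetic is the same.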
Stars
14
Forks
2
Language
Python
License
Apache-2.0
Category
Last pushed
May 06, 2025
Commits (30d)
0
Dependencies
8
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/hitz-zentroa/whisper-lm-transformers"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for *Build an LLM from Scratch* (Korean edition, Gilbut, 2025)