kmario23/KenLM-training

Training an n-gram based language model with the KenLM toolkit for Deep Speech 2

36 / 100 (Emerging)

This project helps speech recognition and natural language processing engineers train an n-gram based language model using the KenLM toolkit. You provide a large text corpus, and it produces a language model file that can be used to score sentences based on their likelihood. This is primarily for engineers working on speech-to-text systems or other text prediction tasks.
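To make "scoring sentences based on their likelihood" concrete, here is a minimal sketch of a bigram model in plain Python. It is not KenLM and not this repository's code: it uses simple add-one smoothing, whereas KenLM estimates models with modified Kneser-Ney smoothing. The function names `train_bigram` and `score` and the two-sentence corpus are illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def score(sentence, unigrams, bigrams):
    """Return log10 P(sentence) under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from getting zero probability.
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log10(prob)
    return total


# Toy corpus (hypothetical); a real model needs a large text corpus.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram(corpus)

likely = score("the cat sat on the rug", unigrams, bigrams)
unlikely = score("rug the on sat cat the", unigrams, bigrams)
print(likely > unlikely)
```

A word order that resembles the training text gets a higher (less negative) log probability than a scrambled one, which is the property a speech recognizer relies on when it ranks candidate transcriptions.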

116 stars. No commits in the last 6 months.

Use this if you need to create a custom language model from your own domain-specific text data for applications like speech recognition.
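As a rough sketch of what training on your own text could look like, the snippet below builds the two command lines typically used with KenLM: `lmplz` to estimate an ARPA model and `build_binary` to convert it to a faster binary format. The helper `kenlm_commands`, the file names, and the default order are assumptions for illustration; the repository's own scripts may use different options. Nothing is executed here, so it runs even without KenLM installed.

```python
import shutil


def kenlm_commands(corpus_path, order=3, arpa_path="model.arpa",
                   binary_path="model.binary"):
    """Build (but do not run) the KenLM training and conversion commands."""
    # lmplz estimates an n-gram model of the given order from plain text.
    train = ["lmplz", "-o", str(order), "--text", corpus_path, "--arpa", arpa_path]
    # build_binary converts the ARPA file into KenLM's binary format.
    convert = ["build_binary", arpa_path, binary_path]
    return train, convert


# Hypothetical corpus file name; replace with your own domain text.
train_cmd, convert_cmd = kenlm_commands("corpus.txt", order=5)

# Check whether the KenLM binaries are on PATH before trying to run anything.
kenlm_available = shutil.which("lmplz") is not None
print(" ".join(train_cmd))
print(" ".join(convert_cmd))
```

If `kenlm_available` is true, each list can be passed to `subprocess.run(cmd, check=True)` to run the step.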

Not ideal if you want a ready-to-use, pre-trained language model that requires no coding or command-line work.

Speech Recognition, Natural Language Processing, Text Analysis, Machine Translation, Computational Linguistics
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 18 / 25
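
The four category scores add up to the overall 36 / 100 figure. The check below shows the arithmetic; that the overall score is a plain sum of the categories is an assumption drawn from these numbers, not a documented formula.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {"Maintenance": 0, "Adoption": 10, "Maturity": 8, "Community": 18}

# Assumption: the overall score is the plain sum of the four categories.
overall = sum(category_scores.values())
max_score = 25 * len(category_scores)
print(f"{overall} / {max_score}")
```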


Stars: 116
Forks: 21
Language: not listed
License: none
Last pushed: May 20, 2019
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kmario23/KenLM-training"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
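
The same request can be made from Python with only the standard library. This is a sketch under two assumptions: that the URL path follows a `category/owner/repo` pattern, as the curl example suggests, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL; the category/owner/repo layout is assumed."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the quality data, assuming the response body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = quality_url("nlp", "kmario23/KenLM-training")
print(url)
```

Calling `fetch_quality("nlp", "kmario23/KenLM-training")` performs the network request; with no key, keep usage within the 100 requests/day limit.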