prigarg/Bigram-Language-Model-from-Scratch
A Bigram Language Model from scratch, with no smoothing and with add-one smoothing. Outputs bigram counts, bigram probabilities, and the probability of a test sentence.
This tool helps computational linguists, NLP students, or researchers understand how frequently word pairs appear in a large text and predict the likelihood of a sentence. You provide a body of text (your 'training corpus') and a sentence you want to analyze, and it outputs the counts and probabilities of word pairs, plus the overall probability of your test sentence.
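The core idea can be sketched in a few lines of Python (a minimal, hypothetical sketch, not the repository's actual code): count adjacent word pairs in the corpus, divide each pair count by the count of its first word to get P(w2 | w1), then multiply those conditional probabilities across the test sentence.

```python
from collections import Counter

def bigram_probs(corpus_tokens):
    # Count unigrams and adjacent word pairs (bigrams) in the corpus.
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    # P(w2 | w1) = count(w1, w2) / count(w1)  (no smoothing)
    return {pair: c / unigrams[pair[0]] for pair, c in bigrams.items()}

def sentence_prob(sentence_tokens, probs):
    # Multiply the conditional probability of each adjacent pair;
    # with no smoothing, any unseen pair makes the whole product 0.
    p = 1.0
    for pair in zip(sentence_tokens, sentence_tokens[1:]):
        p *= probs.get(pair, 0.0)
    return p

corpus = "the cat sat on the mat the cat ran".split()
probs = bigram_probs(corpus)
print(probs[("the", "cat")])                     # 2/3: "the cat" twice, "the" three times
print(sentence_prob("the cat sat".split(), probs))  # 2/3 * 1/2 = 1/3
```

Real implementations, including this one, typically also add start/end-of-sentence markers so the first and last words contribute bigrams.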
No commits in the last 6 months.
Use this if you need to quickly calculate bigram statistics and sentence probabilities from a text corpus using either basic or 'add-one' smoothing techniques.
Not ideal if you require more advanced language modeling techniques beyond bigrams or need to handle very sparse data more robustly than 'add-one' smoothing allows.
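Add-one (Laplace) smoothing addresses the zero-probability problem by pretending every possible bigram was seen once more than it actually was. A minimal sketch of the formula, assuming V is the vocabulary size (this is the standard technique, not code from the repository):

```python
from collections import Counter

def addone_bigram_prob(corpus_tokens, w1, w2):
    # Laplace (add-one) smoothing:
    #   P(w2 | w1) = (count(w1, w2) + 1) / (count(w1) + V)
    # where V is the vocabulary size. Unseen pairs now get a small
    # nonzero probability instead of zeroing out the sentence.
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    V = len(unigrams)
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

corpus = "the cat sat on the mat the cat ran".split()
print(addone_bigram_prob(corpus, "the", "cat"))  # (2+1)/(3+6) = 1/3
print(addone_bigram_prob(corpus, "the", "ran"))  # (0+1)/(3+6) = 1/9, not 0
```

The trade-off, as noted above, is that add-one smoothing shifts a lot of probability mass to unseen events, which is why sparser data usually calls for better-behaved methods such as Kneser-Ney.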
Stars
15
Forks
1
Language
Jupyter Notebook
License
—
Category
—
Last pushed
Jan 12, 2021
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/prigarg/Bigram-Language-Model-from-Scratch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
nlx-group/overlapy
Python package developed to evaluate textual overlap (N-Grams) between two volumes of text.
joshualoehr/ngram-language-model
Python implementation of an N-gram language model with Laplace smoothing and sentence generation.
MannarAmuthan/kural-gen
KuralGen generates Thirukkural for a given English sentence
phughesmcr/SimpleNGrams
The easiest way to get n-grams from strings!
SpydazWebAI-NLP/BasicLanguageModelling2023
Basic Language Models, Bag of Words, Ngram Models, etc. NLP modelling and associated tasks