prigarg/Bigram-Language-Model-from-Scratch

A bigram language model from scratch, with no smoothing and add-one smoothing. Outputs bigram counts, bigram probabilities, and the probability of a test sentence.

Score: 19 / 100 (Experimental)

This tool helps computational linguists, NLP students, or researchers understand how frequently word pairs appear in a large text and predict the likelihood of a sentence. You provide a body of text (your 'training corpus') and a sentence you want to analyze, and it outputs the counts and probabilities of word pairs, plus the overall probability of your test sentence.
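To illustrate what such a tool computes, here is a minimal bigram model in Python with optional add-one smoothing. This is a sketch of the general technique, not the repository's actual code; the function and variable names are illustrative.

```python
from collections import Counter

def train_bigram(corpus_tokens):
    """Count unigrams and adjacent word pairs (bigrams) in a token list."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    return unigrams, bigrams

def sentence_probability(sentence_tokens, unigrams, bigrams, vocab_size,
                         smoothing=True):
    """Product of P(w_i | w_{i-1}) over the sentence.

    With add-one smoothing, each bigram count is incremented by 1 and the
    denominator grows by the vocabulary size, so unseen pairs get a small
    nonzero probability instead of zeroing out the whole sentence.
    """
    prob = 1.0
    for w1, w2 in zip(sentence_tokens, sentence_tokens[1:]):
        if smoothing:
            prob *= (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size)
        else:
            count = unigrams[w1]
            prob *= bigrams[(w1, w2)] / count if count else 0.0
    return prob
```

For example, training on "the cat sat on the mat" gives an unsmoothed P(cat | the) of 1/2, while add-one smoothing lowers it to 2/7 because probability mass is redistributed to unseen pairs.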

No commits in the last 6 months.

Use this if you need to quickly calculate bigram statistics and sentence probabilities from a text corpus using either basic or 'add-one' smoothing techniques.

Not ideal if you require more advanced language modeling techniques beyond bigrams or need to handle very sparse data more robustly than 'add-one' smoothing allows.

computational-linguistics natural-language-processing text-analysis language-modeling corpus-linguistics
No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 5 / 25


Stars: 15
Forks: 1
Language: Jupyter Notebook
License: None
Last pushed: Jan 12, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/prigarg/Bigram-Language-Model-from-Scratch"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.