sean-chester/generalised-brown

C++ implementation of Generalised Brown clustering, with Python scripts for feature generation


Experimental

This tool helps researchers and natural language processing practitioners group similar words together based on how they're used in text. You provide a text corpus, and it generates lists of words that belong in the same "cluster." This is useful for understanding word relationships or preparing data for other language models.

No commits in the last 6 months.

Use this if you need to create word clusters with flexible granularity, allowing you to choose how many clusters you want to generate from a pre-computed merge list.

Not ideal if you're looking for a simple, out-of-the-box solution that doesn't require compiling C++ code or running Python scripts via the command line.
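
The idea of choosing a cluster count from a pre-computed merge list can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `cut_merge_list`, the in-memory merge format (an ordered list of word pairs), and the sample vocabulary are all assumptions made for the example, and the tool's actual file format may differ.

```python
from collections import defaultdict


def cut_merge_list(words, merges, k):
    """Replay an ordered merge list until only k clusters remain.

    Hypothetical format: `merges` is a list of (left, right) word pairs,
    listed in the order the clustering algorithm performed them.
    """
    if not 1 <= k <= len(words):
        raise ValueError("k must be between 1 and the number of words")

    # Union-find: every word starts as the root of its own cluster.
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    remaining = len(words)
    for left, right in merges:
        if remaining <= k:
            break  # reached the requested granularity
        a, b = find(left), find(right)
        if a != b:
            parent[b] = a
            remaining -= 1

    if remaining > k:
        raise ValueError("merge list is too short to reach k clusters")

    # Group words by their cluster root, preserving vocabulary order.
    clusters = defaultdict(list)
    for w in words:
        clusters[find(w)].append(w)
    return list(clusters.values())


# Made-up vocabulary and merge order, for illustration only.
words = ["cat", "dog", "car", "bus", "run"]
merges = [("cat", "dog"), ("car", "bus"), ("cat", "run"), ("cat", "car")]

three = cut_merge_list(words, merges, 3)
two = cut_merge_list(words, merges, 2)
```

Because the merge list is computed once, calling the function again with a different `k` yields a coarser or finer clustering without rerunning the clustering itself.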

natural-language-processing computational-linguistics text-analysis feature-engineering semantic-similarity

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 11 / 25


Language: C++

License: none

Higher-rated alternatives

MIND-Lab/OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models...

i-dot-ai/themefinder

A topic modelling Python package for analysing one-to-many question-answer data.

andifunke/topic-labeling

The project proposes a framework to apply topic models on a text-corpus and eventually topic...

bab2min/tomotopy

Python package of Tomoto, the Topic Modeling Tool

bobxwu/TopMost

A Topic Modeling System Toolkit (ACL 2024 Demo)
