abhilash1910/ClusterTransformer

Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT base transformers from Hugging Face.

51 / 100 (Established)

This tool helps data scientists and NLP practitioners organize unstructured text data into meaningful groups. You input a list of sentences, and it outputs a structured dataset (a dataframe) that assigns each sentence to a specific topic or cluster. This is ideal for anyone who needs to identify underlying themes in large collections of text, like customer feedback or research papers.
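The sentences-in, cluster-assignments-out flow described above can be sketched in a few lines. This is a minimal illustration of threshold clustering on cosine similarity, not ClusterTransformer's own API or algorithm. The function names, the 0.8 threshold, and the hand-written three-dimensional vectors are all assumptions made for the example; in practice the embeddings would come from a Transformer model, and the rows would typically be loaded into a dataframe.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_threshold(sentences, embeddings, threshold=0.8):
    """Greedy clustering (hypothetical helper): a sentence joins the first
    cluster whose seed vector is similar enough, else it starts a new one."""
    seeds = []  # one representative vector per cluster
    rows = []   # dataframe-style records: one dict per sentence
    for sentence, vec in zip(sentences, embeddings):
        for cluster_id, seed in enumerate(seeds):
            if cosine_similarity(vec, seed) >= threshold:
                break
        else:
            # No existing cluster matched, so this sentence seeds a new one.
            seeds.append(vec)
            cluster_id = len(seeds) - 1
        rows.append({"sentence": sentence, "cluster": cluster_id})
    return rows


sentences = [
    "The delivery arrived two days late.",
    "My package took far too long to ship.",
    "The app crashes when I open settings.",
    "Settings screen freezes on launch.",
]
# Toy stand-ins for Transformer sentence embeddings (assumption).
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]

rows = cluster_by_threshold(sentences, toy_embeddings)
for row in rows:
    print(row["cluster"], row["sentence"])
```

With these toy vectors the two shipping complaints share one cluster and the two app complaints share another. Raising the threshold produces more, tighter clusters; lowering it merges them.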

No commits in the last 6 months. Available on PyPI.

Use this if you need to automatically categorize or find common themes within a collection of text data, without manually defining the categories beforehand.

Not ideal if you already have predefined categories for your text and simply need to classify new texts into those existing labels.

Natural Language Processing, Text Analytics, Data Science, Topic Modeling, Information Retrieval
Stale 6m, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 25 / 25
Community 18 / 25
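
The four subscores listed here add up to the overall 51 / 100. The page does not state how the total is derived, so treating it as a plain sum of four categories worth 25 points each is an assumption; the short check below only confirms that the numbers shown are consistent with that reading.

```python
# Subscores as listed on this page; each category is out of 25.
subscores = {"Maintenance": 0, "Adoption": 8, "Maturity": 25, "Community": 18}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the subscores.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")
```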


Stars: 44
Forks: 15
Language: Python
License:
Last pushed: Jun 11, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/abhilash1910/ClusterTransformer"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
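
The same request can be made from Python using only the standard library. This is a sketch under assumptions: `build_quality_url` and `fetch_quality` are hypothetical helper names, the path layout (category, owner, repository) is inferred from the single curl URL above, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """GET the quality data. Assumption: the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("embeddings", "abhilash1910", "ClusterTransformer")
print(url)

# Uncomment to perform the actual network request (counts toward the daily limit):
# data = fetch_quality("embeddings", "abhilash1910", "ClusterTransformer")
# print(data)
```

The fetch call is left commented out so that running the snippet does not use up any of the 100 keyless requests per day.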