MaartenGr/BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

71 / 100

Verified

BERTopic helps you understand the main themes within a large collection of text documents. You provide a dataset of text, like customer reviews, news articles, or research abstracts, and it outputs a list of topics, each defined by a few key words, along with the documents belonging to them. This is ideal for data analysts, researchers, or anyone needing to quickly grasp the core subjects in unstructured text.
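The "few key words" per topic come from the c-TF-IDF step named in the tagline. Below is a minimal pure-Python sketch of that scoring idea, not the library's own code. It assumes documents have already been grouped into topics (in BERTopic the grouping comes from embeddings and clustering) and uses naive whitespace tokenization. The function names `c_tf_idf` and `top_words`, and the sample data, are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Score words per topic with a class-based TF-IDF.

    Assumed formula: W[t, c] = tf[t, c] * log(1 + A / f[t]), where
    tf[t, c] is the normalized frequency of word t in topic c,
    f[t] is the frequency of t across all topics, and
    A is the average number of words per topic.
    """
    # Treat all documents of one topic as a single long "class document".
    class_counts = {
        topic: Counter(word for doc in docs for word in doc.lower().split())
        for topic, docs in docs_by_topic.items()
    }

    # f[t]: how often each word occurs across every topic.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)

    # A: average number of words per topic.
    avg_words = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for topic, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[topic] = {
            word: (count / n_words) * math.log(1 + avg_words / total_counts[word])
            for word, count in counts.items()
        }
    return scores


def top_words(scores, n=3):
    """Return the n highest-scoring words per topic (ties broken alphabetically)."""
    return {
        topic: [w for w, _ in sorted(ws.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for topic, ws in scores.items()
    }


# Hypothetical, hand-grouped sample data.
docs_by_topic = {
    "billing": ["refund for the late invoice", "the invoice had a billing error"],
    "shipping": ["the package arrived late", "tracking for the package was wrong"],
}

print(top_words(c_tf_idf(docs_by_topic), n=1))
# → {'billing': ['invoice'], 'shipping': ['package']}
```

Words that are frequent inside one topic but rare across the others score highest, which is why "invoice" and "package" outrank a shared word like "the". With the package installed, the usual entry point is `BERTopic().fit_transform(docs)`, which handles embedding, clustering, and keyword scoring in one call.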

7,443 stars. Used by 5 other packages. Available on PyPI.

Use this if you need to automatically discover and interpret key themes and sub-topics from a body of text without extensive manual reading or prior knowledge of the subjects.

Not ideal if you're dealing with very small text datasets or if your primary need is for fine-grained sentiment analysis rather than broad topic discovery.

text-analysis market-research document-categorization content-discovery qualitative-analysis

Maintenance 10 / 25

Adoption 15 / 25

Maturity 25 / 25

Community 21 / 25

Stars: 7,443

Forks: 882

Language: Python

License: MIT

Related models

webis-de/small-text: Active Learning for Text Classification in Python

mead-ml/mead-baseline: Deep-Learning Model Exploration and Development for NLP

x-tabdeveloping/turftopic: Robust and fast topic models with sentence-transformers.

HumanSignal/label-studio-transformers: Label data using HuggingFace's transformers and automatically get a prediction service

hiyouga/Dual-Contrastive-Learning: Code for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation"