MaartenGr/BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
BERTopic helps you understand the main themes within a large collection of text documents. You provide a dataset of text, like customer reviews, news articles, or research abstracts, and it outputs a list of topics, each defined by a few key words, along with the documents belonging to them. This is ideal for data analysts, researchers, or anyone needing to quickly grasp the core subjects in unstructured text.
7,443 stars. Used by 5 other packages. Available on PyPI.
Use this if you need to automatically discover and interpret key themes and sub-topics from a body of text without extensive manual reading or prior knowledge of the subjects.
Not ideal if you're dealing with very small text datasets or if your primary need is for fine-grained sentiment analysis rather than broad topic discovery.
Stars: 7,443
Forks: 882
Language: Python
License: MIT
Category:
Last pushed: Feb 20, 2026
Commits (30d): 0
Dependencies: 9
Reverse dependents: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/MaartenGr/BERTopic"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
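The curl request above can also be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint follows the same `owner/name` path pattern for other repositories, which this page does not confirm. The helper names `build_quality_url` and `fetch_quality` are hypothetical. Because the response format is not described here, the sketch returns the raw body as text rather than parsing it.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_quality_url(repo: str) -> str:
    """Return the endpoint URL for an 'owner/name' repository slug.

    Assumption: other repositories use the same path layout as the
    MaartenGr/BERTopic example.
    """
    owner, _, name = repo.partition("/")
    if not owner or not name:
        raise ValueError(f"expected 'owner/name', got {repo!r}")
    # Percent-encode each path segment so unusual characters stay inside it.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_quality(repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (performs a network request)."""
    with urllib.request.urlopen(build_quality_url(repo), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

Calling `fetch_quality("MaartenGr/BERTopic")` issues the same GET request as the curl command. Keyless requests count against the 100 requests/day limit, so it is worth caching the result when polling.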
Related models
webis-de/small-text
Active Learning for Text Classification in Python
mead-ml/mead-baseline
Deep-Learning Model Exploration and Development for NLP
x-tabdeveloping/turftopic
Robust and fast topic models with sentence-transformers.
HumanSignal/label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
hiyouga/Dual-Contrastive-Learning
Code for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation"