bloomberg/fast-noise-aware-topic-clustering

Research code and scripts for the EMNLP 2021 paper "FANATIC: FAst Noise-Aware TopIc Clustering" (Silburt et al., 2021).

Score: 33 / 100 (Emerging)

FANATIC helps you find hidden topics within large collections of text, even when your data is noisy or includes many irrelevant documents. It takes raw text data, like social media posts, and outputs clearly defined topic clusters, along with summaries of what each cluster contains. This is ideal for researchers or analysts who need to make sense of unstructured text.

No commits in the last 6 months.

Use this if you need to automatically group vast amounts of text into meaningful categories and identify noise or irrelevant content within your dataset.

Not ideal if you need an out-of-the-box solution with a graphical interface for general topic modeling without custom code.

text-analysis social-media-research unstructured-data content-categorization information-discovery
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 12 / 25
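
The four category scores above appear to add up to the overall score. The following is a minimal sketch of that arithmetic, assuming the total is a plain unweighted sum of the categories; the page does not state the formula, and the variable names are illustrative only.

```python
# Category scores as listed on this page; each is out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 5,
    "Maturity": 16,
    "Community": 12,
}

# Assumption: the overall score is the unweighted sum of the four categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")  # 33 / 100
```

Under that assumption, the zero for Maintenance (no recent commits) accounts for a full quarter of the points the project misses.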


Stars 11
Forks 2
Language Python
License Apache-2.0
Last pushed Jul 06, 2023
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/bloomberg/fast-noise-aware-topic-clustering"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
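
The same request as the curl command above can be made from Python with only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the URL path follows a domain/owner/repository pattern (inferred from the single example URL) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(domain: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path layout inferred from the example above)."""
    return f"{BASE_URL}/{domain}/{owner}/{repo}"


def fetch_quality(domain: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(domain, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "bloomberg", "fast-noise-aware-topic-clustering")
# print(json.dumps(data, indent=2))
```

Since the response schema is not documented here, printing the whole payload first is a reasonable way to see which fields are available before relying on any of them.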