davidsbatista/Snowball

An implementation, with some extensions, of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000).

Quality score: 60 / 100 (Established)

This tool helps researchers, analysts, and anyone else working with large collections of unstructured text to automatically find and extract a specific relationship between entities. You provide raw text in which the entities (organizations, locations, or people) are already identified, along with a few seed examples of the relationship you are looking for (e.g., 'Google is headquartered in Mountain View'). The output is a structured list of every relationship instance found, such as 'Medtronic is based in Minneapolis', each with a confidence score.
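The seed-driven workflow described above can be illustrated with a deliberately simplified sketch. It is not the code from the repository: the `<ORG>`/`<LOC>` tag format, the function names, and the scoring rule are all assumptions made for illustration, and exact matching of the text between two entities stands in for the similarity-based pattern matching described in the paper.

```python
import re
from collections import Counter

# Assumed input format (hypothetical): entities wrapped in <ORG> and <LOC> tags.
TAGGED_PAIR = re.compile(r"<ORG>(.+?)</ORG>(.+?)<LOC>(.+?)</LOC>")


def extract_candidates(sentences):
    """Yield (org, loc, context) for each ORG ... LOC pair in the sentences."""
    for sentence in sentences:
        for match in TAGGED_PAIR.finditer(sentence):
            org, between, loc = match.group(1), match.group(2), match.group(3)
            yield org, loc, between.strip().lower()


def bootstrap_once(sentences, seeds):
    """Run one toy bootstrapping round.

    1. Collect the contexts that occur between known seed pairs.
    2. Treat each such context as an extraction pattern.
    3. Return every pair whose context matches a pattern, scored by the
       share of seed occurrences that the pattern accounts for.
    """
    candidates = list(extract_candidates(sentences))
    patterns = Counter(ctx for org, loc, ctx in candidates if (org, loc) in seeds)
    total = sum(patterns.values())
    results = {}
    for org, loc, ctx in candidates:
        if ctx in patterns:
            score = patterns[ctx] / total
            results[(org, loc)] = max(score, results.get((org, loc), 0.0))
    return results


sentences = [
    "<ORG>Google</ORG> is headquartered in <LOC>Mountain View</LOC>.",
    "<ORG>Medtronic</ORG> is headquartered in <LOC>Minneapolis</LOC>.",
    "<ORG>Acme</ORG> sued a supplier from <LOC>Paris</LOC>.",
]
seeds = {("Google", "Mountain View")}
found = bootstrap_once(sentences, seeds)
for pair, score in sorted(found.items()):
    print(pair, round(score, 2))
```

In a full bootstrapping loop, the highest-scoring new pairs would be added to the seed set and the round repeated, which is why a handful of initial examples can be enough to cover a large collection.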

Use this if you need to systematically identify and extract a particular type of relationship from a vast amount of text without manually reading through everything.

Not ideal if you're dealing with structured data, only have a small amount of text, or need to extract a wide variety of relationship types without providing any initial examples.

Tags: information-extraction, text-analysis, market-intelligence, research-automation, data-mining
No package. No dependents.

Maintenance: 13 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 21 / 25

Stars: 178
Forks: 39
Language: Python
License: GPL-3.0
Last pushed: Mar 17, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/davidsbatista/Snowball"

Open to everyone: 100 requests/day without a key. A free key raises the limit to 1,000/day.
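The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path follows the pattern `/quality/<category>/<owner>/<repo>` (inferred from the single URL above), and that the endpoint returns JSON. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, repo):
    """Build the endpoint URL; the path pattern is inferred from one example."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the quality record, assuming the response body is JSON."""
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("nlp", "davidsbatista/Snowball")` would issue the same GET request as the curl command. The response fields are not documented on this page, so inspect the returned object before relying on any particular key.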