davidsbatista/Snowball
An implementation, with some extensions, of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000).
This tool helps researchers, analysts, or anyone working with large collections of unstructured text to automatically find and extract specific relationships between entities. You provide raw text in which organizations, locations, or people are already identified, along with a few seed examples of the relationship you're looking for (e.g., 'Google is headquartered in Mountain View'). The output is a structured list of every relationship found, such as 'Medtronic is based in Minneapolis', each with a confidence score.
Use this if you need to systematically identify and extract a particular type of relationship from a vast amount of text without manually reading through everything.
Not ideal if you're dealing with structured data, only have a small amount of text, or need to extract a wide variety of relationship types without providing any initial examples.
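The workflow described above (entity-tagged text plus seed examples in, scored relationships out) can be illustrated with a minimal sketch. This is not the repository's code or API: the `<ORG>`/`<LOC>` tag format, the `extract_relations` function, and the exact-match treatment of the text between two entities are all assumptions made for illustration. The Snowball paper compares contexts by similarity rather than exact match, and repeats the process over several iterations; the sketch performs a single pass.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) tag format: <ORG>...</ORG> and <LOC>...</LOC>.
PAIR = re.compile(r"<ORG>(.+?)</ORG>(.+?)<LOC>(.+?)</LOC>")


def extract_relations(sentences, seeds):
    """Single bootstrapping pass: learn contexts from seeds, then apply them.

    `seeds` is a set of (organization, location) pairs known to be correct.
    Returns {(organization, location): confidence} for newly found pairs.
    """
    # Collect (org, context, loc) triples; the context is the text
    # between the two entities, lower-cased and stripped.
    triples = []
    for sentence in sentences:
        for org, between, loc in PAIR.findall(sentence):
            triples.append((org, between.strip().lower(), loc))

    # A context becomes a pattern if it connects at least one seed pair.
    patterns = {ctx for org, ctx, loc in triples if (org, loc) in seeds}

    # Score patterns: for organizations that appear in the seeds, a match
    # with the seed location is positive, any other location is negative.
    seed_location = dict(seeds)
    positive, negative = defaultdict(int), defaultdict(int)
    for org, ctx, loc in triples:
        if ctx in patterns and org in seed_location:
            if seed_location[org] == loc:
                positive[ctx] += 1
            else:
                negative[ctx] += 1
    confidence = {
        ctx: positive[ctx] / (positive[ctx] + negative[ctx]) for ctx in patterns
    }

    # Apply the scored patterns to find pairs that are not already seeds.
    found = {}
    for org, ctx, loc in triples:
        if ctx in confidence and (org, loc) not in seeds:
            found[(org, loc)] = max(found.get((org, loc), 0.0), confidence[ctx])
    return found


sentences = [
    "<ORG>Google</ORG> is headquartered in <LOC>Mountain View</LOC>.",
    "<ORG>Medtronic</ORG> is headquartered in <LOC>Minneapolis</LOC>.",
    "<ORG>Google</ORG> opened an office in <LOC>London</LOC>.",
]
seeds = {("Google", "Mountain View")}
print(extract_relations(sentences, seeds))
# prints {('Medtronic', 'Minneapolis'): 1.0}
```

The third sentence is ignored because its context ("opened an office in") never connects a seed pair, which is why a handful of good seeds matters: they determine which contexts are trusted.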
Stars: 178
Forks: 39
Language: Python
License: GPL-3.0
Category:
Last pushed: Mar 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/davidsbatista/Snowball"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
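The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the URL layout (`/quality/<category>/<owner>/<repo>`) is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred, not documented."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository, assuming a JSON response body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("nlp", "davidsbatista", "Snowball"))
```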
Related tools
davidsbatista/BREDS
"Bootstrapping Relationship Extractors with Distributional Semantics" (Batista et al., 2015) in...
nicolay-r/AREkit
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing...
plkmo/BERT-Relation-Extraction
PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
thunlp/FewRel
A Large-Scale Few-Shot Relation Extraction Dataset
yuhaozhang/tacred-relation
PyTorch implementation of the position-aware attention model for relation extraction