generall/ExtWikilinks
Extended Wikilinks dataset description
This dataset helps natural language processing researchers build more accurate named entity linking systems. It enriches raw text sentences with part-of-speech tags, lemmas, parse tags, and additional entity links beyond those originally available. It is aimed at researchers developing and evaluating named entity linking or disambiguation models.
No commits in the last 6 months.
Use this if you need a large, pre-processed dataset of English sentences with detailed linguistic annotations and extended entity mentions to train or evaluate named entity linking algorithms.
Not ideal if you need a dataset focused on languages other than English or if you require fine-grained annotations for tasks other than named entity linking.
Stars: 15
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Apr 01, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/generall/ExtWikilinks"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
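The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the assumption that the response body is JSON is unverified. How an API key is passed is not stated on this page, so the sketch covers only the keyless request.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the endpoint returns a JSON body. If it does not,
    json.loads raises ValueError and the caller should handle it.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make the request.
    print(build_quality_url("nlp", "generall", "ExtWikilinks"))
```

Since the keyless tier is limited to 100 requests/day, it may be worth caching responses locally rather than calling `fetch_quality` repeatedly for the same repository.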
Higher-rated alternatives
amazon-science/ReFinED
ReFinED is an efficient and accurate entity linking (EL) system.
MartinoMensio/spacy-dbpedia-spotlight
A spaCy wrapper for DBpedia Spotlight
SDM-TIB/falcon2.0
Falcon 2.0 is a joint entity and relation linking tool over Wikidata.
Lucaterre/spacyfishing
A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
dbpedia-spotlight/dbpedia-spotlight
DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text.