DFKI-NLP/MultiTACRED
[ACL23] This repository contains the code for our paper "MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset"
This project helps natural language processing researchers and developers extend their relation extraction models to multiple languages. It takes the TACRED dataset, which consists of English sentences annotated with entity mentions and the relations between them, and machine-translates it into 12 languages using the DeepL or Google Translate APIs. The output is a multilingual dataset suitable for training and evaluating models.
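Translating an annotated sentence raises one practical question: how do the entity positions survive when word order changes? The following is a minimal sketch of one common approach, not necessarily the repository's own code. It wraps the two entity spans in marker tags before translation and reads the spans back afterwards. The tag names, the helper functions, and the `translate` callable (a stand-in for a DeepL or Google Translate call) are all hypothetical. The field names follow the TACRED JSON format, spans are assumed to be inclusive and non-overlapping, and whitespace tokenization of the translated text is a simplification.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical marker tags for the head (subject) and tail (object) entities.
HEAD_OPEN, HEAD_CLOSE = "<H>", "</H>"
TAIL_OPEN, TAIL_CLOSE = "<T>", "</T>"
TAG_PATTERN = re.compile(r"(</?[HT]>)")

Span = Tuple[int, int]  # inclusive (start, end) token indices


def mark_entities(tokens: List[str], head: Span, tail: Span) -> str:
    """Join tokens into a string with the head and tail spans wrapped in tags."""
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if i == head[0]:
            out.append(HEAD_OPEN)
        if i == tail[0]:
            out.append(TAIL_OPEN)
        out.append(tok)
        if i == head[1]:
            out.append(HEAD_CLOSE)
        if i == tail[1]:
            out.append(TAIL_CLOSE)
    return " ".join(out)


def unmark_entities(text: str) -> Tuple[List[str], Span, Span]:
    """Strip the tags and recover tokens plus inclusive head and tail spans."""
    tokens: List[str] = []
    starts: Dict[str, int] = {}
    spans: Dict[str, Span] = {}
    for piece in TAG_PATTERN.split(text):
        piece = piece.strip()
        if not piece:
            continue
        if piece in (HEAD_OPEN, TAIL_OPEN):
            starts[piece[1]] = len(tokens)
        elif piece in (HEAD_CLOSE, TAIL_CLOSE):
            key = piece[2]
            if key in starts and len(tokens) > starts[key]:
                spans[key] = (starts[key], len(tokens) - 1)
        else:
            # Simplification: split the translated text on whitespace.
            tokens.extend(piece.split())
    if "H" not in spans or "T" not in spans:
        # The translator dropped or mangled a tag; the example cannot be used.
        raise ValueError("entity markers were lost during translation")
    return tokens, spans["H"], spans["T"]


def translate_example(example: dict, translate: Callable[[str], str]) -> dict:
    """Translate one TACRED-style example; `translate` is a stand-in callable."""
    marked = mark_entities(
        example["token"],
        (example["subj_start"], example["subj_end"]),
        (example["obj_start"], example["obj_end"]),
    )
    tokens, head, tail = unmark_entities(translate(marked))
    return {
        **example,
        "token": tokens,
        "subj_start": head[0],
        "subj_end": head[1],
        "obj_start": tail[0],
        "obj_end": tail[1],
    }
```

Passing the translation function in as an argument keeps the sketch independent of any particular translation service. Examples whose markers do not survive translation raise a `ValueError`, so they can be logged and skipped rather than silently mislabeled.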
No commits in the last 6 months.
Use this if you are developing or researching multilingual relation extraction models and need a robust, standardized dataset translated into various languages.
Not ideal if you are looking for a pre-trained multilingual model or a simple API to extract relations in different languages without needing to train your own models.
Stars: 10
Forks: not listed
Language: Python
License: MIT
Category: not listed
Last pushed: Oct 16, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/DFKI-NLP/MultiTACRED"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
DerwenAI/pytextrank
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
Tiiiger/bert_score
BERT score for text generation
BrikerMan/Kashgari
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for...
asyml/texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. ...
yohasebe/wp2txt
A command-line tool to extract plain text from Wikipedia dumps with category and section filtering