DFKI-NLP/MultiTACRED

[ACL23] This repository contains the code for our paper "MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset"


Experimental

This project helps natural language processing researchers and developers extend relation extraction models to multiple languages. It takes the TACRED dataset, which contains English sentences annotated with entities and the relations between them, and machine-translates it into 12 languages using the DeepL or Google Translate APIs. The output is a multilingual dataset suitable for training and evaluating relation extraction models.
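The translation step can be illustrated with a minimal sketch. It assumes TACRED-style records with `token`, `subj_start`, `subj_end`, `obj_start`, and `obj_end` fields, and it assumes the entity spans are carried through translation by wrapping them in `<H>` and `<T>` tags; the tag names, the helper functions, and the `fake_translate` stand-in are all hypothetical and are not taken from the repository. A real pipeline would call the DeepL or Google Translate API where the stub is used.

```python
import re

# Hypothetical tag names: <H> marks the subject (head), <T> the object (tail).
TAG = re.compile(r"(</?[HT]>)")


def mark_entities(tokens, subj_span, obj_span):
    """Wrap the subject and object token spans (inclusive indices) in tags."""
    out = []
    for i, tok in enumerate(tokens):
        if i == subj_span[0]:
            out.append("<H>")
        if i == obj_span[0]:
            out.append("<T>")
        out.append(tok)
        if i == subj_span[1]:
            out.append("</H>")
        if i == obj_span[1]:
            out.append("</T>")
    return " ".join(out)


def unmark_entities(text):
    """Strip the tags and recover whitespace tokens plus inclusive spans."""
    tokens, starts, spans = [], {}, {}
    for piece in TAG.split(text):
        if TAG.fullmatch(piece):
            name = piece.strip("</>")
            if piece.startswith("</"):
                spans[name] = (starts[name], len(tokens) - 1)
            else:
                starts[name] = len(tokens)
        else:
            tokens.extend(piece.split())
    return tokens, spans["H"], spans["T"]


def translate_example(example, translate):
    """Translate one record; `translate` is any str -> str callable."""
    tagged = mark_entities(
        example["token"],
        (example["subj_start"], example["subj_end"]),
        (example["obj_start"], example["obj_end"]),
    )
    tokens, subj, obj = unmark_entities(translate(tagged))
    return {
        **example,
        "token": tokens,
        "subj_start": subj[0], "subj_end": subj[1],
        "obj_start": obj[0], "obj_end": obj[1],
    }


def fake_translate(text):
    """Stand-in for a real MT API call (assumption: tags survive translation)."""
    table = {
        "<H> Marie Curie </H> was born in <T> Warsaw </T>":
            "<H> Marie Curie </H> wurde in <T> Warschau </T> geboren",
    }
    return table.get(text, text)


example = {
    "token": ["Marie", "Curie", "was", "born", "in", "Warsaw"],
    "subj_start": 0, "subj_end": 1,
    "obj_start": 5, "obj_end": 5,
    "relation": "per:city_of_birth",
}
result = translate_example(example, fake_translate)
```

The relation label is copied unchanged, while the object span moves because the target language reorders the words. A real translation service may drop or misplace tags, so a production version would need to validate the output and discard or repair records where a tag pair is missing.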

No commits in the last 6 months.

Use this if you are developing or researching multilingual relation extraction models and need the TACRED benchmark translated into multiple languages.

Not ideal if you are looking for a pre-trained multilingual model or a simple API to extract relations in different languages without needing to train your own models.

natural-language-processing machine-translation relation-extraction multilingual-data linguistic-research

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 0 / 25



Language

Python

License

MIT

Higher-rated alternatives

DerwenAI/pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction

Tiiiger/bert_score

BERT score for text generation

BrikerMan/Kashgari

Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for...

asyml/texar

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. ...

yohasebe/wp2txt

A command-line tool to extract plain text from Wikipedia dumps with category and section filtering
