Structured Data Inference NLP Tools
Datasets and benchmarks for NLI, table understanding, text-to-SQL, and instruction-following tasks involving structured or semi-structured data. Does NOT include general sentiment analysis, classification tasks without structured reasoning components, or commonsense knowledge resources without explicit inference evaluation.
There are 78 structured data inference tools tracked. The highest-rated is ymcui/cmrc2018 at 49/100 with 451 stars.
Get all 78 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=structured-data-inference&limit=20"
The API is open to everyone: 100 requests/day with no key, or 1,000 requests/day with a free key.
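The same request can be made from a script. The following is a minimal Python sketch that rebuilds the URL from the curl command above and parses the response as JSON. The helper names `build_url` and `fetch_json` are hypothetical, chosen for illustration. The shape of the returned JSON is not documented here, so the code makes no assumption about its fields. Only the three query parameters shown in the curl command are used.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="structured-data-inference", limit=20):
    """Build the request URL from the query parameters seen in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=30):
    """Send an unauthenticated GET request and decode the body as JSON.

    Assumption: the endpoint returns a JSON document; its structure is
    not described on this page, so it is returned unmodified.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(build_url())` performs the request and counts against the daily limit. Whether a larger `limit` value returns all 78 projects in one response is not stated here and would need to be checked against the API itself.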
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | ymcui/cmrc2018: A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018) | 49/100 | Emerging |
| 2 | princeton-nlp/DensePhrases: [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021:... | | Emerging |
| 3 | thunlp/MultiRD: Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model" | | Emerging |
| 4 | IndexFziQ/KMRC-Papers: A list of recent papers regarding knowledge-based machine reading comprehension. | | Emerging |
| 5 | danqi/rc-cnn-dailymail: CNN/Daily Mail Reading Comprehension Task | | Emerging |
| 6 | intfloat/SimKGC: ACL 2022, SimKGC: Simple Contrastive Knowledge Graph Completion with... | | Emerging |
| 7 | declare-lab/CIDER: This repository contains the dataset and the pytorch implementations of the... | | Emerging |
| 8 | ShiZhengyan/StepGame: [AAAI 2022] Dataset and pytorch codes for the paper titled "StepGame: A New... | | Emerging |
| 9 | zjunlp/MKG_Analogy: [ICLR 2023] Multimodal Analogical Reasoning over Knowledge Graphs | | Emerging |
| 10 | maastrichtlawtech/gdsr: 🕸️ A graph-augmented dense statute retriever. (EACL 2023) | | Emerging |
| 11 | shmsw25/AmbigQA: An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous... | | Emerging |
| 12 | IndexFziQ/MSMARCO-MRC-Analysis: Analysis on the MS-MARCO leaderboard regarding the machine reading... | | Emerging |
| 13 | GeekDream-x/IDOL: Repo for paper "IDOL: Indicator-oriented Logic Pre-training for Logical... | | Emerging |
| 14 | utahnlp/knowledge_infotabs: Repository containing code for the NAACL 2021 paper (Incorporating External... | | Emerging |
| 15 | yuweihao/reclor: Code for "ReClor: A Reading Comprehension Dataset Requiring Logical... | | Emerging |
| 16 | XingLuxi/KMRC-Research-Archive: 🗂 Research about Knowledge-based Machine Reading Comprehension | | Emerging |
| 17 | phanxuanphucnd/Active-learning-in-NLP: Active learning in NLP | | Emerging |
| 18 | FeiWang96/GTR: [SIGIR 2021] Retrieving Complex Tables with Multi-Granular Graph... | | Emerging |
| 19 | webis-de/acl22-revisiting-uncertainty-based-query-strategies-for-active-learning-with-transformers: Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers | | Emerging |
| 20 | anshitag/memit_csk: Source repository for Editing Common Sense in Transformers (EMNLP 2023) | | Emerging |
| 21 | amazon-science/pizza-semantic-parsing-dataset: The PIZZA dataset continues the exploration of task-oriented parsing by... | | Emerging |
| 22 | marceljahnke/negative-cache: PyTorch Implementation of the Paper "Efficient Training of Retrieval Models... | | Emerging |
| 23 | amazon-science/wqa-multi-sentence-inference: This repository contains code used for our Multi Sentence Inference NAACL'22 paper. | | Emerging |
| 24 | ymcui/expmrc: ExpMRC: Explainability Evaluation for Machine Reading Comprehension | | Emerging |
| 25 | sherlcok314159/ChineseMRC-Data: A collection of the extractive MRC datasets available so far for Chinese | | Emerging |
| 26 | thunlp/CokeBERT: CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced... | | Emerging |
| 27 | acidAnn/semeval2022_task7_starter_kit: 💡 Starter kit for SemEval 2022 Task 7: Identifying Plausible... | | Emerging |
| 28 | humanlab/rare-class-AL: AL for rare class strategies compared in the paper "Transfer and Active... | | Emerging |
| 29 | ict-bigdatalab/CorpusBrain: CIKM 2022: CorpusBrain: Pre-train a Generative Retrieval Model for... | | Emerging |
| 30 | USSiamaboat/polytuplet-loss: A Reverse Approach to Training Reading Comprehension and Logical Reasoning Models | | Emerging |
| 31 | ai-systems/tg2022task_premise_retrieval: TextGraphs Shared Task on Natural Language Premise Selection | | Emerging |
| 32 | Jordy-VL/uncertainty-bench: Code repository for Benchmarking Scalable Predictive Uncertainty in Text... | | Emerging |
| 33 | Dibyakanti/AutoTNLI-code: This repository contains the official code for the paper : Realistic Data... | | Emerging |
| 34 | psunlpgroup/XSemPLR: Data and code for ACL 2023 paper XSemPLR: Cross-Lingual Semantic Parsing in... | | Experimental |
| 35 | testzer0/AmbiQT: Code and Assets for "Benchmarking and Improving Text-to-SQL Generation Under... | | Experimental |
| 36 | pietrolesci/anchoral: This is the official PyTorch implementation for our NAACL 2024 paper:... | | Experimental |
| 37 | ZeinabAghahadi/Syllogistic-Commonsense-Reasoning: Deductive Commonsense Reasoning | | Experimental |
| 38 | krystalan/Multi-hopRC: 📔 notes for Multi-hop Reading Comprehension... | | Experimental |
| 39 | minnesotanlp/infoVerse: Jaehyung Kim et al's ACL 2023 paper on "infoVerse: A Universal Framework for... | | Experimental |
| 40 | Pzoom522/xANLG: Data and code for "Understanding Linearity of Cross-Lingual Word Embedding... | | Experimental |
| 41 | cognitiveailab/tg2021task: Participant Kit for the TextGraphs-15 Shared Task on Explanation Regeneration | | Experimental |
| 42 | INK-USC/RiddleSense: RiddleSense: Reasoning about Riddle Questions Featuring Linguistic... | | Experimental |
| 43 | phosseini/GisPy: GisPy: A Tool for Measuring Gist Inference Score in Text... | | Experimental |
| 44 | THU-KEG/COPEN: The official code and dataset for EMNLP 2022 paper "COPEN: Probing... | | Experimental |
| 45 | MultimodalGeo/GeoText-1652: An official repo for ECCV 2024 Towards Natural Language-Guided Drones:... | | Experimental |
| 46 | ZhengZixiang/MRCPapers: Worth-reading paper list and other awesome resources on Machine Reading... | | Experimental |
| 47 | mariomeissner/AmbiNLI: This is the code for the paper "Embracing Ambiguity: Shifting the Training... | | Experimental |
| 48 | MSR-LIT/Splash: Release of SPLASH: Dataset for semantic parse correction with natural... | | Experimental |
| 49 | yul091/UnBED: Codebase for the ACL 2023 paper: "Uncertainty-Aware Bootstrap Learning for... | | Experimental |
| 50 | rycolab/evidence-probing: Code and data for the ACL 2022 paper "Probing as Quantifying Inductive Bias". | | Experimental |
| 51 | semeval-2026-kclarity/clarity: Code release for KCLarity at SemEval-2026 Task 6: Encoder and Zero-Shot... | | Experimental |
| 52 | Advancing-Machine-Human-Reasoning-Lab/transformer-psychometrics: Code to reproduce experiments in our *SEM 2021 Paper | | Experimental |
| 53 | Raising-hrx/MetGen: An implementation for MetGen: A Module-Based Entailment Tree Generation... | | Experimental |
| 54 | maastrichtlawtech/fusion: 🔗 Hybrid retrieval in the legal domain | | Experimental |
| 55 | salesforce/FewXC: Official code and data release for Efficiently Aligned Cross-Lingual... | | Experimental |
| 56 | megagonlabs/xatu: 🕊️ Code and Data for XATU: A Fine-grained Instruction-based Benchmark for... | | Experimental |
| 57 | nlp-waseda/dcsg-ja: Dialogue Commonsense Graph in Japanese | | Experimental |
| 58 | megagonlabs/ambignlg: 🐶 Data for AmbigNLG: Addressing Task Ambiguity in Instruction for NLG... | | Experimental |
| 59 | naver/ms-marco-shift: A Fine-Grained Analysis of Distribution Shifts in MSMARCO (MS-Shift).... | | Experimental |
| 60 | fajri91/discourse_probing: Discourse Probing of Pretrained Language Models. In Proceedings of NAACL 2021. | | Experimental |
| 61 | Nativeatom/FRoG: Fuzzy reasoning of Generalized Quantifiers (EMNLP 2024) | | Experimental |
| 62 | XInfoTabS/dataset: The Official dataset for "XINFOTABS: Evaluating Multilingual Tabular Natural... | | Experimental |
| 63 | INK-USC/ER-Test: Code for ER-Test, accepted to the Findings of EMNLP 2022 | | Experimental |
| 64 | amazon-science/resource-constrained-naturalized-semantic-parsing: This repository is made public for reproducibility of our recent work on... | | Experimental |
| 65 | zhengyima/Anchors: Source code of CIKM2021 Paper 'Pre-training for Ad-hoc Retrieval: Hyperlink... | | Experimental |
| 66 | LaVi-Lab/C2LEVA: [Findings of ACL 2025] "C2LEVA: Toward Comprehensive and Contamination-Free... | | Experimental |
| 67 | gianluigilopardo/anchors_text_theory: Code for the paper "A Sea of Words: An In-Depth Analysis of Anchors for Text... | | Experimental |
| 68 | IndexFziQ/IIE-NLP-Eyas-SemEval2021: Code of IIE-NLP-Eyas Team for ReCAM (Task 4) @SemEval2021... | | Experimental |
| 69 | Nativeatom/PRESQUE: The repository for "Pragmatic Reasoning Unlocks Quantifier Semantics for... | | Experimental |
| 70 | HKUST-KnowComp/atomic-conceptualization: Code and data for the paper Acquiring and Modelling Abstract Commonsense... | | Experimental |
| 71 | dyan-dy/Baidu-LIC2021-MRC: models and codes for baiduAI LIC 2021 MRC tasks, based on paddlenlp | | Experimental |
| 72 | collapseindex/ci-curation: CI-Guided Data Curation: Using prediction instability to detect label noise.... | | Experimental |
| 73 | RishiHazra/Actively-reducing-redundancies-in-Active-Learning-for-Sequence-Tagging: Active Learning for sequence tagging | | Experimental |
| 74 | Lizhecheng02/DRS: [ACL 2025] Repository for our paper "DRS: Deep Question Reformulation With... | | Experimental |
| 75 | Info-Sync/InfoSync: Implementation of the semi-structured inference model in our ACL 2023 paper:... | | Experimental |
| 76 | putmanmodel/putman-model-paper: Preprint + pseudocode for the PUTMAN Model (relational meaning graphs,... | | Experimental |
| 77 | rbhubert/recall: Tool for the recovery of relevant information through classification in an... | | Experimental |
| 78 | trailerAI/KoDPR: Korean Dense Passage Retrieval (KoDPR) | | Experimental |