gersteinlab/Struc-Bench
[NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naacl-short.2/
This project helps researchers and developers evaluate how well large language models generate complex, structured data in various formats. You provide test data in JSON, and it produces generated outputs (tables, HTML, or LaTeX) along with scores indicating generation quality. It's for anyone working with Large Language Models who needs to benchmark a model's ability to create structured text.
No commits in the last 6 months.
Use this if you are a machine learning researcher or developer evaluating the performance of Large Language Models (LLMs) in generating structured tabular data.
Not ideal if you are looking for a tool to generate production-ready structured data directly, as this is primarily an evaluation and benchmarking framework.
Stars: 55
Forks: 7
Language: Python
License: —
Category: —
Last pushed: Jul 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/gersteinlab/Struc-Bench"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
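The endpoint follows a simple path pattern, so the same call works for any repository by swapping the owner/repo segment. A minimal sketch (base path copied from the curl example above; the variable names are illustrative, and no response schema is assumed):

```shell
# Construct the quality-API URL for any repository.
# Only the trailing owner/repo segment changes between repos.
owner="gersteinlab"
repo="Struc-Bench"
url="https://pt-edge.onrender.com/api/v1/quality/transformers/${owner}/${repo}"
echo "$url"

# Anonymous access allows 100 requests/day:
# curl "$url"
```

With a free key, the same request would carry the key according to whatever auth scheme the API documents; that detail is not shown on this page, so it is omitted here.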
Higher-rated alternatives
ExtensityAI/symbolicai
A neurosymbolic perspective on LLMs
TIGER-AI-Lab/MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding...
deep-symbolic-mathematics/LLM-SR
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation...
microsoft/interwhen
A framework for verifiable reasoning with language models.
zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language...