hirupert/sede

Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data

Score: 42 / 100 (Emerging)

This is a dataset for machine learning researchers developing text-to-SQL systems, which convert natural language questions into SQL queries. It provides over 12,000 real-world SQL queries from Stack Exchange Data Explorer, each paired with its natural language description.

104 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher training or evaluating models that translate complex, real-world natural language questions into executable SQL queries.

Not ideal if you are looking for a tool to generate SQL from natural language without needing to train or develop your own underlying machine learning model.

Tags: natural-language-processing, text-to-sql, dataset-creation, semantic-parsing, machine-learning-research
Status: stale (6 months), no package, no dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 17 / 25


Stars: 104
Forks: 17
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Jun 25, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hirupert/sede"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
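The same request can be made from a script. The sketch below mirrors the curl call above using only the Python standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the single example URL, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is
# assumed to follow a <category>/<owner>/<repo> pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the one in the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body.

    Assumption: the service returns JSON. The page does not document
    the response schema, so the result is returned without interpretation.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "hirupert", "sede")` issues one request against the keyless 100 requests/day allowance, so caching the result locally is advisable when polling several repositories.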