hirupert/sede
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
This is a dataset for researchers developing systems that convert natural language questions into SQL queries. It provides over 12,000 real-world SQL queries from Stack Exchange Data Explorer, each paired with its natural language description. The dataset is designed for machine learning researchers working on advanced 'text-to-SQL' models.
104 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher training or evaluating models that translate complex, real-world natural language questions into executable SQL queries.
Not ideal if you are looking for a tool to generate SQL from natural language without needing to train or develop your own underlying machine learning model.
Stars: 104
Forks: 17
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Jun 25, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hirupert/sede"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
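For use from a script, the curl request above might be reproduced in Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, reading the `nlp` path segment as a category is an assumption based on the URL shape, and the response is assumed to be JSON since its format is not described here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL following the pattern of the curl example.

    Assumption: the path is <category>/<owner>/<repo>, inferred only from
    the single URL shown above.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and parse the body.

    Assumption: the endpoint returns JSON; the response schema is not
    documented here, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "hirupert", "sede")
# print(data)
```

Keyless requests count against the 100 requests/day limit noted above, so caching the parsed result locally is a reasonable choice when polling more than once.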
Higher-rated alternatives
chakki-works/sumeval
Well-tested, multi-language evaluation framework for text summarization.
zhang17173/Event-Extraction
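Event extraction from legal judgment documents and its applications, including word segmentation, part-of-speech tagging, named entity recognition, event element extraction, and judgment outcome prediction.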
wasiahmad/paraphrase_identification
Examine two sentences and determine whether they have the same meaning.
thuiar/TEXTOIR
TEXTOIR is the first open-source toolkit for text open intent recognition. (ACL 2021)
artitw/BERT_QA
Accelerating the development of question-answering systems based on BERT and TF 2.0