hirupert/sede

Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data


Emerging

This dataset is for machine learning researchers building text-to-SQL systems, which convert natural language questions into SQL queries. It provides over 12,000 real-world SQL queries from the Stack Exchange Data Explorer, each paired with its natural language description.
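As a rough illustration of how description/query pairs like these might be read for training or evaluation, here is a minimal sketch. It assumes the records are stored as JSON Lines and that the description and query live in fields named `Title` and `QueryBody`; both the layout and the field names are assumptions that this page does not confirm, and `iter_pairs` is a hypothetical helper, not part of the repository.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_pairs(
    lines: Iterable[str],
    text_field: str = "Title",      # assumed field name, not confirmed by this page
    sql_field: str = "QueryBody",   # assumed field name, not confirmed by this page
) -> Iterator[Tuple[str, str]]:
    """Yield (description, sql) pairs from JSON Lines records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        text = record.get(text_field)
        sql = record.get(sql_field)
        if text and sql:
            # keep only records that carry both halves of the pair
            yield text, sql


# Made-up sample records for illustration only; not taken from the dataset.
sample = [
    '{"Title": "Count all users", "QueryBody": "SELECT COUNT(*) FROM Users"}',
    "",
    '{"Title": "Missing query"}',
]
pairs = list(iter_pairs(sample))
print(pairs)
```

To read from disk, pass an open file handle instead of the `sample` list; the field names can be overridden through the two keyword arguments if the actual files use different keys.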

104 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher training or evaluating models that translate complex, real-world natural language questions into executable SQL queries.

Not ideal if you are looking for a tool to generate SQL from natural language without needing to train or develop your own underlying machine learning model.

natural-language-processing text-to-sql dataset-creation semantic-parsing machine-learning-research

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 17 / 25


Stars: 104

Forks

Language: Jupyter Notebook

License: Apache-2.0

Higher-rated alternatives

chakki-works/sumeval

Well-tested, multi-language evaluation framework for text summarization.

zhang17173/Event-Extraction

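Event extraction from legal judgment documents and its applications, including word segmentation, part-of-speech tagging, named entity recognition, event element extraction, and judgment outcome prediction.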

wasiahmad/paraphrase_identification

Examine two sentences and determine whether they have the same meaning.

thuiar/TEXTOIR

TEXTOIR is the first open-source toolkit for text open intent recognition. (ACL 2021)

artitw/BERT_QA

Accelerating the development of question-answering systems based on BERT and TF 2.0
