SpursGoZmy/Tabular-LLM
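This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it, thereby strengthening LLMs' understanding of tabular data and ultimately building large language models dedicated to table intelligence tasks.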
This project helps researchers and developers enhance the ability of large language models (LLMs) to understand and interact with tabular data. It collects various open-source datasets related to table-based tasks like question answering or text generation from tables. The project then processes this raw data into a format suitable for fine-tuning LLMs, resulting in specialized models that can better process complex table structures and answer questions about their content.
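As a rough illustration of what "processing raw data into a format suitable for fine-tuning" can look like, the sketch below turns one table question-answering sample into an instruction-style record. The field names (`instruction`, `input`, `output`), the Markdown table linearization, and the sample data are all assumptions made for illustration; the page does not state which schema or serialization the project actually uses.

```python
import json


def table_to_markdown(header, rows):
    """Linearize a table (header plus rows) into a Markdown string."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def to_instruction_record(header, rows, question, answer):
    """Build one instruction-tuning record (hypothetical schema)."""
    return {
        "instruction": "Answer the question using the table below.",
        "input": f"{table_to_markdown(header, rows)}\n\nQuestion: {question}",
        "output": answer,
    }


# Made-up sample, not taken from any dataset in the project.
record = to_instruction_record(
    header=["City", "Population"],
    rows=[["Springfield", 30000], ["Shelbyville", 25000]],
    question="Which city has the larger population?",
    answer="Springfield",
)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Serializing the table as text is only one option; HTML or CSV linearizations fit the same record structure.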
644 stars. No commits in the last 6 months.
Use this if you are a researcher or developer looking for high-quality, pre-processed datasets and fine-tuned models to improve LLMs' performance on tasks involving tables, especially for specific vertical domains.
Not ideal if you are an end-user simply looking for an off-the-shelf tool to analyze or generate content from tables without needing to delve into model training or data preparation.
Stars: 644
Forks: 44
Language: not listed
License: not listed
Category:
Last pushed: Apr 22, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/SpursGoZmy/Tabular-LLM"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
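The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns a JSON body, and that other repositories follow the same `owner/repo` path pattern as the URL shown above. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner, repo):
    """Build the endpoint URL (assumes an owner/repo path pattern)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON; the page does not say so.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("SpursGoZmy", "Tabular-LLM")
```

With the keyless limit of 100 requests/day, caching responses locally is a sensible precaution when polling many repositories.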
Higher-rated alternatives
monarch-initiative/ontogpt
LLM-based ontological extraction tools, including SPIRES
weAIDB/awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
AXYZdong/AMchat
AM (Advanced Mathematics) Chat is a large language model that integrates advanced mathematical...
skywalker023/sodaverse
🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with...
Y-Research-SBU/TimeSeriesScientist
Official Repository for TimeSeriesScientist