Glavin001/Data2AITextbook
🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)
This project turns messy, unorganized text documents, such as speeches or internal blogs, into structured learning materials. It takes your raw content and generates lessons, exercises, and assessments in a 'textbook' format. It is for anyone who wants to fine-tune a custom AI model on their own knowledge so that the model answers more accurately on that data.
No commits in the last 6 months.
Use this if you have a lot of unstructured text and want to create a highly effective, specialized AI model that understands and generates information based on your unique expertise.
Not ideal if you're looking for a general-purpose AI or don't have a specific body of text you want an AI to master.
Stars: 25
Forks: 3
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Oct 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Glavin001/Data2AITextbook"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
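The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command above, using only the standard library. The function names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body; the response fields are not documented on this page, so the result is returned as-is without interpreting it.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. Its schema is not described
    on this page, so the parsed value is returned unchanged.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("Glavin001", "Data2AITextbook")` would issue the same GET request as the curl command. Without a key, each call counts against the 100 requests/day limit, so cache results rather than polling.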
Higher-rated alternatives
InternScience/GraphGen
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
timothepearce/synda
A CLI for generating synthetic data
rasinmuhammed/misata
High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized...
ziegler-ingo/CRAFT
[TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset:...
ZhuLinsen/FastDatasets
A powerful tool for creating high-quality training datasets for Large Language Models...