bespokelabsai/curator
Synthetic data curation for post-training and structured data extraction
This tool helps machine learning engineers and data scientists efficiently create high-quality synthetic datasets. It takes raw, unstructured information and generates structured data suitable for training or fine-tuning AI models. You can also use it to extract specific structured information from large volumes of text.
1,643 stars. Actively maintained with 9 commits in the last 30 days.
Use this if you need to quickly generate diverse, structured synthetic data to train or enhance your large language models, or if you need to extract specific details from unstructured text at scale.
Not ideal if you're looking for a simple data labeling solution, or if your AI models don't need synthetic data for training or evaluation.
Stars: 1,643
Forks: 136
Language: Python
License: Apache-2.0
Category:
Last pushed: Jan 24, 2026
Commits (30d): 9
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/bespokelabsai/curator"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
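The curl request above can also be made from Python. The following is a minimal sketch using only the standard library. It rests on assumptions the listing does not confirm: that the URL path is laid out as category/owner/repo (inferred from the single example URL) and that the endpoint returns JSON. The names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is assumed
# to follow a category/owner/repo layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL from its three path segments (hypothetical helper)."""
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and decode the body, assuming it is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("prompt-engineering", "bespokelabsai", "curator")` requests the same URL as the curl command. Keyless access is limited to 100 requests/day, so it is worth caching the result rather than refetching in a loop.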