GUNDAM-Labet/GUNDAM

GUNDAM is a data management system that prioritizes data using language models.

Score: 46 / 100 (Emerging)

This tool helps data scientists and ML engineers manage large collections of text data used to train or fine-tune language models. It takes an existing text corpus and an associated language model and identifies the most essential and informative data samples (a "golden plug-in set"). Demonstration retrievers can then draw high-quality examples for language model tasks from this golden set instead of searching the full corpus.
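To make the idea of a golden set concrete, here is a minimal sketch of picking a small high-value subset from a corpus. It is not GUNDAM's algorithm or API. `select_golden_set` is an illustrative name, and `distinct_token_count` is a toy stand-in for whatever model-derived informativeness score a real system would use.

```python
from typing import Callable, List


def distinct_token_count(text: str) -> float:
    """Toy score (assumption): number of distinct whitespace-separated tokens."""
    return float(len(set(text.split())))


def select_golden_set(
    corpus: List[str],
    score_fn: Callable[[str], float],
    k: int,
) -> List[str]:
    """Return the k highest-scoring samples from the corpus."""
    if k <= 0:
        return []
    # sorted() is stable, so samples with equal scores keep their corpus order.
    ranked = sorted(corpus, key=score_fn, reverse=True)
    return ranked[:k]


corpus = ["a a a", "the cat sat", "hello world", "x"]
golden = select_golden_set(corpus, distinct_token_count, k=2)
```

A retriever would then search only `golden` rather than the full `corpus`. In practice the score would come from the language model rather than from a token count.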

189 stars. No commits in the last 6 months.

Use this if you need to efficiently identify the most valuable text data samples for training or serving language models, especially when dealing with continually growing datasets.

Not ideal if your primary goal is to manage non-textual data or if you are not working with large language models.

Tags: data-curation, language-model-training, text-data-management, machine-learning-engineering, data-efficiency
Status: Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 189
Forks: 32
Language: Python
License: Apache-2.0
Last pushed: Aug 02, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/GUNDAM-Labet/GUNDAM"

The API is open to everyone: 100 requests/day without a key, or 1,000/day with a free key.
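The same request can be sketched in Python with only the standard library. The response schema is not described on this page, so the sketch assumes the endpoint returns JSON and simply decodes it. `quality_url` and `fetch_quality` are hypothetical helper names, and the URL pattern is inferred from the curl example above.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; the owner/repo suffix is an inferred pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository; assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("GUNDAM-Labet", "GUNDAM")` performs one network request, which counts toward the daily request limit.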