LLM Data Labeling Transformer Models
There are 11 LLM data labeling models tracked. One scores above 50 (Established tier). The highest-rated is allenai/dolma at 63/100 with 1,447 stars.
Get all 11 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-data-labeling&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
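The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and because the shape of the JSON response is not documented here, the code returns the decoded payload as-is rather than assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="llm-data-labeling", limit=20):
    """Build the query URL; the defaults mirror the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object
    is returned without inspecting its structure.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, counts against the daily quota):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2))
```

With the default arguments, `build_url()` produces the same URL as the curl command. Unauthenticated calls share the 100 requests/day limit, so caching the result locally is a reasonable choice when iterating on a script.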
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | allenai/dolma: Data and tools for generating and inspecting OLMo pre-training data. | 63 | Established |
| 2 | waikato-llm/llm-dataset-converter: For converting LLM datasets from one format into another. | | Emerging |
| 3 | refuel-ai/autolabel: Label, clean and enrich text datasets with LLMs. | | Emerging |
| 4 | niclasgriesshaber/llm_patent_pipeline: LLMs for Historical Dataset Construction from Archival Image Scans | | Emerging |
| 5 | cgxjdzz/FeatureForge-LLM: FeatureForge LLM is a Python package that leverages large language models... | | Emerging |
| 6 | codeastra2/llm-feat: Automated feature engineering using Large Language Models (LLMs) for tabular data | | Emerging |
| 7 | ywn7/llm-data-normalization-pattern: 🔧 Normalize data intelligently with this serverless pattern leveraging LLMs,... | | Experimental |
| 8 | qubasehq/qudata: A comprehensive LLM data processing system designed to transform raw... | | Experimental |
| 9 | waikato-llm/llm-dataset-converter-all: Meta-library that combines all llm-dataset-converter libraries. | | Experimental |
| 10 | gitEricsson/NuCore: This model converts diverse risk registers (Excel and PDF formats) into... | | Experimental |
| 11 | kpmainali/species-knowledge-base: Multi-LLM pipeline for validated extraction and structuring of species-level... | | Experimental |