LLM Data Labeling Transformer Models

There are 11 LLM data labeling models tracked. One scores above 50 (the Established tier). The highest-rated is allenai/dolma at 63/100 with 1,447 stars.

Get all 11 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-data-labeling&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
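The curl request above can also be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the result is returned as parsed without assuming any field names. No API key is sent, so the keyless daily limit would apply.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and query parameters are copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="llm-data-labeling", limit=20):
    """Assemble the request URL with the same query string as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(timeout=10.0):
    """Fetch the listing and return the decoded JSON as-is.

    The response schema is an unknown here, so no keys are accessed.
    """
    with urlopen(build_url(), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one request against the daily quota; printing the result with `json.dumps(result, indent=2)` is a reasonable first step for inspecting the actual structure.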

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | allenai/dolma | Data and tools for generating and inspecting OLMo pre-training data. | 63 | Established |
| 2 | waikato-llm/llm-dataset-converter | For converting LLM datasets from one format into another. | 45 | Emerging |
| 3 | refuel-ai/autolabel | Label, clean and enrich text datasets with LLMs. | 44 | Emerging |
| 4 | niclasgriesshaber/llm_patent_pipeline | LLMs for Historical Dataset Construction from Archival Image Scans | 36 | Emerging |
| 5 | cgxjdzz/FeatureForge-LLM | FeatureForge LLM is a Python package that leverages large language models... | 32 | Emerging |
| 6 | codeastra2/llm-feat | Automated feature engineering using Large Language Models (LLMs) for tabular data | 31 | Emerging |
| 7 | ywn7/llm-data-normalization-pattern | 🔧 Normalize data intelligently with this serverless pattern leveraging LLMs,... | 21 | Experimental |
| 8 | qubasehq/qudata | A comprehensive LLM data processing system designed to transform raw... | 19 | Experimental |
| 9 | waikato-llm/llm-dataset-converter-all | Meta-library that combines all llm-dataset-converter libraries. | 17 | Experimental |
| 10 | gitEricsson/NuCore | This model converts diverse risk registers (Excel and PDF formats) into... | 16 | Experimental |
| 11 | kpmainali/species-knowledge-base | Multi-LLM pipeline for validated extraction and structuring of species-level... | 13 | Experimental |
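The summary figures at the top of the page (11 models, one above 50, allenai/dolma highest at 63) can be re-derived from the ranking above. The sketch below hard-codes the scores and tiers as listed; the `summarize` helper and the 50-point default threshold parameter are illustrative names, and the per-tier score ranges it reports are only what this listing shows, not official tier cut-offs.

```python
# (model, score, tier) tuples transcribed from the ranking above.
ROWS = [
    ("allenai/dolma", 63, "Established"),
    ("waikato-llm/llm-dataset-converter", 45, "Emerging"),
    ("refuel-ai/autolabel", 44, "Emerging"),
    ("niclasgriesshaber/llm_patent_pipeline", 36, "Emerging"),
    ("cgxjdzz/FeatureForge-LLM", 32, "Emerging"),
    ("codeastra2/llm-feat", 31, "Emerging"),
    ("ywn7/llm-data-normalization-pattern", 21, "Experimental"),
    ("qubasehq/qudata", 19, "Experimental"),
    ("waikato-llm/llm-dataset-converter-all", 17, "Experimental"),
    ("gitEricsson/NuCore", 16, "Experimental"),
    ("kpmainali/species-knowledge-base", 13, "Experimental"),
]


def summarize(rows, threshold=50):
    """Recompute the headline numbers and the observed score range per tier."""
    top_name, top_score, _ = max(rows, key=lambda row: row[1])
    above = sum(1 for _, score, _ in rows if score > threshold)

    # Observed (min, max) score per tier; not an official tier definition.
    tier_ranges = {}
    for _, score, tier in rows:
        low, high = tier_ranges.get(tier, (score, score))
        tier_ranges[tier] = (min(low, score), max(high, score))

    return {
        "total": len(rows),
        "above_threshold": above,
        "top": top_name,
        "top_score": top_score,
        "tier_ranges": tier_ranges,
    }
```

Running `summarize(ROWS)` reproduces the counts in the opening paragraph and shows that, in this listing, Emerging scores span 31 to 45 and Experimental scores span 13 to 21.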