hitz-zentroa/lm-contamination

The LM Contamination Index is a manually created database of contamination evidence for language models (LMs).

Score: 24 / 100 (Experimental)

The index helps NLP researchers and practitioners protect the integrity of their Large Language Model (LLM) evaluations. You look up the name of a dataset you are using to test an LLM, and the index reports whether there is evidence that the LLM may have already seen that dataset during training. It is intended for anyone evaluating or developing LLMs who needs to avoid skewed performance metrics.
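As a rough illustration of that lookup, the sketch below fetches the index from the repository and scans it for a dataset name. The raw-file path (README.md on the main branch) and the substring matching are assumptions for illustration only; check the repository for the actual layout, and note that absence from the index is not proof that a dataset is clean.

# Minimal lookup sketch. ASSUMPTION: the index is readable as plain
# Markdown at this raw-file path; the exact path is hypothetical.
import urllib.request

INDEX_URL = "https://raw.githubusercontent.com/hitz-zentroa/lm-contamination/main/README.md"

def dataset_mentioned(dataset_name):
    # Fetch the index text and do a naive case-insensitive substring check.
    with urllib.request.urlopen(INDEX_URL) as resp:
        index_text = resp.read().decode("utf-8")
    return dataset_name.lower() in index_text.lower()

if __name__ == "__main__":
    name = "MMLU"  # example dataset name
    print(f"'{name}' mentioned in index:", dataset_mentioned(name))

A match only tells you that the dataset is discussed in the index; follow the entry itself to judge the strength of the evidence.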

No commits in the last 6 months.

Use this if you are an NLP researcher or practitioner evaluating an LLM and want to check whether your benchmark dataset might have contaminated the model's training, which would make its performance results unreliable.

Not ideal if you are looking for a tool to prevent general data leakage or to analyze the training data of an LLM for purposes other than evaluating benchmark contamination.

Tags: NLP research, LLM evaluation, AI model benchmarking, academic integrity, dataset analysis
Badges: No License, Stale (6m), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 8 / 25
Community: 7 / 25

How are scores calculated?
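From the breakdown above, the overall score appears to be the simple sum of the four category scores, each out of 25: 0 + 9 + 8 + 7 = 24 out of a possible 100.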

Stars: 82
Forks: 4
Language: Python
License: None
Last pushed: Apr 11, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hitz-zentroa/lm-contamination"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
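For programmatic use from Python, the snippet below is a minimal sketch using only the standard library. The endpoint URL comes from the curl command above; the JSON response shape and the key-passing mechanism (shown as a hypothetical Bearer Authorization header) are assumptions.

# Minimal sketch of calling the quality API. The URL is taken from the
# curl example above; the Authorization scheme is an ASSUMPTION.
import json
import urllib.request

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hitz-zentroa/lm-contamination"

def fetch_quality(api_key=None):
    req = urllib.request.Request(URL)
    if api_key:
        # Hypothetical key mechanism; consult the API docs for the real one.
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality()  # anonymous tier: 100 requests/day per the listing
    print(json.dumps(data, indent=2))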