hitz-zentroa/lm-contamination

The LM Contamination Index is a manually created database of contamination evidence for language models (LMs).

Score: 24 / 100 (Experimental)

The index helps NLP researchers and practitioners protect the integrity of their Large Language Model (LLM) evaluations. You look up the name of a dataset you are using to test an LLM, and the index reports whether there is evidence that the LLM may have already seen that dataset during training. It is intended for anyone evaluating or developing LLMs who needs to avoid skewed performance metrics.
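As a rough illustration of that lookup, the sketch below fetches the index from the repository and scans it for a dataset name. The raw-file path (README.md on the main branch) and the substring matching are assumptions for illustration only; check the repository for the actual layout, and note that absence from the index is not proof that a dataset is clean.

# Minimal lookup sketch. ASSUMPTION: the index is readable as plain
# Markdown at this raw-file path; the exact path is hypothetical.
import urllib.request

INDEX_URL = "https://raw.githubusercontent.com/hitz-zentroa/lm-contamination/main/README.md"

def dataset_mentioned(dataset_name):
    # Fetch the index text and do a naive case-insensitive substring check.
    with urllib.request.urlopen(INDEX_URL) as resp:
        index_text = resp.read().decode("utf-8")
    return dataset_name.lower() in index_text.lower()

if __name__ == "__main__":
    name = "MMLU"  # example dataset name
    print(f"'{name}' mentioned in index:", dataset_mentioned(name))

A match only tells you that the dataset is discussed in the index; follow the entry itself to judge the strength of the evidence.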

No commits in the last 6 months.

Use this if you are an NLP researcher or practitioner evaluating an LLM and want to check whether your benchmark dataset might have contaminated the model's training, which would make its performance results unreliable.

Not ideal if you are looking for a tool to prevent general data leakage or to analyze the training data of an LLM for purposes other than evaluating benchmark contamination.

Tags: NLP research, LLM evaluation, AI model benchmarking, academic integrity, dataset analysis
Badges: No License, Stale (6m), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 8 / 25
Community: 7 / 25

How are scores calculated?
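From the breakdown above, the overall score appears to be the simple sum of the four category scores, each out of 25: 0 + 9 + 8 + 7 = 24 out of a possible 100.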

Stars: 82
Forks: 4
Language: Python
License: None
Last pushed: Apr 11, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hitz-zentroa/lm-contamination"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
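For programmatic use from Python, the snippet below is a minimal sketch using only the standard library. The endpoint URL comes from the curl command above; the JSON response shape and the key-passing mechanism (shown as a hypothetical Bearer Authorization header) are assumptions.

# Minimal sketch of calling the quality API. The URL is taken from the
# curl example above; the Authorization scheme is an ASSUMPTION.
import json
import urllib.request

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hitz-zentroa/lm-contamination"

def fetch_quality(api_key=None):
    req = urllib.request.Request(URL)
    if api_key:
        # Hypothetical key mechanism; consult the API docs for the real one.
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality()  # anonymous tier: 100 requests/day per the listing
    print(json.dumps(data, indent=2))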