liyucheng09/llm-compressive

Longitudinal Evaluation of LLMs via Data Compression

Score: 15 / 100 (Experimental)

This project helps evaluate how well large language models (LLMs) adapt to new information over time and how robust they are across different kinds of data. You provide an LLM from Hugging Face Hub and a dataset (like Wikipedia articles, news, or code) spanning various time periods, and it outputs a compression rate trend, showing how the model's performance changes over time. AI researchers, machine learning engineers, or anyone deploying LLMs can use this to understand a model's long-term generalization capabilities.
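The core idea is that a language model can be read as a lossless compressor: by arithmetic coding, the total negative log-likelihood (in bits) the model assigns to a text is the size that text would compress to. The sketch below is a minimal illustration of that metric, not the project's actual implementation; the function name and interface are hypothetical, and the per-token losses would in practice come from a Hugging Face causal LM's cross-entropy.

```python
import math

def compression_rate(token_nll_nats, text_num_bytes):
    """Arithmetic-coding view of LM evaluation: the model's total
    negative log-likelihood, converted from nats to bits, is treated
    as the compressed size of the text. Returns compressed bits
    divided by raw bits (raw = 8 bits per byte); lower is better."""
    total_bits = sum(nll / math.log(2) for nll in token_nll_nats)
    return total_bits / (8 * text_num_bytes)

# Example: 10 tokens, each assigned 4 bits (nll = 4*ln 2 nats),
# over a 20-byte text -> 40 compressed bits / 160 raw bits = 0.25.
rate = compression_rate([4 * math.log(2)] * 10, 20)
```

Tracking this rate on data from successive time periods is what yields the trend: a rising rate on post-training-cutoff text suggests the model generalizes less well to new information.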

No commits in the last 6 months.

Use this if you need to assess the generalization and robustness of an LLM over a timeline with diverse datasets.

Not ideal if you're looking for a tool to fine-tune LLMs or measure performance on specific downstream tasks like sentiment analysis or question answering.

LLM-evaluation model-robustness time-series-analysis AI-research model-generalization
No license · Stale (6 months) · No package · No dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 0 / 25


Stars: 33
Forks:
Language: Python
License:
Last pushed: May 29, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/liyucheng09/llm-compressive"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.