luo-junyu/Awesome-Data-Efficient-LLM
A list of data-efficient and data-centric LLM (Large Language Model) papers. Our Survey Paper: Towards Efficient LLM Post Training: A Data-centric Perspective
This resource helps machine learning researchers and engineers discover efficient ways to train Large Language Models (LLMs). It is a curated list of research papers on improving LLM performance with less data. The papers cover methods for selecting the most valuable training data and for generating synthetic data, both aimed at reducing the time and computational resources needed for development.
No commits in the last 6 months.
Use this if you are a machine learning practitioner working with LLMs and want to find research-backed strategies to improve model performance and reduce training costs by optimizing your data usage.
Not ideal if you are looking for ready-to-use software, code implementations, or a general introduction to LLMs; this is a research paper collection.
Stars: 52
Forks: 4
Language: —
License: —
Category:
Last pushed: Feb 19, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/luo-junyu/Awesome-Data-Efficient-LLM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
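The same request can be made from a script. The sketch below uses only the Python standard library and mirrors the curl command above. The helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which this page does not state.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    owner_part = urllib.parse.quote(owner, safe="")
    repo_part = urllib.parse.quote(repo, safe="")
    return f"{BASE_URL}/{owner_part}/{repo_part}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. The page does not document
    the response format, so adjust the parsing if it differs.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Printing the URL needs no network access; call fetch_quality to send the request.
    print(quality_url("luo-junyu", "Awesome-Data-Efficient-LLM"))
```

Without a key the service allows 100 requests per day, so a script that polls many repositories may want to cache responses locally.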
Higher-rated alternatives
FairyFali/SLMs-Survey
Survey of Small Language Models from Penn State, ...
USC-FORTIS/AD-LLM
[ACL Findings 2025] A benchmark for anomaly detection using large language models. It supports...
swordlidev/Efficient-Multimodal-LLMs-Survey
Efficient Multimodal Large Language Models: A Survey
zabir-nabil/awesome-multilingual-large-language-models
A comprehensive collection of multilingual datasets and large language models, meticulously...
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey