kyo-takano/chinchilla
A toolkit for scaling law research ⚖
This toolkit helps deep learning researchers and practitioners efficiently scale the training of large models like LLMs or Vision Transformers. You provide data from multiple model training runs (compute, parameters, data, and loss), and it outputs optimized configurations for model size and data usage to achieve the best performance within a specific compute budget. It's for anyone pushing the boundaries of large-scale deep learning model development.
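To make the workflow concrete, here is a minimal sketch of the underlying idea (the Chinchilla-style parametric fit from Hoffmann et al.), not the chinchilla package's own API. The run data, initial guesses, and compute budget below are hypothetical placeholders for illustration only.

```python
# Sketch: fit L(N, D) = E + A * N**-alpha + B * D**-beta to past training runs,
# then split a FLOP budget C ~= 6 * N * D into a compute-optimal (N, D) pair.
# Not the chinchilla library's API; values are made up for illustration.
import numpy as np
from scipy.optimize import curve_fit

def parametric_loss(ND, E, A, B, alpha, beta):
    N, D = ND
    return E + A * N ** (-alpha) + B * D ** (-beta)

# Hypothetical training-run records: parameters N, tokens D, final loss.
N = np.array([7e7, 1.6e8, 4.1e8, 1.0e9, 2.8e9, 7.0e9])
D = np.array([1.5e9, 3.0e9, 8.0e9, 2.0e10, 5.0e10, 1.4e11])
loss = np.array([3.67, 3.26, 2.86, 2.58, 2.35, 2.18])

(E, A, B, alpha, beta), _ = curve_fit(
    parametric_loss, (N, D), loss,
    p0=(1.7, 400.0, 400.0, 0.34, 0.28), maxfev=20_000,
)

# Compute-optimal allocation for a FLOP budget C, using C ~= 6 * N * D:
#   N_opt = G * (C / 6) ** (beta / (alpha + beta)),  D_opt = (C / 6) / N_opt,
# where G = (alpha * A / (beta * B)) ** (1 / (alpha + beta)).
C = 1e21  # hypothetical compute budget in FLOPs
G = (alpha * A / (beta * B)) ** (1 / (alpha + beta))
N_opt = G * (C / 6) ** (beta / (alpha + beta))
D_opt = (C / 6) / N_opt
print(f"N_opt ~ {N_opt:.3e} params, D_opt ~ {D_opt:.3e} tokens")
```

The toolkit automates this kind of fit-and-allocate loop over your own run logs; the sketch only shows the shape of the inputs (compute, parameters, data, loss) and the output (a model-size/data split for a given budget).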
No commits in the last 6 months. Available on PyPI.
Use this if you are developing large deep learning models and need to optimize your compute resources to achieve the best possible model performance.
Not ideal if you are working with fine-tuning tasks or in domains with scarce data.
Stars: 57
Forks: 4
Language: Python
License: Apache-2.0
Category:
Last pushed: Jan 27, 2025
Commits (30d): 0
Dependencies: 8
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/kyo-takano/chinchilla"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Tencent/AngelSlim
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
nebuly-ai/optimate
A collection of libraries to optimize AI model performance
antgroup/glake
GLake: optimizing GPU memory management and IO transmission.
liyucheng09/Selective_Context
Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40%...
TsingmaoAI/MI-optimize
mi-optimize is a versatile tool designed for the quantization and evaluation of large language...