yangluo7/CAME
[ACL 2023] The official implementation of "CAME: Confidence-guided Adaptive Memory Efficient Optimization"
CAME is an optimizer that helps machine learning engineers train large language models more efficiently. It updates your model's parameters during training and aims to converge quickly while using less memory than traditional adaptive optimizers. It is aimed at data scientists and deep learning researchers working with large models such as BERT, GPT-2, or Llama.
No commits in the last 6 months.
Use this if you are training large language models and need to reduce memory consumption without sacrificing training speed.
Not ideal if you are working with smaller models or have ample computational resources, since memory savings matter less in those cases.
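To give a rough sense of where the memory saving could come from, the sketch below counts optimizer state elements for a single weight matrix. It is only back-of-the-envelope arithmetic under stated assumptions, not the repository's code. The assumed state layout (one full first-moment buffer plus row and column vectors for two factored statistics) is a reading of the CAME paper and is not stated on this page. The function names and the 4096 x 4096 shape are hypothetical.

```python
def adam_state_elems(rows: int, cols: int) -> int:
    """Adam keeps two full-size buffers (first and second moment) per matrix."""
    return 2 * rows * cols


def factored_state_elems(rows: int, cols: int) -> int:
    """Assumed CAME-style layout (not taken from this page): one full
    first-moment buffer, plus a row vector and a column vector for each
    of two factored statistics."""
    full_momentum = rows * cols
    factored_vectors = 2 * (rows + cols)
    return full_momentum + factored_vectors


if __name__ == "__main__":
    rows, cols = 4096, 4096  # hypothetical weight matrix shape
    adam = adam_state_elems(rows, cols)
    factored = factored_state_elems(rows, cols)
    print(f"Adam state elements:     {adam:,}")
    print(f"Factored state elements: {factored:,}")
    print(f"Ratio: {factored / adam:.3f}")
```

Under these assumptions the factored layout needs roughly half the optimizer state of Adam for a large square matrix. The gap narrows for small matrices, which fits the note above that smaller models benefit less.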
Stars: 96
Forks: 9
Language: Python
License: MIT
Category:
Last pushed: Mar 22, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/yangluo7/CAME"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
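The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows a category/owner/repository pattern. Both are inferred from the single curl example above, not from documented behavior. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL; the category/owner/repo layout is an assumption."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and decode it, assuming the response body is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "yangluo7", "CAME")` would issue the same GET request as the curl command. Without a key, the 100 requests/day limit applies.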
Higher-rated alternatives
ljleb/sd-mecha
Executable State Dict Recipes
SJTU-DENG-Lab/Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Li-Jinsong/DAEDAL
[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...
SalesforceAIResearch/CoDA
Salesforce AI Research's open diffusion language model