yangluo7/CAME

[ACL 2023] The official implementation of "CAME: Confidence-guided Adaptive Memory Optimization"

Emerging

CAME is an optimizer that helps machine learning engineers train large language models more efficiently. It updates your model's parameters from their gradients during training, aiming to converge as quickly as traditional adaptive methods such as Adam while using less memory for optimizer state. Data scientists and deep learning researchers training large models such as BERT, GPT-2, or Llama are the most likely to benefit.

No commits in the last 6 months.

Use this if you are training large language models and need to reduce memory consumption without sacrificing training speed.

Not ideal if you are working with smaller models or have ample computational resources, as the benefits of memory efficiency might not be as critical.
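To make the memory trade-off concrete, here is a rough sketch that counts optimizer-state elements for a single weight matrix. It assumes an Adam-style optimizer keeps two full-size moment tensors, and that a CAME-style optimizer keeps one full-size first moment plus factored row and column statistics for both the second moment and the confidence term. The function names are hypothetical, and the count ignores data types, 1-D parameters, and everything outside optimizer state.

```python
def adam_state_elems(rows: int, cols: int) -> int:
    """Adam-style state: full first moment + full second moment."""
    return 2 * rows * cols


def factored_state_elems(rows: int, cols: int) -> int:
    """Assumed CAME-style state for a 2-D weight:
    one full first moment, plus row and column vectors for the
    second moment and for the confidence (residual) statistics."""
    return rows * cols + 2 * (rows + cols)


def savings_fraction(rows: int, cols: int) -> float:
    """Fraction of optimizer-state elements saved relative to Adam."""
    return 1 - factored_state_elems(rows, cols) / adam_state_elems(rows, cols)


if __name__ == "__main__":
    for shape in [(8, 8), (768, 768), (4096, 4096)]:
        print(shape, adam_state_elems(*shape), factored_state_elems(*shape),
              round(savings_fraction(*shape), 4))
```

Under these assumptions the relative saving approaches one half as the matrix grows, while the absolute saving for a small matrix is only a handful of elements, which is consistent with the guidance that small models gain little.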

large-language-models deep-learning-training model-optimization natural-language-processing resource-management

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 12 / 25

Language: Python

License: MIT

Higher-rated alternatives

ljleb/sd-mecha

Executable State Dict Recipes

SJTU-DENG-Lab/Discrete-Diffusion-Forcing

Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference

declare-lab/tango

A family of diffusion models for text-to-audio generation.

Li-Jinsong/DAEDAL

[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...

SalesforceAIResearch/CoDA

Salesforce AI Research's open diffusion language model
