corl-team/lime
Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"
This project provides a method for improving large language models (LLMs) during training by letting the Transformer architecture use its representation capacity more fully. Starting from a standard Transformer, it changes how the layers process and share information, aiming for faster convergence and better language modeling. It is intended for AI/ML researchers and engineers who develop or fine-tune advanced language models; a rough sketch of the general idea follows the usage notes below.
No commits in the last 6 months.
Use this if you are a machine learning researcher or engineer aiming to enhance the training efficiency and performance of Transformer-based language models.
Not ideal if you are an end user looking for a pre-trained language model or a general data-analysis tool, since this is a low-level modification of the model architecture rather than a ready-to-use model.
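
The repository's actual implementation is not shown on this page, so the following is only a minimal, hypothetical PyTorch sketch of the general idea of letting a layer draw on representations from all earlier layers instead of only the immediately preceding one. The class name, routing scheme, and tensor shapes are assumptions for illustration, not the paper's method.

# Hypothetical sketch (not the repository's code): mix the hidden states of
# earlier layers with learned weights before handing them to an attention block.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLayerMixer(nn.Module):
    """Learns a softmax-weighted mixture over the hidden states of all
    earlier layers; a standard attention block can then consume the result."""

    def __init__(self, num_prev_layers: int, d_model: int):
        super().__init__()
        # One learnable routing weight per earlier layer (assumed design choice).
        self.routing_logits = nn.Parameter(torch.zeros(num_prev_layers))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, prev_hidden: list[torch.Tensor]) -> torch.Tensor:
        # prev_hidden: list of (batch, seq, d_model) tensors, one per earlier layer.
        stacked = torch.stack(prev_hidden, dim=0)           # (L, B, S, D)
        weights = F.softmax(self.routing_logits, dim=0)     # (L,)
        mixed = torch.einsum("l,lbsd->bsd", weights, stacked)
        return self.norm(mixed)

# Usage sketch: four earlier layers, batch of 2, sequence length 16, width 64.
mixer = CrossLayerMixer(num_prev_layers=4, d_model=64)
hidden_states = [torch.randn(2, 16, 64) for _ in range(4)]
context = mixer(hidden_states)  # (2, 16, 64), ready for a normal attention block
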
Stars: 32
Forks: 1
Language: Python
License: Apache-2.0
Category:
Last pushed: May 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/corl-team/lime"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
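
For scripted use, the same request can be made from Python. The response schema and the way an API key is passed are not documented on this page, so the sketch below simply dumps whatever JSON the endpoint returns.

# Minimal Python equivalent of the curl example above (free tier, no key).
import json
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/corl-team/lime"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
# Response fields are undocumented here, so just pretty-print the raw JSON.
print(json.dumps(resp.json(), indent=2))
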
Higher-rated alternatives
NX-AI/xlstm
Official repository of the xLSTM.
sinanuozdemir/oreilly-hands-on-gpt-llm
Mastering the Art of Scalable and Efficient AI Model Deployment
DashyDashOrg/pandas-llm
Pandas-LLM
wxhcore/bumblecore
An LLM training framework built from the ground up, featuring a custom BumbleBee architecture...
MiniMax-AI/MiniMax-01
The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model &...