kyegomez/SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta
This project helps machine learning engineers and researchers expand the context window of large language models (LLMs) without needing to retrain them. It takes standard query, key, and value tensors along with positional indices, and outputs an attention tensor that effectively handles longer sequences. This allows LLMs to process and generate much longer texts or code.
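The core trick from the paper is a remapping of relative positions: distances inside a small "neighbor window" are kept exact, while longer distances are floor-divided into groups so that positions beyond the trained context still fall inside the range the model has seen. A minimal sketch of that mapping (function name and default values are illustrative, not the library's API):

```python
def self_extend_rel_pos(distance, group_size=4, neighbor_window=8):
    """Sketch of SelfExtend-style grouped relative positions.

    Distances <= neighbor_window are kept exact; longer distances
    are floor-divided by group_size, shifted so the mapping stays
    monotone where the two regimes meet.
    """
    if distance <= neighbor_window:
        return distance
    # Shift keeps grouped positions contiguous with the exact ones.
    return distance // group_size + neighbor_window - neighbor_window // group_size
```

For example, with the defaults above, a distance of 100 maps to `100 // 4 + 8 - 2 = 31`, well inside a short trained context, while distances of 8 or less pass through unchanged.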
No commits in the last 6 months. Available on PyPI.
Use this if you are working with large language models and need to process very long inputs or generate extended outputs, but want to avoid the computational cost and time of fine-tuning the entire model.
Not ideal if you need to fundamentally change the underlying architecture of your LLM, or if your primary goal is to significantly reduce inference latency for short sequences.
Stars
13
Forks
—
Language
Python
License
MIT
Category
Last pushed
Nov 11, 2024
Commits (30d)
0
Dependencies
3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/SelfExtend"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Goekdeniz-Guelmez/mlx-lm-lora
Train Large Language Models on MLX.
uber-research/PPLM
Plug and Play Language Model implementation. Allows to steer topic and attributes of GPT-2 models.
VHellendoorn/Code-LMs
Guide to using pre-trained large language models of source code
ssbuild/chatglm_finetuning
chatglm 6b finetuning and alpaca finetuning
jarobyte91/pytorch_beam_search
A lightweight implementation of Beam Search for sequence models in PyTorch.