jlamprou/Infini-Attention

Efficient Infinite Context Transformers: a PyTorch implementation of Infini-attention, plus a QwenMoE implementation, a training script, and 1M-context passkey retrieval

Score: 27 / 100 (Experimental)

This project offers a specialized toolkit for researchers and practitioners working with large language models, helping them process extremely long texts more efficiently. It takes in existing language models and training data, and outputs a modified model capable of understanding and generating responses based on much longer contexts than standard models, without prohibitive computational costs. It's designed for those pushing the boundaries of what LLMs can do with extensive information.
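The core idea behind Infini-attention is a compressive memory: each segment's keys and values are folded into a fixed-size matrix, so earlier context can be retrieved later at constant cost. Below is a minimal NumPy sketch of that recurrence (memory update and retrieval with the ELU+1 feature map); the function names and toy loop are illustrative assumptions, not this repository's actual API.

```python
import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1: keeps the feature map strictly positive
    return np.where(x > 0, x + 1.0, np.exp(x))

def memory_update(M, z, k, v):
    # Fold a segment into memory: M_s = M_{s-1} + sigma(K)^T V,
    # with running normalizer z_s = z_{s-1} + sum_i sigma(K_i)
    sk = elu_plus_one(k)
    return M + sk.T @ v, z + sk.sum(axis=0)

def memory_retrieve(q, M, z):
    # Read earlier context back: A_mem = sigma(Q) M / (sigma(Q) z)
    sq = elu_plus_one(q)
    return (sq @ M) / (sq @ z)[:, None]

# Toy loop over 3 segments of 8 tokens with head dimension 4
d = 4
rng = np.random.default_rng(0)
M, z = np.zeros((d, d)), np.zeros(d)
for _ in range(3):
    k, v = rng.normal(size=(8, d)), rng.normal(size=(8, d))
    M, z = memory_update(M, z, k, v)

q = rng.normal(size=(8, d))
a_mem = memory_retrieve(q, M, z)
print(a_mem.shape)  # (8, 4): one retrieved value vector per query token
```

In the full mechanism this retrieved `a_mem` is mixed with ordinary local dot-product attention via a learned gate, which is what lets the model trade off recent versus compressed long-range context.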

No commits in the last 6 months.

Use this if you are a researcher or advanced practitioner experimenting with large language models that need to process and understand very long documents or conversations, such as entire books or extensive codebases.

Not ideal if you need a production-ready solution for standard language model tasks or if you are not comfortable with experimental, research-stage code.

large-language-models natural-language-processing long-context-AI AI-research text-generation
No license · Stale (6 months) · No package · No dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 8 / 25
Community 10 / 25


Stars: 86
Forks: 7
Language: Python
License: none
Last pushed: May 09, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jlamprou/Infini-Attention"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.