VITA-Group/TAPE
[ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi Cai, Jason D. Lee, Pan Li, Zhangyang Wang
This project offers a new way for large language models (LLMs) to represent the order of information in long texts such as documents, articles, or books. By improving how models encode and use word positions, it helps them handle much longer inputs effectively. It is aimed at researchers and practitioners who need advanced models to process very long textual data accurately.
No commits in the last 6 months.
Use this if you are an AI researcher or machine learning engineer training or fine-tuning large language models and you need better long-context behavior, i.e., models that accurately track context across extremely long documents.
Not ideal if you are looking for an off-the-shelf application to summarize or analyze documents, as this is a foundational enhancement for language models, not an end-user tool.
Stars: 14
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Jun 06, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/VITA-Group/TAPE"
Open to everyone: 100 requests/day with no API key. Sign up for a free key to get 1,000 requests/day.
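The same endpoint can be called from Python. A minimal sketch using only the standard library, assuming the URL structure shown in the curl example above (`/api/v1/quality/<ecosystem>/<owner>/<repo>`) and a JSON response; the response fields themselves are not documented here, so the code only decodes whatever JSON comes back:

```python
# Minimal sketch of calling the quality API from Python instead of curl.
# The endpoint path comes from the curl example in this listing; the
# response schema is an assumption (undocumented here), so we just
# decode the JSON as-is.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the quality record (requires network access)."""
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Mirrors the curl example above.
    print(quality_url("transformers", "VITA-Group", "TAPE"))
```

Within the free tier no authentication header is needed; if you obtain a key, pass it however the API documentation specifies (not shown here, since the header name is not given in this listing).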
Higher-rated alternatives
ZHZisZZ/dllm
dLLM: Simple Diffusion Language Modeling
pengzhangzhi/Open-dLLM
Open diffusion language model for code generation — releasing pretraining, evaluation,...
EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM...
THUDM/LongWriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
AIoT-MLSys-Lab/SVD-LLM
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2