XunhaoLai/ring-sliding-window-attention

Ring sliding window attention implementation with flash attention

Quality score: 22 / 100 (Experimental)

This is a specialized tool for machine learning engineers working on large language models. It makes training on very long text sequences more efficient by combining sliding-window attention with a ring pattern that shards the sequence and passes key/value blocks between GPUs, built on Flash Attention kernels. You pass in the model's query, key, and value tensors and get back the attention output, enabling faster training on long contexts.
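
For reference, the sketch below shows what non-distributed, single-GPU sliding-window attention computes in plain PyTorch. It is illustrative only: the function name and tensor shapes are assumptions, and the repository's actual Flash Attention kernels and ring communication across GPUs are not reproduced here.

# Minimal single-GPU sketch of causal sliding-window attention in plain PyTorch.
# Illustrative only: the repository distributes this computation across GPUs
# with a ring pattern and Flash Attention kernels, which this sketch omits.
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window_size):
    # q, k, v: (batch, heads, seq_len, head_dim)
    seq_len = q.shape[-2]
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # (b, h, s, s)

    # Causal sliding window: token i attends to tokens j with
    # i - window_size < j <= i.
    idx = torch.arange(seq_len, device=q.device)
    rel = idx[:, None] - idx[None, :]            # query index minus key index
    mask = (rel < 0) | (rel >= window_size)      # positions outside the window
    scores = scores.masked_fill(mask, float("-inf"))

    attn = F.softmax(scores, dim=-1)
    return torch.matmul(attn, v)

# Example: 1 sequence, 4 heads, 1024 tokens, 64-dim heads, 256-token window.
q = torch.randn(1, 4, 1024, 64)
k = torch.randn(1, 4, 1024, 64)
v = torch.randn(1, 4, 1024, 64)
out = sliding_window_attention(q, k, v, window_size=256)
print(out.shape)  # torch.Size([1, 4, 1024, 64])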

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher training large language models with very long input sequences and need to leverage multiple GPUs for efficient computation.

Not ideal if you are working with shorter text sequences, or if you are not using a distributed training setup with multiple GPUs.

large-language-models distributed-training deep-learning natural-language-processing GPU-acceleration
Status: Stale (6 months), no package published, no dependents
Maintenance: 2 / 25
Adoption: 5 / 25
Maturity: 15 / 25
Community: 0 / 25


Stars: 9
Forks:
Language: Python
License: MIT
Last pushed: Jul 25, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/XunhaoLai/ring-sliding-window-attention"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
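
As a rough sketch, the same endpoint can also be queried from Python; the shape of the JSON response and its field names are assumptions, not documented here.

# Rough sketch of calling the quality endpoint from Python.
# The structure of the returned JSON is an assumption.
import requests

url = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "transformers/XunhaoLai/ring-sliding-window-attention"
)
resp = requests.get(url, timeout=30)
resp.raise_for_status()
print(resp.json())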