fkodom/yet-another-retnet
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (https://arxiv.org/pdf/2307.08621.pdf)
This project implements the Retentive Network (RetNet) architecture in PyTorch, an alternative to Transformers for large language models. It lets researchers and practitioners build and train language models with accuracy comparable to Transformers, while using memory more efficiently and running faster in both training and inference. Its target users are machine learning engineers and researchers working on natural language processing tasks.
106 stars. No commits in the last 6 months.
Use this if you are developing large language models and need an architecture that balances high accuracy with superior training and inference efficiency compared to traditional Transformers.
Not ideal if you are working on domains outside of language modeling, or if you depend on the highly specialized configuration systems common in some established research frameworks.
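The efficiency claim comes from RetNet's retention mechanism, which has two mathematically equivalent forms: a parallel form for fast training and a recurrent form for O(1)-memory inference. Below is a minimal single-head sketch in NumPy to illustrate the idea; it is an assumption-laden simplification (no multi-scale heads, group norm, or gating) and is not taken from this repo's code.

```python
import numpy as np

def parallel_retention(q, k, v, gamma):
    """Parallel (training) form: out = (Q K^T * D) V,
    where D[n, m] = gamma**(n - m) for n >= m, else 0."""
    n = q.shape[0]
    idx = np.arange(n)
    decay = np.where(
        idx[:, None] >= idx[None, :],
        gamma ** (idx[:, None] - idx[None, :]).astype(float),
        0.0,
    )
    return (q @ k.T * decay) @ v

def recurrent_retention(q, k, v, gamma):
    """Recurrent (inference) form: carry a fixed-size state S,
    S_n = gamma * S_{n-1} + k_n^T v_n, out_n = q_n @ S_n."""
    state = np.zeros((q.shape[1], v.shape[1]))
    outputs = []
    for q_n, k_n, v_n in zip(q, k, v):
        state = gamma * state + np.outer(k_n, v_n)
        outputs.append(q_n @ state)
    return np.stack(outputs)
```

Because both forms compute the same weighted sum over past tokens, their outputs match up to floating-point error; the recurrent form just never materializes the full sequence-by-sequence attention matrix.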
Stars
106
Forks
17
Language
Python
License
MIT
Category
Last pushed
Nov 24, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/fkodom/yet-another-retnet"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...