All Transformer Models
7,795 models ranked by quality score · Page 20 of 78
| # | Model | Score | Tier |
|---|---|---|---|
| 1901 | sh0416/llama-classification · Text classification with Foundation Language Model LLaMA | | Emerging |
| 1902 | declare-lab/LLM-PuzzleTest · This repository is maintained to release dataset and models for multimodal... | | Emerging |
| 1903 | jeffreysijuntan/lloco · The official repo for "LLoCo: Learning Long Contexts Offline" | | Emerging |
| 1904 | R-D-BioTech-Alaska/Brain · Brain is an innovative concept that combines Qelm with Nueron to harness the... | | Emerging |
| 1905 | HenryHZY/Awesome-Multimodal-LLM · Research Trends in LLM-guided Multimodal Learning. | | Emerging |
| 1906 | vorobeevich/ml-snippets-classification · The source code of "Machine learning code snippets semantic classification"... | | Emerging |
| 1907 | pleisto/yuren-baichuan-7b · An open-source multimodal large language model based on baichuan-7b | | Emerging |
| 1908 | surrey-nlp/PLOD-AbbreviationDetection · This repository contains the PLOD Dataset for Abbreviation Detection... | | Emerging |
| 1909 | rese1f/aurora · [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a... | | Emerging |
| 1910 | TIGER-AI-Lab/MAmmoTH · Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid... | | Emerging |
| 1911 | zhenyi4/ssa · Official repository for "SSA: Sparse Sparse Attention by Aligning Full and... | | Emerging |
| 1912 | ChuloAI/BrainChulo · Harnessing the Memory Power of the Camelids | | Emerging |
| 1913 | SeekingDream/DyCodeEval · Official repository of the ICML2025 paper “Dynamic Benchmarking of Reasoning... | | Emerging |
| 1914 | diogok/llama.cpp.zig · A build.zig for llama.cpp | | Emerging |
| 1915 | RhinoDevel/mt_llm · Pure C wrapper library to use llama.cpp with Linux and Windows as simple as... | | Emerging |
| 1916 | jaketae/param-share-transformer · PyTorch implementation of Lessons on Parameter Sharing across Layers in Transformers | | Emerging |
| 1917 | jordandeklerk/SwinViT · Modified Swin Transformer model in PyTorch on CIFAR-10 for image classification | | Emerging |
| 1918 | Am1n3e/active-learning-transformer · A hands-on tutorial on how to use Active Learning with Transformer models. | | Emerging |
| 1919 | frankaging/ReCOGS · ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of... | | Emerging |
| 1920 | GiannakopoulosIlias/vision-transformer-network-for-mr-electrical-properties-tomography · A 3D Vision Transformer-based neural network for reconstructing electrical... | | Emerging |
| 1921 | SkyworkAI/MoE-plus-plus · [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with... | | Emerging |
| 1922 | yinboc/trans-inr · Transformers as Meta-Learners for Implicit Neural Representations, in ECCV 2022 | | Emerging |
| 1923 | mrdbourke/mac-ml-speed-test · A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS. | | Emerging |
| 1924 | avocardio/Zicklein · Finetuning instruct-LLaMA on German datasets. | | Emerging |
| 1925 | alphasecio/llama-guard · A web app for exploring content moderation with Llama Guard on Groq. | | Emerging |
| 1926 | m0dulo/InferSpore · 🌱 A fully independent Large Language Model (LLM) inference engine, built... | | Emerging |
| 1927 | lechmazur/writing · This benchmark tests how well LLMs incorporate a set of 10 mandatory story... | | Emerging |
| 1928 | general-preference/general-preference-model · [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for... | | Emerging |
| 1929 | dreamingjudith/KoGPT2-personachat · Fine-tuned KoGPT2 chatbot demo with translated PersonaChat (ongoing) | | Emerging |
| 1930 | Relaxed-System-Lab/Flash-Sparse-Attention · 🚀🚀 Efficient implementations of Native Sparse Attention | | Emerging |
| 1931 | bnosac/golgotha · Contextualised Embeddings and Language Modelling using BERT and Friends using R | | Emerging |
| 1932 | dev-sufyaan/Nexlify · Unified API platform for free access to enterprise-grade AI models from... | | Emerging |
| 1933 | modal-labs/stopwatch · A tool for benchmarking LLMs on Modal | | Emerging |
| 1934 | xuyang-liu16/GlobalCom2 · [AAAI 2026] Global Compression Commander: Plug-and-Play Inference... | | Emerging |
| 1935 | AIFrameResearch/SPO · Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL... | | Emerging |
| 1936 | yifanzhang-pro/HLA · Official Project Page for HLA: Higher-order Linear Attention... | | Emerging |
| 1937 | FengheTan9/LLM4Seg · [MICCAI 2025] Official code for "Pre-Trained LLM is a Semantic-Aware and... | | Emerging |
| 1938 | jqtangust/Robust-R1 · 🔥🔥🔥[AAAI 2026 Oral] Official Implementation of Robust-R1: Degradation-Aware... | | Emerging |
| 1939 | FSoft-AI4Code/CodeCapybara · Open-source Self-Instruction Tuning Code LLM | | Emerging |
| 1940 | moeru-ai/demodel · 🚀🛸 Easily boost the speed of pulling your models and datasets from various... | | Emerging |
| 1941 | NVlabs/RocketKV · [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage... | | Emerging |
| 1942 | dobriban/Principles-of-AI-LLMs · Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring... | | Emerging |
| 1943 | BatsResearch/trove · A Flexible Toolkit for Dense Retrieval | | Emerging |
| 1944 | twiecki/transpailer · LLM-based, self-correcting transpiler, supports Jax, PyTorch, Rust, PyMC, Stan | | Emerging |
| 1945 | cankocagil/SwinDetr · Integration of Swin Transformer to DETR for Robust Object Detection (DEMO) | | Emerging |
| 1946 | retarfi/language-pretraining · Pre-training Language Models for Japanese | | Emerging |
| 1947 | abcsys/libem · Compound AI toolchain for fast and accurate entity matching, powered by LLMs. | | Emerging |
| 1948 | devdhananjay14/multim · 🔍 Experiment with neural networks for binary classification on multimodal... | | Emerging |
| 1949 | harryjdavies/HeartGPT · Interpretable Pre-Trained Transformers for Heart Time-Series Data | | Emerging |
| 1950 | kaist-cvml/I-HallA-v1.0 · [AAAI 2025] Official Implementation of I-HallA v1.0 | | Emerging |
| 1951 | shahrukhx01/siamese-nn-semantic-text-similarity · A repository containing comprehensive Neural Networks based PyTorch... | | Emerging |
| 1952 | kaistAI/Janus · [NeurIPS 2024] Train LLMs with diverse system messages reflecting... | | Emerging |
| 1953 | shahriargolchin/time-travel-in-llms · The official repository for the paper entitled "Time Travel in LLMs: Tracing... | | Emerging |
| 1954 | chaitjo/gated-graph-transformers · Transformers are Graph Neural Networks! | | Emerging |
| 1955 | tlc4418/llm_optimization · A repo for RLHF training and BoN over LLMs, with support for reward model ensembles. | | Emerging |
| 1956 | Samyak-777/nomodel · The world's most accurate LLM. It achieves 0% hallucination rate by... | | Emerging |
| 1957 | gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model · The first pure SNN language model trained from scratch with a fully original... | | Emerging |
| 1958 | msakarvadia/memorization · Localizing Memorized Sequences in Language Models | | Emerging |
| 1959 | kyegomez/PALI · Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model" | | Emerging |
| 1960 | abhisheknair10/llama3.cu · Lightweight Llama 3 8B Inference Engine in CUDA C | | Emerging |
| 1961 | HKUDS/SepLLM · [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One... | | Emerging |
| 1962 | Mmorgan-ML/Neuromodulatory-Control-Networks · Neuromodulatory Control Networks (NCNs), a novel LLM architectural... | | Emerging |
| 1963 | Anshita1Saxena/transformer_time_series_forecasting · Transformers applied on Time Series Forecasting | | Emerging |
| 1964 | Jathurshan0330/Cross-Modal-Transformer · Official repository of cross-modal transformer for interpretable automatic... | | Emerging |
| 1965 | yashbonde/rasp · Implementing RASP transformer programming language... | | Emerging |
| 1966 | llcuda/llcuda · CUDA 12-first backend inference for Unsloth on Kaggle — Optimized for small... | | Emerging |
| 1967 | vipulraheja/iterater · Official implementation of the paper "IteraTeR: Understanding Iterative... | | Emerging |
| 1968 | nikolaydubina/llama2.go · LLaMA-2 in native Go | | Emerging |
| 1969 | lfunderburk/automate-tech-post · LLM application: fine tuned model to generate social media posts from... | | Emerging |
| 1970 | andrewliao11/LongPerceptualThoughts · [COLM'25] The official implementation of "LongPerceptualThoughts: Distilling... | | Emerging |
| 1971 | vbdi/divprune · [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large... | | Emerging |
| 1972 | oshindutta/TVAprune · [ICML 2024 Es-FoMo] - Efficient LLM Pruning with Global Token-Dependency... | | Emerging |
| 1973 | arshadshk/SAINT-pytorch · SAINT PyTorch implementation | | Emerging |
| 1974 | uncbiag/Awesome-Foundation-Models · A curated list of foundation models for vision and language tasks | | Emerging |
| 1975 | Cryolite/kanachan · A Japanese (Riichi) Mahjong AI Framework | | Emerging |
| 1976 | palonso/MAEST · Pre-training, fine-tuning, and inference code with the MAEST models for... | | Emerging |
| 1977 | automorphic-ai/trex · Enforce structured output from LLMs 100% of the time | | Emerging |
| 1978 | vietanhdev/llama-assistant-train · Training Scripts for Llama Assistant: Your Local AI Assistant That Respects... | | Emerging |
| 1979 | TayeeChang/keras_transformers · Implementation of the Transformer family, such as BERT, ALBERT, RoBERTa, NEZHA, etc. | | Emerging |
| 1980 | alibaba/easydist · Automated Parallelization System and Infrastructure for Multiple Ecosystems | | Emerging |
| 1981 | wxjiao/ParroT · The ParroT framework to enhance and regulate the Translation Abilities... | | Emerging |
| 1982 | Ankur3107/nlp_notebooks · Tensorflow, Pytorch, Huggingface Transformer, Fastai, etc. tutorial Colab Notebooks. | | Emerging |
| 1983 | kyegomez/MambaDecoderBlock · MambaDecoderBlock is a novel decoder architecture that replaces traditional... | | Emerging |
| 1984 | Airmomo/transformers-docs-zh · [Continuously updated] Fully Chinese-language Transformers study notes and demo examples, with Jupyter Notebook support; content mainly from 🤗 Hugging... | | Emerging |
| 1985 | EvanZhouDev/llm.pdf · Run LLMs inside a PDF file. | | Emerging |
| 1986 | TideDra/VL-RLHF · A RLHF Infrastructure for Vision-Language Models | | Emerging |
| 1987 | OnlyTerp/turboquant · First open-source implementation of Google TurboQuant (ICLR 2026) --... | | Emerging |
| 1988 | wangcongcong123/ttt · A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+ | | Emerging |
| 1989 | astrobleem/Simple-StableLM-Chat · This is a very simple python app that you can use to get up and chatting... | | Emerging |
| 1990 | uakarsh/latr · Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel... | | Emerging |
| 1991 | tranquoctrinh/transformer · This is a PyTorch implementation of the Transformer model in the paper... | | Emerging |
| 1992 | hoof-ai/hoof · "Just hoof it!" - A Spotlight-like interface to Ollama | | Emerging |
| 1993 | ntt-dkiku/route-explainer · The official implementation of "RouteExplainer: An Explanation Framework for... | | Emerging |
| 1994 | Mya-Mya/CBF-LLM · "CBF-LLM: Safe Control for LLM Alignment" | | Emerging |
| 1995 | BaohaoLiao/RSD · [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and... | | Emerging |
| 1996 | sail-sg/Attention-Sink · [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical... | | Emerging |
| 1997 | BUAADreamer/SPN4CIR · [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning... | | Emerging |
| 1998 | OatmealLiu/FineR · [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models | | Emerging |
| 1999 | frankluise5220/ComfyUI-Lorahelper · A professional automation toolkit for ComfyUI to prepare LoRA training data... | | Emerging |
| 2000 | CogitoNTNU/course-on-large-language-models · This is a course on how to program with Large Language Models. | | Emerging |