All Transformer Models
7,795 models ranked by quality score · Page 3 of 78
| # | Model | Score | Tier |
|---|---|---|---|
| 201 | google-deepmind/long-form-factuality<br>Benchmarking long-form factuality in large language models. Original code... |  | Established |
| 202 | microsoft/torchscale<br>Foundation Architecture for (M)LLMs |  | Established |
| 203 | haizelabs/verdict<br>Inference-time scaling for LLMs-as-a-judge. |  | Established |
| 204 | autonomousvision/transfuser<br>[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for... |  | Established |
| 205 | ZHZisZZ/dllm<br>dLLM: Simple Diffusion Language Modeling |  | Established |
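| 206 | CrazyBoyM/llama3-Chinese-chat<br>Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation on training, inference, evaluation, and deployment. |  | Established |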
| 207 | DadaNanjesha/AI-Text-Humanizer-App<br>Transform AI-generated text into formal, human-like, and academic writing... |  | Established |
| 208 | HowieHwong/TrustLLM<br>[ICML 2024] TrustLLM: Trustworthiness in Large Language Models |  | Established |
| 209 | guinmoon/LLMFarm<br>llama and other large language models on iOS and MacOS offline using GGML library. |  | Established |
| 210 | BeastByteAI/scikit-llm<br>Seamlessly integrate LLMs into scikit-learn. |  | Established |
| 211 | KittenCN/stock_prediction<br>A general stock prediction model based on neural networks |  | Established |
| 212 | ai-decentralized/BloomBee<br>Decentralized LLMs fine-tuning and inference with offloading |  | Established |
| 213 | ggml-org/llama.vim<br>Vim plugin for LLM-assisted code/text completion |  | Established |
| 214 | b4rtaz/distributed-llama<br>Distributed LLM inference. Connect home devices into a powerful cluster to... |  | Established |
| 215 | Kohulan/DECIMER-Image_Transformer<br>DECIMER Image Transformer is a deep-learning-based tool designed for... |  | Established |
| 216 | roboflow/maestro<br>streamline the fine-tuning process for multimodal models: PaliGemma 2,... |  | Established |
| 217 | edwko/OuteTTS<br>Interface for OuteTTS models. |  | Established |
| 218 | bigscience-workshop/petals<br>🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x... |  | Established |
| 219 | vistec-AI/thai2transformers<br>Pretraining transformer based Thai language models |  | Established |
| 220 | AGI-Arena/MARS<br>The official implementation of MARS: Unleashing the Power of Variance... |  | Established |
| 221 | albertan017/LLM4Decompile<br>Reverse Engineering: Decompiling Binary Code with Large Language Models |  | Established |
| 222 | dropbox/hqq<br>Official implementation of Half-Quadratic Quantization (HQQ) |  | Established |
| 223 | OSUPCVLab/SegFormer3D<br>Official Implementation of SegFormer3D: an Efficient Transformer for 3D... |  | Established |
| 224 | ServerlessLLM/ServerlessLLM<br>Serverless LLM Serving for Everyone. |  | Established |
| 225 | stanfordnlp/axbench<br>Stanford NLP Python library for benchmarking the utility of LLM... |  | Established |
| 226 | toverainc/willow-inference-server<br>Open source, local, and self-hosted highly optimized language inference... |  | Established |
| 227 | thu-ml/SpargeAttn<br>[ICML2025] SpargeAttention: A training-free sparse attention that... |  | Established |
| 228 | Tiiny-AI/PowerInfer<br>High-speed Large Language Model Serving for Local Deployment |  | Established |
| 229 | tairov/llama2.mojo<br>Inference Llama 2 in one file of pure 🔥 |  | Established |
| 230 | Orion-zhen/abliteration<br>Make abliterated models with transformers, easy and fast |  | Established |
| 231 | bytedance/Sa2VA<br>Official Repo For Pixel-LLM Codebase |  | Established |
| 232 | ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing<br>LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best... |  | Established |
| 233 | ggml-org/llama.vscode<br>VS Code extension for LLM-assisted code/text completion |  | Established |
| 234 | om-ai-lab/VLM-R1<br>Solve Visual Understanding with Reinforced VLMs |  | Established |
| 235 | guinmoon/llmfarm_core.swift<br>Swift library to work with llama and other large language models. |  | Established |
| 236 | bytedance/SALMONN<br>SALMONN family: A suite of advanced multi-modal LLMs |  | Established |
| 237 | YerbaPage/LongCodeZip<br>LongCodeZip: Compress Long Context for Code Language Models [ASE2025] |  | Established |
| 238 | Omid-Nejati/MedViTV2<br>MedViTV2: Medical Image Classification with KAN-Integrated Transformers and... |  | Established |
| 239 | ServiceNow/TACTiS<br>TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time... |  | Established |
| 240 | kha-white/manga-ocr<br>Optical character recognition for Japanese text, with the main focus being... |  | Established |
| 241 | sovit-123/vision_transformers<br>Vision Transformers for image classification, image segmentation, and object... |  | Established |
| 242 | Nicolepcx/Transformers-in-Action<br>This is the corresponding code for the book Transformers in Action |  | Established |
| 243 | explosion/spacy-llm<br>🦙 Integrating LLMs into structured NLP pipelines |  | Established |
| 244 | jaehyunnn/ViTPose_pytorch<br>An unofficial implementation of ViTPose [Y. Xu et al., 2022] |  | Established |
| 245 | lonePatient/TorchBlocks<br>A PyTorch-based toolkit for natural language processing |  | Established |
| 246 | AXERA-TECH/ax-llm<br>Explore LLM model deployment based on AXera's AI chips |  | Established |
| 247 | GeeeekExplorer/nano-vllm<br>Nano vLLM |  | Established |
| 248 | aidatatools/ollama-benchmark<br>LLM Benchmark for Throughput via Ollama (Local LLMs) |  | Established |
| 249 | microsoft/mup<br>maximal update parametrization (µP) |  | Established |
| 250 | kyegomez/zeta<br>Build high-performance AI models with modular building blocks |  | Established |
| 251 | JHubi1/ollama-app<br>A modern and easy-to-use client for Ollama |  | Established |
| 252 | lucidrains/locoformer<br>LocoFormer - Generalist Locomotion via Long-Context Adaptation |  | Established |
| 253 | guanwei49/LogLLM<br>LogLLM: Log-based Anomaly Detection Using Large Language Models (system log... |  | Established |
| 254 | PKU-Alignment/align-anything<br>Align Anything: Training All-modality Model with Feedback |  | Established |
| 255 | LarHope/ollama-benchmark<br>Ollama based Benchmark with detail I/O token per second. Python with... |  | Established |
| 256 | BradyFU/Awesome-Multimodal-Large-Language-Models<br>✨✨ Latest Advances on Multimodal Large Language Models |  | Established |
| 257 | CognitiveAISystems/MAPF-GPT<br>[AAAI-2025] This repository contains MAPF-GPT, a deep learning-based model... |  | Established |
| 258 | jax-ml/jax-llm-examples<br>Minimal yet performant LLM examples in pure JAX |  | Established |
| 259 | Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition<br>Transformers 3rd Edition |  | Established |
| 260 | SKTBrain/KoBERT<br>Korean BERT pre-trained cased (KoBERT) |  | Established |
| 261 | ashishpatel26/Treasure-of-Transformers<br>💁 Awesome Treasure of Transformers Models for Natural Language processing... |  | Established |
| 262 | google/deepconsensus<br>DeepConsensus uses gap-aware sequence transformers to correct errors in... |  | Established |
| 263 | livepeer/ai-runner<br>Inference runtime for running different batch and real-time AI pipelines. |  | Established |
| 264 | potamides/DeTikZify<br>Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ. |  | Established |
| 265 | UKPLab/gpl<br>Powerful unsupervised domain adaptation method for dense retrieval. Requires... |  | Established |
| 266 | sinanuozdemir/oreilly-hands-on-gpt-llm<br>Mastering the Art of Scalable and Efficient AI Model Deployment |  | Established |
| 267 | ManuelSLemos/RabbitLLM<br>Run 70B+ LLMs on a single 4GB GPU — no quantization required. |  | Established |
| 268 | TsinghuaC3I/MARTI<br>A Framework for LLM-based Multi-Agent Reinforced Training and Inference |  | Established |
| 269 | BiomedSciAI/biomed-multi-omic<br>Build foundation model for RNA or DNA data |  | Established |
| 270 | alibaba/InferSim<br>A Lightweight LLM Inference Performance Simulator |  | Established |
| 271 | IBM/TabFormer<br>Code & Data for "Tabular Transformers for Modeling Multivariate Time Series"... |  | Established |
| 272 | maziyarpanahi/openmed<br>open-source healthcare ai |  | Established |
| 273 | fluxions-ai/vui<br>100M parameter lightweight conversational text-to-speech model with breaths,... |  | Established |
| 274 | CASE-Lab-UMD/LLM-Drop<br>The official implementation of the paper "Uncovering the Redundancy in... |  | Established |
| 275 | nekomeowww/ollama-operator<br>🚢 Yet another operator for running large language models on Kubernetes with... |  | Established |
| 276 | PRIME-RL/TTRL<br>[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning |  | Established |
| 277 | Freed-Wu/translate-shell<br>Translate text by google, bing, youdaozhiyun, haici, stardict, openai, large... |  | Established |
| 278 | sb-ai-lab/RePlay<br>A Comprehensive Framework for Building End-to-End Recommendation Systems... |  | Established |
| 279 | xaviviro/python-toon<br>🐍 TOON for Python (Token-Oriented Object Notation) Encoder/Decoder - Reduce... |  | Established |
| 280 | facebookresearch/LayerSkip<br>Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative... |  | Established |
| 281 | fla-org/flame<br>🔥 A minimal training framework for scaling FLA models |  | Established |
| 282 | VectorInstitute/odyssey<br>A toolkit for developing foundation models using Electronic Health Record (EHR) data. |  | Established |
| 283 | microsoft/vidur<br>A large-scale simulation framework for LLM inference |  | Established |
| 284 | TIGER-AI-Lab/VLM2Vec<br>This repo contains the code for "VLM2Vec: Training Vision-Language Models... |  | Established |
| 285 | thu-nics/C2C<br>[ICLR'26] The official code implementation for "Cache-to-Cache: Direct... |  | Established |
| 286 | eth-sri/matharena<br>Evaluation of LLMs on latest math competitions |  | Established |
| 287 | young-geng/scalax<br>A simple library for scaling up JAX programs |  | Established |
| 288 | UdbhavPrasad072300/Transformer-Implementations<br>Library - Vanilla, ViT, DeiT, BERT, GPT |  | Established |
| 289 | FareedKhan-dev/train-llm-from-scratch<br>A straightforward method for training your LLM, from downloading data to... |  | Established |
| 290 | zhenye234/LLaSA_training<br>LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis |  | Established |
| 291 | TharinduDR/TransQuest<br>Transformer based translation quality estimation |  | Established |
| 292 | foundation-model-stack/fms-fsdp<br>🚀 Efficiently (pre)training foundation models with native PyTorch features,... |  | Established |
| 293 | cure-lab/LTSF-Linear<br>[AAAI-23 Oral] Official implementation of the paper "Are Transformers... |  | Established |
| 294 | jeya-maria-jose/Medical-Transformer<br>Official Pytorch Code for "Medical Transformer: Gated Axial-Attention for... |  | Established |
| 295 | kyegomez/RT-X<br>Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open... |  | Established |
| 296 | kmeng01/rome<br>Locating and editing factual associations in GPT (NeurIPS 2022) |  | Established |
| 297 | socialfoundations/folktexts<br>Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on... |  | Established |
| 298 | cdqa-suite/cdQA<br>⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System. |  | Established |
| 299 | NVlabs/OmniVinci<br>OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and... |  | Established |
| 300 | Denis2054/Transformers-for-NLP-2nd-Edition<br>Transformer models from BERT to GPT-4, environments from Hugging Face to... |  | Established |