All Transformer Models

7,795 models ranked by quality score · Page 3 of 78

Showing 201–300 of 7,795
# Model Score Tier
201 google-deepmind/long-form-factuality

Benchmarking long-form factuality in large language models. Original code...

55
Established
202 microsoft/torchscale

Foundation Architecture for (M)LLMs

55
Established
203 haizelabs/verdict

Inference-time scaling for LLMs-as-a-judge.

55
Established
204 autonomousvision/transfuser

[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for...

55
Established
205 ZHZisZZ/dllm

dLLM: Simple Diffusion Language Modeling

55
Established
206 CrazyBoyM/llama3-Chinese-chat

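Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation for training, inference, evaluation, and deployment.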

55
Established
207 DadaNanjesha/AI-Text-Humanizer-App

Transform AI-generated text into formal, human-like, and academic writing...

55
Established
208 HowieHwong/TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

55
Established
209 guinmoon/LLMFarm

Run Llama and other large language models offline on iOS and macOS using the GGML library.

55
Established
210 BeastByteAI/scikit-llm

Seamlessly integrate LLMs into scikit-learn.

55
Established
211 KittenCN/stock_prediction

A general stock prediction model based on neural networks

55
Established
212 ai-decentralized/BloomBee

Decentralized LLMs fine-tuning and inference with offloading

55
Established
213 ggml-org/llama.vim

Vim plugin for LLM-assisted code/text completion

55
Established
214 b4rtaz/distributed-llama

Distributed LLM inference. Connect home devices into a powerful cluster to...

55
Established
215 Kohulan/DECIMER-Image_Transformer

DECIMER Image Transformer is a deep-learning-based tool designed for...

55
Established
216 roboflow/maestro

streamline the fine-tuning process for multimodal models: PaliGemma 2,...

55
Established
217 edwko/OuteTTS

Interface for OuteTTS models.

55
Established
218 bigscience-workshop/petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x...

54
Established
219 vistec-AI/thai2transformers

Pretraining transformer based Thai language models

54
Established
220 AGI-Arena/MARS

The official implementation of MARS: Unleashing the Power of Variance...

54
Established
221 albertan017/LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models

54
Established
222 dropbox/hqq

Official implementation of Half-Quadratic Quantization (HQQ)

54
Established
223 OSUPCVLab/SegFormer3D

Official Implementation of SegFormer3D: an Efficient Transformer for 3D...

54
Established
224 ServerlessLLM/ServerlessLLM

Serverless LLM Serving for Everyone.

54
Established
225 stanfordnlp/axbench

Stanford NLP Python library for benchmarking the utility of LLM...

54
Established
226 toverainc/willow-inference-server

Open source, local, and self-hosted highly optimized language inference...

54
Established
227 thu-ml/SpargeAttn

[ICML2025] SpargeAttention: A training-free sparse attention that...

54
Established
228 Tiiny-AI/PowerInfer

High-speed Large Language Model Serving for Local Deployment

54
Established
229 tairov/llama2.mojo

Inference Llama 2 in one file of pure 🔥

54
Established
230 Orion-zhen/abliteration

Make abliterated models with transformers, easy and fast

54
Established
231 bytedance/Sa2VA

Official Repo For Pixel-LLM Codebase

54
Established
232 ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing

LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best...

54
Established
233 ggml-org/llama.vscode

VS Code extension for LLM-assisted code/text completion

54
Established
234 om-ai-lab/VLM-R1

Solve Visual Understanding with Reinforced VLMs

54
Established
235 guinmoon/llmfarm_core.swift

Swift library to work with llama and other large language models.

54
Established
236 bytedance/SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

54
Established
237 YerbaPage/LongCodeZip

LongCodeZip: Compress Long Context for Code Language Models [ASE2025]

54
Established
238 Omid-Nejati/MedViTV2

MedViTV2: Medical Image Classification with KAN-Integrated Transformers and...

54
Established
239 ServiceNow/TACTiS

TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time...

54
Established
240 kha-white/manga-ocr

Optical character recognition for Japanese text, with the main focus being...

54
Established
241 sovit-123/vision_transformers

Vision Transformers for image classification, image segmentation, and object...

53
Established
242 Nicolepcx/Transformers-in-Action

This is the corresponding code for the book Transformers in Action

53
Established
243 explosion/spacy-llm

🦙 Integrating LLMs into structured NLP pipelines

53
Established
244 jaehyunnn/ViTPose_pytorch

An unofficial implementation of ViTPose [Y. Xu et al., 2022]

53
Established
245 lonePatient/TorchBlocks

A PyTorch-based toolkit for natural language processing

53
Established
246 AXERA-TECH/ax-llm

Explore LLM model deployment based on AXera's AI chips

53
Established
247 GeeeekExplorer/nano-vllm

Nano vLLM

53
Established
248 aidatatools/ollama-benchmark

LLM Benchmark for Throughput via Ollama (Local LLMs)

53
Established
249 microsoft/mup

maximal update parametrization (µP)

53
Established
250 kyegomez/zeta

Build high-performance AI models with modular building blocks

53
Established
251 JHubi1/ollama-app

A modern and easy-to-use client for Ollama

53
Established
252 lucidrains/locoformer

LocoFormer - Generalist Locomotion via Long-Context Adaptation

53
Established
253 guanwei49/LogLLM

LogLLM: Log-based Anomaly Detection Using Large Language Models (system log...

53
Established
254 PKU-Alignment/align-anything

Align Anything: Training All-modality Model with Feedback

53
Established
255 LarHope/ollama-benchmark

Ollama based Benchmark with detail I/O token per second. Python with...

53
Established
256 BradyFU/Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

53
Established
257 CognitiveAISystems/MAPF-GPT

[AAAI-2025] This repository contains MAPF-GPT, a deep learning-based model...

53
Established
258 jax-ml/jax-llm-examples

Minimal yet performant LLM examples in pure JAX

53
Established
259 Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition

Transformers 3rd Edition

53
Established
260 SKTBrain/KoBERT

Korean BERT pre-trained cased (KoBERT)

53
Established
261 ashishpatel26/Treasure-of-Transformers

💁 Awesome Treasure of Transformers Models for Natural Language processing...

53
Established
262 google/deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in...

53
Established
263 livepeer/ai-runner

Inference runtime for running different batch and real-time AI pipelines.

53
Established
264 potamides/DeTikZify

Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ.

53
Established
265 UKPLab/gpl

Powerful unsupervised domain adaptation method for dense retrieval. Requires...

52
Established
266 sinanuozdemir/oreilly-hands-on-gpt-llm

Mastering the Art of Scalable and Efficient AI Model Deployment

52
Established
267 ManuelSLemos/RabbitLLM

Run 70B+ LLMs on a single 4GB GPU — no quantization required.

52
Established
268 TsinghuaC3I/MARTI

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

52
Established
269 BiomedSciAI/biomed-multi-omic

Build foundation model for RNA or DNA data

52
Established
270 alibaba/InferSim

A Lightweight LLM Inference Performance Simulator

52
Established
271 IBM/TabFormer

Code & Data for "Tabular Transformers for Modeling Multivariate Time Series"...

52
Established
272 maziyarpanahi/openmed

open-source healthcare ai

52
Established
273 fluxions-ai/vui

100M parameter lightweight conversational text-to-speech model with breaths,...

52
Established
274 CASE-Lab-UMD/LLM-Drop

The official implementation of the paper "Uncovering the Redundancy in...

52
Established
275 nekomeowww/ollama-operator

🚢 Yet another operator for running large language models on Kubernetes with...

52
Established
276 PRIME-RL/TTRL

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

52
Established
277 Freed-Wu/translate-shell

Translate text by google, bing, youdaozhiyun, haici, stardict, openai, large...

52
Established
278 sb-ai-lab/RePlay

A Comprehensive Framework for Building End-to-End Recommendation Systems...

52
Established
279 xaviviro/python-toon

🐍 TOON for Python (Token-Oriented Object Notation) Encoder/Decoder - Reduce...

52
Established
280 facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative...

52
Established
281 fla-org/flame

🔥 A minimal training framework for scaling FLA models

52
Established
282 VectorInstitute/odyssey

A toolkit for developing foundation models using Electronic Health Record (EHR) data.

52
Established
283 microsoft/vidur

A large-scale simulation framework for LLM inference

52
Established
284 TIGER-AI-Lab/VLM2Vec

This repo contains the code for "VLM2Vec: Training Vision-Language Models...

52
Established
285 thu-nics/C2C

[ICLR'26] The official code implementation for "Cache-to-Cache: Direct...

52
Established
286 eth-sri/matharena

Evaluation of LLMs on latest math competitions

52
Established
287 young-geng/scalax

A simple library for scaling up JAX programs

52
Established
288 UdbhavPrasad072300/Transformer-Implementations

Library - Vanilla, ViT, DeiT, BERT, GPT

52
Established
289 FareedKhan-dev/train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to...

52
Established
290 zhenye234/LLaSA_training

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

52
Established
291 TharinduDR/TransQuest

Transformer based translation quality estimation

52
Established
292 foundation-model-stack/fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features,...

52
Established
293 cure-lab/LTSF-Linear

[AAAI-23 Oral] Official implementation of the paper "Are Transformers...

51
Established
294 jeya-maria-jose/Medical-Transformer

Official Pytorch Code for "Medical Transformer: Gated Axial-Attention for...

51
Established
295 kyegomez/RT-X

Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open...

51
Established
296 kmeng01/rome

Locating and editing factual associations in GPT (NeurIPS 2022)

51
Established
297 socialfoundations/folktexts

Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on...

51
Established
298 cdqa-suite/cdQA

⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.

51
Established
299 NVlabs/OmniVinci

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and...

51
Established
300 Denis2054/Transformers-for-NLP-2nd-Edition

Transformer models from BERT to GPT-4, environments from Hugging Face to...

51
Established