All Transformer Models
7,795 models ranked by quality score · Page 12 of 78
| # | Model | Score | Tier |
|---|---|---|---|
| 1101 | huggingface/llm_training_handbook: An open collection of methodologies to help with successful training of... | | Emerging |
| 1102 | KolosalAI/Kolosal: Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run... | | Emerging |
| 1103 | FoundationVision/Groma: [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual... | | Emerging |
| 1104 | KennethEnevoldsen/spacy-wrap: spaCy-wrap is a wrapper library for spaCy for including fine-tuned... | | Emerging |
| 1105 | fangpin/llm-from-scratch: Build LLM from scratch | | Emerging |
| 1106 | amazon-science/AdaRec: Adaptive Generative Recommendations with Large Language Models | | Emerging |
| 1107 | KasperGroesLudvigsen/influenza_transformer: PyTorch implementation of Transformer model used in "Deep Transformer Models... | | Emerging |
| 1108 | brontoguana/krasis: Krasis is a Hybrid LLM runtime which focuses on efficient running of larger... | | Emerging |
| 1109 | erogol/BlaGPT: Experimental playground for benchmarking language model (LM) architectures,... | | Emerging |
| 1110 | princeton-nlp/CharXiv: [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in... | | Emerging |
| 1111 | pratyushasharma/laser: The Truth Is In There: Improving Reasoning in Language Models with... | | Emerging |
| 1112 | microsoft/batch-inference: Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT... | | Emerging |
| 1113 | mlvlab/Flipped-VQA: Large Language Models are Temporal and Causal Reasoners for Video Question... | | Emerging |
| 1114 | HHousen/DocSum: A tool to automatically summarize documents abstractively using the BART or... | | Emerging |
| 1115 | skyloevil/llm-scratch-pytorch: lm-scratch-pytorch - The code is designed to be beginner-friendly, with a... | | Emerging |
| 1116 | soda-inria/carte: Repository for CARTE: Context-Aware Representation of Table Entries | | Emerging |
| 1117 | antoyang/FrozenBiLM: [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional... | | Emerging |
| 1118 | Architect2040/metalQwen3: 💻 Implement Qwen3 transformer model on macOS using Metal GPU for... | | Emerging |
| 1119 | Liuhong99/Sophia: The official implementation of “Sophia: A Scalable Stochastic Second-order... | | Emerging |
| 1120 | Eclipsess/Awesome-Efficient-Reasoning-LLMs: [TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large... | | Emerging |
| 1121 | liangyuwang/zo2: ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with... | | Emerging |
| 1122 | dashstander/block-recurrent-transformer: Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag... | | Emerging |
| 1123 | TencentARC/LLaMA-Pro: [ACL 2024] Progressive LLaMA with Block Expansion. | | Emerging |
| 1124 | Breeze648/Transformer-from-Scratch: This repository is positioned as an AI paper reproduction / from-scratch implementation of the Transformer. ... | | Emerging |
| 1125 | GT4SD/zero-shot-bert-adapters: Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. | | Emerging |
| 1126 | kohjingyu/gill: 🐟 Code and models for the NeurIPS 2023 paper "Generating Images with... | | Emerging |
| 1127 | flixpar/med-ts-llm: MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis | | Emerging |
| 1128 | YeonwooSung/ai_book: AI book for everyone | | Emerging |
| 1129 | stay-leave/enhance_llm: Practice notes on large models | | Emerging |
| 1130 | CTCycle/ADSMOD-Adsorption-Modeling: Streamline adsorption modeling by automatically fitting theoretical... | | Emerging |
| 1131 | XunhaoLai/native-sparse-attention-triton: Efficient triton implementation of Native Sparse Attention. | | Emerging |
| 1132 | kyegomez/HSSS: Implementation of a Hierarchical Mamba as described in the paper:... | | Emerging |
| 1133 | ChaitanyaK77/Building-a-Small-Language-Model-SLM-: This Repository provides a Jupyter Notebook for building a small language... | | Emerging |
| 1134 | VectorInstitute/vectorlm: LLM finetuning in resource-constrained environments. | | Emerging |
| 1135 | bigscience-workshop/xmtf: Crosslingual Generalization through Multitask Finetuning | | Emerging |
| 1136 | OpenGVLab/VisionLLM: VisionLLM Series | | Emerging |
| 1137 | NVlabs/DoRA: [ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed... | | Emerging |
| 1138 | willxxy/ECG-Bench: A Unified Framework for Benchmarking Generative Electrocardiogram-Language... | | Emerging |
| 1139 | mohammadtavakoli78/BEAM: [ICLR 2026] Beyond a Million Tokens: Benchmarking and Enhancing Long-Term... | | Emerging |
| 1140 | Yeonghun1675/L2M3: Large Language Models Material Miner | | Emerging |
| 1141 | kohjingyu/fromage: 🧀 Code and models for the ICML 2023 paper "Grounding Language Models to... | | Emerging |
| 1142 | ashioyajotham/fingpt_trader: A quant trading system platform based on FinGPT, demonstrating new... | | Emerging |
| 1143 | kossisoroyce/timber: Ollama for classical ML models. AOT compiler that turns XGBoost, LightGBM,... | | Emerging |
| 1144 | aihao2000/DPN-LLaVA: Arxiv 25: Dynamic Pyramid Network for Efficient Multimodal Large Language Model | | Emerging |
| 1145 | emapco/rk-transformers: Export and Run Hugging Face Transformers Models on Rockchip NPUs | | Emerging |
| 1146 | calcuis/gguf-core: a simple way to interact llama with gguf | | Emerging |
| 1147 | VITA-MLLM/Freeze-Omni: ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with... | | Emerging |
| 1148 | airaria/Visual-Chinese-LLaMA-Alpaca: Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA) | | Emerging |
| 1149 | MIV-XJTU/JanusVLN: [ICLR2026] Official implementation for "JanusVLN: Decoupling Semantics and... | | Emerging |
| 1150 | pjlab-sys4nlp/llama-moe: ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual... | | Emerging |
| 1151 | shikiw/OPERA: [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large... | | Emerging |
| 1152 | WindyLab/ConsensusLLM-code: Source code of our paper "Multi-Agent Consensus Seeking via Large Language Models". | | Emerging |
| 1153 | cloudguruab/modsysML: Human reinforcement learning (RLHF) framework for AI models. Evaluate and... | | Emerging |
| 1154 | somosnlp/nlp-de-cero-a-cien: Hands-on course: NLP from zero to one hundred 🤗 | | Emerging |
| 1155 | Guitaricet/relora: Official code for ReLoRA from the paper Stack More Layers Differently:... | | Emerging |
| 1156 | The-FinAI/CALM: A LLM training and evaluation benchmark for credit scoring | | Emerging |
| 1157 | Beomi/Gemma-EasyLM: Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series) | | Emerging |
| 1158 | synacktraa/tool-parse: Making LLM Tool-Calling Simpler. | | Emerging |
| 1159 | jerryshell/resumind: AI-powered resume analysis system that provides feedback and ATS scores tailored to each job position | | Emerging |
| 1160 | wgcban/HyperTransformer: [CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion... | | Emerging |
| 1161 | princeton-nlp/LESS: [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning | | Emerging |
| 1162 | Multi-Agent-LLMs/mallm: Framework: Multi-Agent LLMs For Conversational Task-Solving (MALLM) | | Emerging |
| 1163 | octanove/shiba: Pytorch implementation and pre-trained Japanese model for CANINE, the... | | Emerging |
| 1164 | godatadriven/rhyme-with-ai: Rhyme with AI | | Emerging |
| 1165 | datawhalechina/llm-cookbook: Introductory LLM tutorial for developers; Chinese version of Andrew Ng's large language model course series | | Emerging |
| 1166 | liuqidong07/MOELoRA-peft: [SIGIR'24] The official implementation code of MOELoRA. | | Emerging |
| 1167 | jesus3476/Fire-Detection-Siglip2: Fire-Detection-Siglip2 is an image classification vision-language encoder... | | Emerging |
| 1168 | dataflowr/llm_efficiency: KV Cache & LoRA for minGPT | | Emerging |
| 1169 | ronniross/attention-heatmap-visualizer: A set of scripts to generate full attention-head heatmaps for transformer-based LLMs | | Emerging |
| 1170 | robinniesert/kaggle-google-quest: Google QUEST Q&A Labeling Kaggle Competition 6th Place Solution | | Emerging |
| 1171 | SqueezeAILab/SqueezeLLM: [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization | | Emerging |
| 1172 | HamedBabaei/LLMs4OL: LLMs4OL: Large Language Models for Ontology Learning | | Emerging |
| 1173 | Haiyang-W/TokenFormer: [ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking... | | Emerging |
| 1174 | fahadshamshad/awesome-transformers-in-medical-imaging: A collection of resources on applications of Transformers in Medical Imaging. | | Emerging |
| 1175 | radlab-dev-group/llm-router: LLM Router is a service that can be deployed on‑premises or in the cloud. It... | | Emerging |
| 1176 | jla524/fromthetensor: From the Tensor to Stable Diffusion, a rough outline for a 10 week course. | | Emerging |
| 1177 | php-llm/llm-chain: PHP library for building LLM-based and AI-based features and applications. | | Emerging |
| 1178 | jaypatel15406/Ollama-Adaptive-Image-Code-Gen: Ollama Adaptive Image Code Gen is an asynchronous Python application that... | | Emerging |
| 1179 | KittenCN/predict_Lottery_ticket_pytorch: Lottery prediction based on Transformer / LSTM models in PyTorch | | Emerging |
| 1180 | punica-ai/punica: Serving multiple LoRA finetuned LLM as one | | Emerging |
| 1181 | exasol/transformers-extension: An Exasol extension for using state-of-the-art pretrained machine learning... | | Emerging |
| 1182 | sandy1990418/Finetune-Qwen2.5-VL: Fine-tuning Qwen2.5-VL for vision-language tasks \| Optimized for Vision... | | Emerging |
| 1183 | Intelligent-CAT-Lab/PLTranslationEmpirical: Artifact repository for the paper "Lost in Translation: A Study of Bugs... | | Emerging |
| 1184 | NVIDIA/logits-processor-zoo: A collection of LogitsProcessors to customize and enhance LLM behavior for... | | Emerging |
| 1185 | K-H-Ismail/torchortho: [ICLR 2026] Polynomial, trigonometric, and tropical activations | | Emerging |
| 1186 | fardjad/node-llmatic: Use self-hosted LLMs with an OpenAI compatible API | | Emerging |
| 1187 | ymcui/PERT: PERT: Pre-training BERT with Permuted Language Model | | Emerging |
| 1188 | Ethan-yt/guwenbert: GuwenBERT (古文BERT): A Pre-trained Language Model for Classical... | | Emerging |
| 1189 | TIGER-AI-Lab/QuickVideo: Quick Long Video Understanding [TMLR2025] | | Emerging |
| 1190 | piresramon/gpt-4-enem: Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian... | | Emerging |
| 1191 | haesleinhuepf/human-eval-bia: Benchmarking Large Language Models for Bio-Image Analysis Code Generation | | Emerging |
| 1192 | NVIDIA/Star-Attention: Efficient LLM Inference over Long Sequences | | Emerging |
| 1193 | jianzhnie/awesome-instruction-datasets: A collection of awesome-prompt-datasets, awesome-instruction-dataset, to... | | Emerging |
| 1194 | pogzyb/tourist: Open-source, LLM-ready SERP and web scraping service | | Emerging |
| 1195 | monologg/korean-hate-speech-koelectra: Bias, Hate classification with KoELECTRA 👿 | | Emerging |
| 1196 | Hsu1023/DuQuant: [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation... | | Emerging |
| 1197 | monologg/KoCharELECTRA: Character-level Korean ELECTRA Model (syllable-level Korean ELECTRA) | | Emerging |
| 1198 | teticio/llama-squad: Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a... | | Emerging |
| 1199 | elapse-annals/laravel-plus: Based on Laravel transformation and expansion, more convenient for practical... | | Emerging |
| 1200 | HHousen/speaker-change-detection: Speaker change detection using SincNet and an LSTM/Transformer | | Emerging |