All Transformer Models

7,795 models ranked by quality score · Page 12 of 78

Showing 1101–1200 of 7,795
# Model Score Tier
1101 huggingface/llm_training_handbook

An open collection of methodologies to help with successful training of...

41
Emerging
1102 KolosalAI/Kolosal

Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run...

41
Emerging
1103 FoundationVision/Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual...

41
Emerging
1104 KennethEnevoldsen/spacy-wrap

spaCy-wrap is a wrapper library for spaCy for including fine-tuned...

41
Emerging
1105 fangpin/llm-from-scratch

Build LLM from scratch

41
Emerging
1106 amazon-science/AdaRec

Adaptive Generative Recommendations with Large Language Models

41
Emerging
1107 KasperGroesLudvigsen/influenza_transformer

PyTorch implementation of Transformer model used in "Deep Transformer Models...

41
Emerging
1108 brontoguana/krasis

Krasis is a Hybrid LLM runtime which focuses on efficient running of larger...

41
Emerging
1109 erogol/BlaGPT

Experimental playground for benchmarking language model (LM) architectures,...

41
Emerging
1110 princeton-nlp/CharXiv

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in...

41
Emerging
1111 pratyushasharma/laser

The Truth Is In There: Improving Reasoning in Language Models with...

41
Emerging
1112 microsoft/batch-inference

Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT...

41
Emerging
1113 mlvlab/Flipped-VQA

Large Language Models are Temporal and Causal Reasoners for Video Question...

41
Emerging
1114 HHousen/DocSum

A tool to automatically summarize documents abstractively using the BART or...

41
Emerging
1115 skyloevil/llm-scratch-pytorch

llm-scratch-pytorch - The code is designed to be beginner-friendly, with a...

41
Emerging
1116 soda-inria/carte

Repository for CARTE: Context-Aware Representation of Table Entries

41
Emerging
1117 antoyang/FrozenBiLM

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional...

41
Emerging
1118 Architect2040/metalQwen3

💻 Implement Qwen3 transformer model on macOS using Metal GPU for...

41
Emerging
1119 Liuhong99/Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order...

41
Emerging
1120 Eclipsess/Awesome-Efficient-Reasoning-LLMs

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large...

41
Emerging
1121 liangyuwang/zo2

ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with...

41
Emerging
1122 dashstander/block-recurrent-transformer

PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag...

41
Emerging
1123 TencentARC/LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

41
Emerging
1124 Breeze648/Transformer-from-Scratch

This repository is intended for AI paper reproduction / implementing the Transformer from scratch. ...

41
Emerging
1125 GT4SD/zero-shot-bert-adapters

Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.

41
Emerging
1126 kohjingyu/gill

🐟 Code and models for the NeurIPS 2023 paper "Generating Images with...

41
Emerging
1127 flixpar/med-ts-llm

MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis

41
Emerging
1128 YeonwooSung/ai_book

AI book for everyone

41
Emerging
1129 stay-leave/enhance_llm

Notes from hands-on practice with large models

41
Emerging
1130 CTCycle/ADSMOD-Adsorption-Modeling

Streamline adsorption modeling by automatically fitting theoretical...

41
Emerging
1131 XunhaoLai/native-sparse-attention-triton

Efficient Triton implementation of Native Sparse Attention.

41
Emerging
1132 kyegomez/HSSS

Implementation of a Hierarchical Mamba as described in the paper:...

41
Emerging
1133 ChaitanyaK77/Building-a-Small-Language-Model-SLM-

This Repository provides a Jupyter Notebook for building a small language...

41
Emerging
1134 VectorInstitute/vectorlm

LLM finetuning in resource-constrained environments.

41
Emerging
1135 bigscience-workshop/xmtf

Crosslingual Generalization through Multitask Finetuning

41
Emerging
1136 OpenGVLab/VisionLLM

VisionLLM Series

41
Emerging
1137 NVlabs/DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed...

41
Emerging
1138 willxxy/ECG-Bench

A Unified Framework for Benchmarking Generative Electrocardiogram-Language...

41
Emerging
1139 mohammadtavakoli78/BEAM

[ICLR 2026] Beyond a Million Tokens: Benchmarking and Enhancing Long-Term...

41
Emerging
1140 Yeonghun1675/L2M3

Large Language Models Material Miner

41
Emerging
1141 kohjingyu/fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to...

41
Emerging
1142 ashioyajotham/fingpt_trader

A quant trading system platform based on FinGPT, demonstrating new...

41
Emerging
1143 kossisoroyce/timber

Ollama for classical ML models. AOT compiler that turns XGBoost, LightGBM,...

41
Emerging
1144 aihao2000/DPN-LLaVA

Arxiv 25: Dynamic Pyramid Network for Efficient Multimodal Large Language Model

41
Emerging
1145 emapco/rk-transformers

Export and Run Hugging Face Transformers Models on Rockchip NPUs

41
Emerging
1146 calcuis/gguf-core

A simple way to interact with llama using gguf

41
Emerging
1147 VITA-MLLM/Freeze-Omni

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with...

41
Emerging
1148 airaria/Visual-Chinese-LLaMA-Alpaca

Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)

41
Emerging
1149 MIV-XJTU/JanusVLN

[ICLR2026] Official implementation for "JanusVLN: Decoupling Semantics and...

41
Emerging
1150 pjlab-sys4nlp/llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual...

41
Emerging
1151 shikiw/OPERA

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large...

41
Emerging
1152 WindyLab/ConsensusLLM-code

Source code of our paper "Multi-Agent Consensus Seeking via Large Language Models".

41
Emerging
1153 cloudguruab/modsysML

Reinforcement learning from human feedback (RLHF) framework for AI models. Evaluate and...

41
Emerging
1154 somosnlp/nlp-de-cero-a-cien

Hands-on course: NLP from zero to a hundred 🤗

41
Emerging
1155 Guitaricet/relora

Official code for ReLoRA from the paper Stack More Layers Differently:...

41
Emerging
1156 The-FinAI/CALM

An LLM training and evaluation benchmark for credit scoring

41
Emerging
1157 Beomi/Gemma-EasyLM

Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series)

41
Emerging
1158 synacktraa/tool-parse

Making LLM Tool-Calling Simpler.

41
Emerging
1159 jerryshell/resumind

AI-powered resume analysis system that provides tailored feedback and ATS scoring for each job position

41
Emerging
1160 wgcban/HyperTransformer

[CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion...

41
Emerging
1161 princeton-nlp/LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

41
Emerging
1162 Multi-Agent-LLMs/mallm

Framework: Multi-Agent LLMs For Conversational Task-Solving (MALLM)

41
Emerging
1163 octanove/shiba

PyTorch implementation and pre-trained Japanese model for CANINE, the...

41
Emerging
1164 godatadriven/rhyme-with-ai

Rhyme with AI

41
Emerging
1165 datawhalechina/llm-cookbook

Introductory LLM tutorial for developers; Chinese version of Andrew Ng's large language model course series

41
Emerging
1166 liuqidong07/MOELoRA-peft

[SIGIR'24] The official implementation code of MOELoRA.

41
Emerging
1167 jesus3476/Fire-Detection-Siglip2

Fire-Detection-Siglip2 is an image classification vision-language encoder...

41
Emerging
1168 dataflowr/llm_efficiency

KV Cache & LoRA for minGPT

41
Emerging
1169 ronniross/attention-heatmap-visualizer

A set of scripts to generate full attention-head heatmaps for transformer-based LLMs

41
Emerging
1170 robinniesert/kaggle-google-quest

Google QUEST Q&A Labeling Kaggle Competition 6th Place Solution

41
Emerging
1171 SqueezeAILab/SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

41
Emerging
1172 HamedBabaei/LLMs4OL

LLMs4OL: Large Language Models for Ontology Learning

41
Emerging
1173 Haiyang-W/TokenFormer

[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking...

41
Emerging
1174 fahadshamshad/awesome-transformers-in-medical-imaging

A collection of resources on applications of Transformers in Medical Imaging.

41
Emerging
1175 radlab-dev-group/llm-router

LLM Router is a service that can be deployed on‑premises or in the cloud. It...

41
Emerging
1176 jla524/fromthetensor

From the Tensor to Stable Diffusion, a rough outline for a 10 week course.

41
Emerging
1177 php-llm/llm-chain

PHP library for building LLM-based and AI-based features and applications.

41
Emerging
1178 jaypatel15406/Ollama-Adaptive-Image-Code-Gen

Ollama Adaptive Image Code Gen is an asynchronous Python application that...

41
Emerging
1179 KittenCN/predict_Lottery_ticket_pytorch

Lottery prediction based on Transformer / LSTM models in PyTorch

41
Emerging
1180 punica-ai/punica

Serving multiple LoRA-finetuned LLMs as one

41
Emerging
1181 exasol/transformers-extension

An Exasol extension for using state-of-the-art pretrained machine learning...

41
Emerging
1182 sandy1990418/Finetune-Qwen2.5-VL

Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision...

41
Emerging
1183 Intelligent-CAT-Lab/PLTranslationEmpirical

Artifact repository for the paper "Lost in Translation: A Study of Bugs...

41
Emerging
1184 NVIDIA/logits-processor-zoo

A collection of LogitsProcessors to customize and enhance LLM behavior for...

41
Emerging
1185 K-H-Ismail/torchortho

[ICLR 2026] Polynomial, trigonometric, and tropical activations

41
Emerging
1186 fardjad/node-llmatic

Use self-hosted LLMs with an OpenAI compatible API

41
Emerging
1187 ymcui/PERT

PERT: Pre-training BERT with Permuted Language Model

41
Emerging
1188 Ethan-yt/guwenbert

GuwenBERT: A Pre-trained Language Model for Classical...

41
Emerging
1189 TIGER-AI-Lab/QuickVideo

Quick Long Video Understanding [TMLR2025]

41
Emerging
1190 piresramon/gpt-4-enem

Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian...

41
Emerging
1191 haesleinhuepf/human-eval-bia

Benchmarking Large Language Models for Bio-Image Analysis Code Generation

41
Emerging
1192 NVIDIA/Star-Attention

Efficient LLM Inference over Long Sequences

40
Emerging
1193 jianzhnie/awesome-instruction-datasets

A collection of awesome-prompt-datasets, awesome-instruction-dataset, to...

40
Emerging
1194 pogzyb/tourist

Open-source, LLM-ready SERP and web scraping service

40
Emerging
1195 monologg/korean-hate-speech-koelectra

Bias, Hate classification with KoELECTRA 👿

40
Emerging
1196 Hsu1023/DuQuant

[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation...

40
Emerging
1197 monologg/KoCharELECTRA

Character-level Korean ELECTRA Model (syllable-level Korean ELECTRA)

40
Emerging
1198 teticio/llama-squad

Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a...

40
Emerging
1199 elapse-annals/laravel-plus

An adaptation and extension of Laravel, more convenient for practical...

40
Emerging
1200 HHousen/speaker-change-detection

Speaker change detection using SincNet and an LSTM/Transformer

40
Emerging