All Transformer Models
7,795 models ranked by quality score · Page 14 of 78
| # | Model | Score | Tier |
|---|---|---|---|
| 1301 | laugustyniak/LLM-demo-2-production<br>The set of tools and examples for creating LLM-based solutions from demo to... | | Emerging |
| 1302 | alexrozanski/LlamaChat<br>Chat with your favourite LLaMA models in a native macOS app | | Emerging |
| 1303 | ariannamethod/ariannamethod.ai<br>Arianna Method Programming Language | | Emerging |
| 1304 | souzatharsis/tamingLLMs<br>Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software | | Emerging |
| 1305 | huggingface/datablations<br>Scaling Data-Constrained Language Models | | Emerging |
| 1306 | QwenLM/Qwen2.5-Math<br>A series of math-specific large language models of our Qwen2 series. | | Emerging |
| 1307 | ariannamethod/arianna.c<br>Arianna is a Digital Persona. Embodied cognition as is. | | Emerging |
| 1308 | snktshrma/ngps_flight<br>Global vision positioning system for UAVs in outdoor GNSS-denied environments | | Emerging |
| 1309 | cooelf/AwesomeMRC<br>IJCAI 2021 Tutorial & code for Retrospective Reader for Machine Reading... | | Emerging |
| 1310 | HiThink-Research/MME-Finance<br>[MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning | | Emerging |
| 1311 | urmzd/md-classifier<br>A deep learning system combining transformers and CNNs to classify diseases... | | Emerging |
| 1312 | hanouticelina/deformable-DETR<br>Implementation of the paper: Deformable DETR: Deformable Transformers for... | | Emerging |
| 1313 | abhilash1910/LongPegasus<br>LongPegasus package is used for inducing longformer self attention over base... | | Emerging |
| 1314 | loong64/llama.cpp<br>LLM inference in C/C++ | | Emerging |
| 1315 | jhcho99/GSRTR<br>[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition... | | Emerging |
| 1316 | FoundationVision/UniTok<br>[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding | | Emerging |
| 1317 | zeyadusf/LLMs-from-Scratch<br>Build a Large Language Model (From Scratch) book and Finetuned Models | | Emerging |
| 1318 | rishikksh20/CrossViT-pytorch<br>Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer... | | Emerging |
| 1319 | OpenBMB/InfiniteBench<br>Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K... | | Emerging |
| 1320 | innightwolfsleep/old_llm_telegram_bot<br>Connect llama-cpp, transformers or text-generation-webui to telegram bot api. | | Emerging |
| 1321 | TIGER-AI-Lab/Pixel-Reasoner<br>Pixel-Level Reasoning Model trained with RL [NeurIPS25] | | Emerging |
| 1322 | cgbur/llama2.zig<br>Inference Llama 2 in one file of pure Zig | | Emerging |
| 1323 | amithkoujalgi/ollama-pdf-bot<br>A bot that accepts PDF docs and lets you ask questions on it. | | Emerging |
| 1324 | Fsoft-AIC/Grasp-Anything<br>Dataset and Code for ICRA 2024 paper "Grasp-Anything: Large-scale Grasp... | | Emerging |
| 1325 | r1cc4rd0m4zz4/traNsLatorLaB<br>translatorlab: a machine translation tool that uses artificial intelligence... | | Emerging |
| 1326 | OpenLemur/Lemur<br>[ICLR 2024] Lemur: Open Foundation Models for Language Agents | | Emerging |
| 1327 | thevasudevgupta/bigbird<br>Google's BigBird (Jax/Flax & PyTorch) @ 🤗Transformers | | Emerging |
| 1328 | boheumd/MA-LMM<br>(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term... | | Emerging |
| 1329 | young-geng/m3ae_public<br>Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation | | Emerging |
| 1330 | YJiangcm/Lion<br>[EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models | | Emerging |
| 1331 | wellcometrust/WellcomeML<br>Retired repository for Machine Learning utils at the Wellcome Trust (now deprecated). | | Emerging |
| 1332 | slp-rl/slamkit<br>SlamKit is an open source tool kit for efficient training of SpeechLMs. It... | | Emerging |
| 1333 | pabroux/llm-engineers-handbook<br>LLM Engineer's Handbook by Paul Iusztin and Maxime Labonne with pixi. | | Emerging |
| 1334 | kyegomez/Fusion3D<br>An extremely experimental model that intakes images and generates 3D scenes... | | Emerging |
| 1335 | midway2333/Tower2<br>Multimodal language model architecture | | Emerging |
| 1336 | sanjaradylov/smiles-gpt<br>Generative Pre-Training from Molecules | | Emerging |
| 1337 | fran-martinez/bio_ner_bert<br>BERT finetuned on NER downstream tasks | | Emerging |
| 1338 | ChenRocks/UNITER<br>Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt... | | Emerging |
| 1339 | IDSIA/automated-cl<br>Official repository for the paper "Automating Continual Learning" | | Emerging |
| 1340 | kehanlu/DeSTA2<br>Code and model for ICASSP 2025 Paper "Developing Instruction-Following... | | Emerging |
| 1341 | AlekseyKorshuk/huggingartists<br>Lyrics generation with GPT2-based Transformer | | Emerging |
| 1342 | its-kumar-yash/deep-study-ai-agent<br>DeepStudy AI automates research, refines queries dynamically, and generates... | | Emerging |
| 1343 | Shannon-Labs/shannon-control-unit<br>Shannon Control Unit: Adaptive regularization via control theory for LLM training | | Emerging |
| 1344 | NimbleEdge/sparse_transformers<br>Sparse Inferencing for transformer based LLMs | | Emerging |
| 1345 | cyk1337/Transformer-in-PyTorch<br>Transformer/Transformer-XL/R-Transformer examples and explanations | | Emerging |
| 1346 | clovaai/length-adaptive-transformer<br>Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) | | Emerging |
| 1347 | kyegomez/SSM-As-VLM-Bridge<br>An exploration into leveraging SSM's as Bridge/Adapter Layers for VLM | | Emerging |
| 1348 | ylsung/VL_adapter<br>PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for... | | Emerging |
| 1349 | DeepChainBio/deepchain-apps<br>A library for deploying App on deepchain.bio | | Emerging |
| 1350 | TIGER-AI-Lab/Vamba<br>Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid... | | Emerging |
| 1351 | praeclarum/transformers-js<br>Browser-compatible JS library for running language models | | Emerging |
| 1352 | misonsky/HiFT<br>memory-efficient fine-tuning; support 24G GPU memory fine-tuning 7B | | Emerging |
| 1353 | xenova/sponsorblock-ml<br>Automatically detect in-video YouTube sponsorships, self/unpaid promotions,... | | Emerging |
| 1354 | soumyadip1995/BabyGPT<br>Something in the middle of Karpathy's mingpt model and video lectures, ... | | Emerging |
| 1355 | shrut2702/upasak<br>UI-based Fine-Tuning for Large Language Models (LLMs) | | Emerging |
| 1356 | Sea-Snell/JAXSeq<br>Train very large language models in Jax. | | Emerging |
| 1357 | WayneJin0918/SRUM<br>Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified... | | Emerging |
| 1358 | James-QiuHaoran/LLM-serving-with-proxy-models<br>Efficient Interactive LLM Serving with Proxy Model-based Sequence Length... | | Emerging |
| 1359 | zinengtang/TVLT<br>PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral) | | Emerging |
| 1360 | naokishibuya/simple_transformer<br>A Transformer Implementation that is easy to understand and customizable. | | Emerging |
| 1361 | lqzxt/Time-R1<br>Time-R1 is a two-stage reinforcement fine-tuning framework that trains large... | | Emerging |
| 1362 | IDSIA/lmtool-fwp<br>PyTorch Language Modeling Toolkit for Fast Weight Programmers | | Emerging |
| 1363 | NVlabs/Long-RL<br>Long-RL: Scaling RL to Long Sequences (NeurIPS 2025) | | Emerging |
| 1364 | UCSC-VLAA/m1<br>[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical... | | Emerging |
| 1365 | mybigday/llama.node<br>Node.js binding of llama.cpp | | Emerging |
| 1366 | amazon-science/text_generation_diffusion_llm_topic<br>Topic Embedding, Text Generation and Modeling using diffusion | | Emerging |
| 1367 | hpretila/llama.net<br>.NET wrapper for LLaMA.cpp for LLaMA language model inference on CPU. 🦙 | | Emerging |
| 1368 | pranavkumaarofficial/nlcli-wizard<br>Natural language control for Python CLI tools using locally-trained SLMs... | | Emerging |
| 1369 | GunjanDhanuka/stocks-trading-bot<br>A multi-purpose repository with Sentiment Analysis of Stocks news, and... | | Emerging |
| 1370 | DAMO-NLP-SG/CLEX<br>[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models | | Emerging |
| 1371 | MURUGESAN88709/mental-health-finetuned-llama<br>🧠 Fine-tune LLaMA for mental health applications, providing insights and... | | Emerging |
| 1372 | lamalab-org/MatText<br>Text-based modeling of materials. | | Emerging |
| 1373 | zjunlp/Deco<br>[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation | | Emerging |
| 1374 | amazon-science/crossmodal-contrastive-learning<br>CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video... | | Emerging |
| 1375 | kreasof-ai/OpenFormer<br>A hackable library for running and fine-tuning modern transformer models on... | | Emerging |
| 1376 | lvyufeng/cybertron-ai<br>mindspore implementation of transformers | | Emerging |
| 1377 | belladoreai/llama-tokenizer-js<br>JS tokenizer for LLaMA 1 and 2 | | Emerging |
| 1378 | sugarme/transformer<br>NLP transformers written in Go | | Emerging |
| 1379 | rafiepour/CTran<br>Complete code for the proposed CNN-Transformer model for natural language... | | Emerging |
| 1380 | qizekun/ShapeLLM<br>[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | | Emerging |
| 1381 | zabir-nabil/awesome-multilingual-large-language-models<br>A comprehensive collection of multilingual datasets and large language... | | Emerging |
| 1382 | mdegans/drama_llama<br>Yet another `llama.cpp` Rust wrapper | | Emerging |
| 1383 | akx/ollama-dl<br>Download models from the Ollama library, without Ollama | | Emerging |
| 1384 | gopikrsmscs/stock-price-prediction-transformer<br>Tesla Stock Price Prediction Using Transformer | | Emerging |
| 1385 | Jackksonns/CoVALend<br>CoVALend: a compliance-aware micro-lending default prediction pipeline with... | | Emerging |
| 1386 | liuyukid/transformers-ner<br>Pytorch-Named-Entity-Recognition-with-transformers | | Emerging |
| 1387 | Geotrend-research/smaller-transformers<br>Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. | | Emerging |
| 1388 | rasbt/dora-from-scratch<br>LoRA and DoRA from Scratch Implementations | | Emerging |
| 1389 | kyegomez/AoA-torch<br>Implementation of Attention on Attention in Zeta | | Emerging |
| 1390 | LLMBook-zh/LLMBook-zh.github.io<br>"Large Language Models" (book). Authors: Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, Wen Jirong | | Emerging |
| 1391 | riccardomusmeci/mlx-llm<br>Large Language Models (LLMs) applications and tools running on Apple Silicon... | | Emerging |
| 1392 | THUDM/LongCite<br>LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA | | Emerging |
| 1393 | mohd-faizy/06P_Sentiment-Analysis-With-Deep-Learning-Using-BERT<br>Finetuning BERT in PyTorch for sentiment analysis. | | Emerging |
| 1394 | shm007g/LLaMA-Cult-and-More<br>Large Language Models for All, 🦙 Cult and More, Stay in touch ! | | Emerging |
| 1395 | golololologol/LLM-Distillery<br>A pipeline for LLM knowledge distillation | | Emerging |
| 1396 | neuralwork/instruct-finetune-mistral<br>Fine-tune Mistral 7B to generate fashion style suggestions | | Emerging |
| 1397 | datawhalechina/llm-deploy<br>Large model/LLM inference and deployment: theory and practice | | Emerging |
| 1398 | Aaronhuang-778/BiLLM<br>[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | | Emerging |
| 1399 | anchen1011/FireAct<br>FireAct: Toward Language Agent Fine-tuning | | Emerging |
| 1400 | EricLBuehler/xlora<br>X-LoRA: Mixture of LoRA Experts | | Emerging |