All Transformer Models

7,795 models ranked by quality score · Page 14 of 78

Showing 1301–1400 of 7,795
# Model Score Tier
1301 laugustyniak/LLM-demo-2-production

The set of tools and examples for creating LLM-based solutions from demo to...

40
Emerging
1302 alexrozanski/LlamaChat

Chat with your favourite LLaMA models in a native macOS app

40
Emerging
1303 ariannamethod/ariannamethod.ai

Arianna Method Programming Language

40
Emerging
1304 souzatharsis/tamingLLMs

Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software

40
Emerging
1305 huggingface/datablations

Scaling Data-Constrained Language Models

40
Emerging
1306 QwenLM/Qwen2.5-Math

A series of math-specific large language models of our Qwen2 series.

40
Emerging
1307 ariannamethod/arianna.c

Arianna is a Digital Persona. Embodied cognition as is.

40
Emerging
1308 snktshrma/ngps_flight

Global vision positioning system for UAVs in outdoor GNSS-denied environments

40
Emerging
1309 cooelf/AwesomeMRC

IJCAI 2021 Tutorial & code for Retrospective Reader for Machine Reading...

40
Emerging
1310 HiThink-Research/MME-Finance

[MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning

40
Emerging
1311 urmzd/md-classifier

A deep learning system combining transformers and CNNs to classify diseases...

40
Emerging
1312 hanouticelina/deformable-DETR

Implementation of the paper: Deformable DETR: Deformable Transformers for...

40
Emerging
1313 abhilash1910/LongPegasus

LongPegasus package is used for inducing longformer self attention over base...

40
Emerging
1314 loong64/llama.cpp

LLM inference in C/C++

40
Emerging
1315 jhcho99/GSRTR

[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition...

40
Emerging
1316 FoundationVision/UniTok

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

40
Emerging
1317 zeyadusf/LLMs-from-Scratch

Build a Large Language Model (From Scratch) book and Finetuned Models

40
Emerging
1318 rishikksh20/CrossViT-pytorch

Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer...

40
Emerging
1319 OpenBMB/InfiniteBench

Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K...

40
Emerging
1320 innightwolfsleep/old_llm_telegram_bot

Connect llama-cpp, transformers or text-generation-webui to telegram bot api.

40
Emerging
1321 TIGER-AI-Lab/Pixel-Reasoner

Pixel-Level Reasoning Model trained with RL [NeurIPS25]

40
Emerging
1322 cgbur/llama2.zig

Inference Llama 2 in one file of pure Zig

40
Emerging
1323 amithkoujalgi/ollama-pdf-bot

A bot that accepts PDF docs and lets you ask questions about them.

40
Emerging
1324 Fsoft-AIC/Grasp-Anything

Dataset and Code for ICRA 2024 paper "Grasp-Anything: Large-scale Grasp...

40
Emerging
1325 r1cc4rd0m4zz4/traNsLatorLaB

translatorlab: a machine translation tool that uses artificial intelligence...

40
Emerging
1326 OpenLemur/Lemur

[ICLR 2024] Lemur: Open Foundation Models for Language Agents

40
Emerging
1327 thevasudevgupta/bigbird

Google's BigBird (Jax/Flax & PyTorch) @ 🤗Transformers

40
Emerging
1328 boheumd/MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term...

40
Emerging
1329 young-geng/m3ae_public

Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation

39
Emerging
1330 YJiangcm/Lion

[EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models

39
Emerging
1331 wellcometrust/WellcomeML

Retired repository for Machine Learning utils at the Wellcome Trust (now deprecated).

39
Emerging
1332 slp-rl/slamkit

SlamKit is an open source tool kit for efficient training of SpeechLMs. It...

39
Emerging
1333 pabroux/llm-engineers-handbook

LLM Engineer's Handbook by Paul Iusztin and Maxime Labonne with pixi.

39
Emerging
1334 kyegomez/Fusion3D

An extremely experimental model that intakes images and generates 3D scenes...

39
Emerging
1335 midway2333/Tower2

Multimodal language model architecture

39
Emerging
1336 sanjaradylov/smiles-gpt

Generative Pre-Training from Molecules

39
Emerging
1337 fran-martinez/bio_ner_bert

BERT finetuned on NER downstream tasks

39
Emerging
1338 ChenRocks/UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt...

39
Emerging
1339 IDSIA/automated-cl

Official repository for the paper "Automating Continual Learning"

39
Emerging
1340 kehanlu/DeSTA2

Code and model for ICASSP 2025 Paper "Developing Instruction-Following...

39
Emerging
1341 AlekseyKorshuk/huggingartists

Lyrics generation with GPT2-based Transformer

39
Emerging
1342 its-kumar-yash/deep-study-ai-agent

DeepStudy AI automates research, refines queries dynamically, and generates...

39
Emerging
1343 Shannon-Labs/shannon-control-unit

Shannon Control Unit: Adaptive regularization via control theory for LLM training

39
Emerging
1344 NimbleEdge/sparse_transformers

Sparse Inferencing for transformer based LLMs

39
Emerging
1345 cyk1337/Transformer-in-PyTorch

Transformer/Transformer-XL/R-Transformer examples and explanations

39
Emerging
1346 clovaai/length-adaptive-transformer

Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)

39
Emerging
1347 kyegomez/SSM-As-VLM-Bridge

An exploration into leveraging SSMs as Bridge/Adapter Layers for VLM

39
Emerging
1348 ylsung/VL_adapter

PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for...

39
Emerging
1349 DeepChainBio/deepchain-apps

A library for deploying apps on deepchain.bio

39
Emerging
1350 TIGER-AI-Lab/Vamba

Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid...

39
Emerging
1351 praeclarum/transformers-js

Browser-compatible JS library for running language models

39
Emerging
1352 misonsky/HiFT

Memory-efficient fine-tuning; supports fine-tuning 7B models with 24 GB of GPU memory

39
Emerging
1353 xenova/sponsorblock-ml

Automatically detect in-video YouTube sponsorships, self/unpaid promotions,...

39
Emerging
1354 soumyadip1995/BabyGPT

Something in the middle of Karpathy's mingpt model and video lectures, ...

39
Emerging
1355 shrut2702/upasak

UI-based Fine-Tuning for Large Language Models (LLMs)

39
Emerging
1356 Sea-Snell/JAXSeq

Train very large language models in Jax.

39
Emerging
1357 WayneJin0918/SRUM

Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified...

39
Emerging
1358 James-QiuHaoran/LLM-serving-with-proxy-models

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length...

39
Emerging
1359 zinengtang/TVLT

PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)

39
Emerging
1360 naokishibuya/simple_transformer

A Transformer Implementation that is easy to understand and customizable.

39
Emerging
1361 lqzxt/Time-R1

Time-R1 is a two-stage reinforcement fine-tuning framework that trains large...

39
Emerging
1362 IDSIA/lmtool-fwp

PyTorch Language Modeling Toolkit for Fast Weight Programmers

39
Emerging
1363 NVlabs/Long-RL

Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

39
Emerging
1364 UCSC-VLAA/m1

[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical...

39
Emerging
1365 mybigday/llama.node

Node.js binding of llama.cpp

39
Emerging
1366 amazon-science/text_generation_diffusion_llm_topic

Topic Embedding, Text Generation and Modeling using diffusion

39
Emerging
1367 hpretila/llama.net

.NET wrapper for LLaMA.cpp for LLaMA language model inference on CPU. 🦙

39
Emerging
1368 pranavkumaarofficial/nlcli-wizard

Natural language control for Python CLI tools using locally-trained SLMs...

39
Emerging
1369 GunjanDhanuka/stocks-trading-bot

A multi-purpose repository with Sentiment Analysis of Stocks news, and...

39
Emerging
1370 DAMO-NLP-SG/CLEX

[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models

39
Emerging
1371 MURUGESAN88709/mental-health-finetuned-llama

🧠 Fine-tune LLaMA for mental health applications, providing insights and...

39
Emerging
1372 lamalab-org/MatText

Text-based modeling of materials.

39
Emerging
1373 zjunlp/Deco

[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

39
Emerging
1374 amazon-science/crossmodal-contrastive-learning

CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video...

39
Emerging
1375 kreasof-ai/OpenFormer

A hackable library for running and fine-tuning modern transformer models on...

39
Emerging
1376 lvyufeng/cybertron-ai

MindSpore implementation of transformers

39
Emerging
1377 belladoreai/llama-tokenizer-js

JS tokenizer for LLaMA 1 and 2

39
Emerging
1378 sugarme/transformer

NLP transformers written in Go

39
Emerging
1379 rafiepour/CTran

Complete code for the proposed CNN-Transformer model for natural language...

39
Emerging
1380 qizekun/ShapeLLM

[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

39
Emerging
1381 zabir-nabil/awesome-multilingual-large-language-models

A comprehensive collection of multilingual datasets and large language...

39
Emerging
1382 mdegans/drama_llama

Yet another `llama.cpp` Rust wrapper

39
Emerging
1383 akx/ollama-dl

Download models from the Ollama library, without Ollama

39
Emerging
1384 gopikrsmscs/stock-price-prediction-transformer

Tesla Stock Price Prediction Using Transformer

39
Emerging
1385 Jackksonns/CoVALend

CoVALend: a compliance-aware micro-lending default prediction pipeline with...

39
Emerging
1386 liuyukid/transformers-ner

Pytorch-Named-Entity-Recognition-with-transformers

39
Emerging
1387 Geotrend-research/smaller-transformers

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

39
Emerging
1388 rasbt/dora-from-scratch

LoRA and DoRA from Scratch Implementations

39
Emerging
1389 kyegomez/AoA-torch

Implementation of Attention on Attention in Zeta

39
Emerging
1390 LLMBook-zh/LLMBook-zh.github.io

The book "Large Language Models" by Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, and Wen Jirong

39
Emerging
1391 riccardomusmeci/mlx-llm

Large Language Models (LLMs) applications and tools running on Apple Silicon...

39
Emerging
1392 THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

39
Emerging
1393 mohd-faizy/06P_Sentiment-Analysis-With-Deep-Learning-Using-BERT

Finetuning BERT in PyTorch for sentiment analysis.

39
Emerging
1394 shm007g/LLaMA-Cult-and-More

Large Language Models for All, 🦙 Cult and More, Stay in touch!

39
Emerging
1395 golololologol/LLM-Distillery

A pipeline for LLM knowledge distillation

39
Emerging
1396 neuralwork/instruct-finetune-mistral

Fine-tune Mistral 7B to generate fashion style suggestions

39
Emerging
1397 datawhalechina/llm-deploy

Large model/LLM inference and deployment: theory and practice

39
Emerging
1398 Aaronhuang-778/BiLLM

[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

39
Emerging
1399 anchen1011/FireAct

FireAct: Toward Language Agent Fine-tuning

39
Emerging
1400 EricLBuehler/xlora

X-LoRA: Mixture of LoRA Experts

39
Emerging