All Transformer Models

7,795 models ranked by quality score · Page 20 of 78

Showing 1901–2000 of 7,795
# Model Score Tier
1901 sh0416/llama-classification

Text classification with Foundation Language Model LLaMA

36
Emerging
1902 declare-lab/LLM-PuzzleTest

This repository is maintained to release datasets and models for multimodal...

36
Emerging
1903 jeffreysijuntan/lloco

The official repo for "LLoCo: Learning Long Contexts Offline"

36
Emerging
1904 R-D-BioTech-Alaska/Brain

Brain is an innovative concept that combines Qelm with Nueron to harness the...

36
Emerging
1905 HenryHZY/Awesome-Multimodal-LLM

Research Trends in LLM-guided Multimodal Learning.

36
Emerging
1906 vorobeevich/ml-snippets-classification

The source code of "Machine learning code snippets semantic classification"...

36
Emerging
1907 pleisto/yuren-baichuan-7b

An open-source multimodal large language model based on baichuan-7b

36
Emerging
1908 surrey-nlp/PLOD-AbbreviationDetection

This repository contains the PLOD Dataset for Abbreviation Detection...

36
Emerging
1909 rese1f/aurora

[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a...

36
Emerging
1910 TIGER-AI-Lab/MAmmoTH

Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid...

36
Emerging
1911 zhenyi4/ssa

Official repository for "SSA: Sparse Sparse Attention by Aligning Full and...

36
Emerging
1912 ChuloAI/BrainChulo

Harnessing the Memory Power of the Camelids

36
Emerging
1913 SeekingDream/DyCodeEval

Official repository of the ICML2025 paper “Dynamic Benchmarking of Reasoning...

36
Emerging
1914 diogok/llama.cpp.zig

A build.zig for llama.cpp

36
Emerging
1915 RhinoDevel/mt_llm

Pure C wrapper library to use llama.cpp with Linux and Windows as simple as...

36
Emerging
1916 jaketae/param-share-transformer

PyTorch implementation of Lessons on Parameter Sharing across Layers in Transformers

36
Emerging
1917 jordandeklerk/SwinViT

Modified Swin Transformer model in PyTorch on CIFAR-10 for image classification

36
Emerging
1918 Am1n3e/active-learning-transformer

A hands-on tutorial on how to use Active Learning with Transformer models.

36
Emerging
1919 frankaging/ReCOGS

ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of...

36
Emerging
1920 GiannakopoulosIlias/vision-transformer-network-for-mr-electrical-properties-tomography

A 3D Vision Transformer-based neural network for reconstructing electrical...

36
Emerging
1921 SkyworkAI/MoE-plus-plus

[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with...

36
Emerging
1922 yinboc/trans-inr

Transformers as Meta-Learners for Implicit Neural Representations, in ECCV 2022

36
Emerging
1923 mrdbourke/mac-ml-speed-test

A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.

36
Emerging
1924 avocardio/Zicklein

Finetuning instruct-LLaMA on German datasets.

36
Emerging
1925 alphasecio/llama-guard

A web app for exploring content moderation with Llama Guard on Groq.

36
Emerging
1926 m0dulo/InferSpore

🌱 A fully independent Large Language Model (LLM) inference engine, built...

36
Emerging
1927 lechmazur/writing

This benchmark tests how well LLMs incorporate a set of 10 mandatory story...

36
Emerging
1928 general-preference/general-preference-model

[ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for...

36
Emerging
1929 dreamingjudith/KoGPT2-personachat

Fine-tuned KoGPT2 chatbot demo with translated PersonaChat (ongoing)

36
Emerging
1930 Relaxed-System-Lab/Flash-Sparse-Attention

🚀🚀 Efficient implementations of Native Sparse Attention

36
Emerging
1931 bnosac/golgotha

Contextualised Embeddings and Language Modelling using BERT and Friends using R

36
Emerging
1932 dev-sufyaan/Nexlify

Unified API platform for free access to enterprise-grade AI models from...

36
Emerging
1933 modal-labs/stopwatch

A tool for benchmarking LLMs on Modal

36
Emerging
1934 xuyang-liu16/GlobalCom2

[AAAI 2026] Global Compression Commander: Plug-and-Play Inference...

36
Emerging
1935 AIFrameResearch/SPO

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL...

36
Emerging
1936 yifanzhang-pro/HLA

Official Project Page for HLA: Higher-order Linear Attention...

36
Emerging
1937 FengheTan9/LLM4Seg

[MICCAI 2025] Official code for "Pre-Trained LLM is a Semantic-Aware and...

36
Emerging
1938 jqtangust/Robust-R1

🔥🔥🔥[AAAI 2026 Oral] Official Implementation of Robust-R1: Degradation-Aware...

36
Emerging
1939 FSoft-AI4Code/CodeCapybara

Open-source Self-Instruction Tuning Code LLM

36
Emerging
1940 moeru-ai/demodel

🚀🛸 Easily boost the speed of pulling your models and datasets from various...

36
Emerging
1941 NVlabs/RocketKV

[ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage...

36
Emerging
1942 dobriban/Principles-of-AI-LLMs

Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring...

36
Emerging
1943 BatsResearch/trove

A Flexible Toolkit for Dense Retrieval

36
Emerging
1944 twiecki/transpailer

LLM-based, self-correcting transpiler, supports Jax, PyTorch, Rust, PyMC, Stan

36
Emerging
1945 cankocagil/SwinDetr

Integration of Swin Transformer to DETR for Robust Object Detection (DEMO)

36
Emerging
1946 retarfi/language-pretraining

Pre-training Language Models for Japanese

36
Emerging
1947 abcsys/libem

Compound AI toolchain for fast and accurate entity matching, powered by LLMs.

36
Emerging
1948 devdhananjay14/multim

🔍 Experiment with neural networks for binary classification on multimodal...

36
Emerging
1949 harryjdavies/HeartGPT

Interpretable Pre-Trained Transformers for Heart Time-Series Data

36
Emerging
1950 kaist-cvml/I-HallA-v1.0

[AAAI 2025] Official Implementation of I-HallA v1.0

36
Emerging
1951 shahrukhx01/siamese-nn-semantic-text-similarity

A repository containing comprehensive Neural Networks based PyTorch...

36
Emerging
1952 kaistAI/Janus

[NeurIPS 2024] Train LLMs with diverse system messages reflecting...

36
Emerging
1953 shahriargolchin/time-travel-in-llms

The official repository for the paper entitled "Time Travel in LLMs: Tracing...

36
Emerging
1954 chaitjo/gated-graph-transformers

Transformers are Graph Neural Networks!

36
Emerging
1955 tlc4418/llm_optimization

A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.

36
Emerging
1956 Samyak-777/nomodel

The world's most accurate LLM. It achieves 0% hallucination rate by...

36
Emerging
1957 gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model

The first pure SNN language model trained from scratch with a fully original...

36
Emerging
1958 msakarvadia/memorization

Localizing Memorized Sequences in Language Models

36
Emerging
1959 kyegomez/PALI

Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"

36
Emerging
1960 abhisheknair10/llama3.cu

Lightweight Llama 3 8B Inference Engine in CUDA C

36
Emerging
1961 HKUDS/SepLLM

[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One...

36
Emerging
1962 Mmorgan-ML/Neuromodulatory-Control-Networks

Neuromodulatory Control Networks (NCNs), a novel LLM architectural...

36
Emerging
1963 Anshita1Saxena/transformer_time_series_forecasting

Transformers applied on Time Series Forecasting

35
Emerging
1964 Jathurshan0330/Cross-Modal-Transformer

Official repository of cross-modal transformer for interpretable automatic...

35
Emerging
1965 yashbonde/rasp

Implementing RASP transformer programming language...

35
Emerging
1966 llcuda/llcuda

CUDA 12-first backend inference for Unsloth on Kaggle — Optimized for small...

35
Emerging
1967 vipulraheja/iterater

Official implementation of the paper "IteraTeR: Understanding Iterative...

35
Emerging
1968 nikolaydubina/llama2.go

LLaMA-2 in native Go

35
Emerging
1969 lfunderburk/automate-tech-post

LLM application: fine tuned model to generate social media posts from...

35
Emerging
1970 andrewliao11/LongPerceptualThoughts

[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling...

35
Emerging
1971 vbdi/divprune

[CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large...

35
Emerging
1972 oshindutta/TVAprune

[ICML 2024 Es-FoMo] - Efficient LLM Pruning with Global Token-Dependency...

35
Emerging
1973 arshadshk/SAINT-pytorch

SAINT PyTorch implementation

35
Emerging
1974 uncbiag/Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

35
Emerging
1975 Cryolite/kanachan

A Japanese (Riichi) Mahjong AI Framework

35
Emerging
1976 palonso/MAEST

Pre-training, fine-tuning, and inference code with the MAEST models for...

35
Emerging
1977 automorphic-ai/trex

Enforce structured output from LLMs 100% of the time

35
Emerging
1978 vietanhdev/llama-assistant-train

Training Scripts for Llama Assistant: Your Local AI Assistant That Respects...

35
Emerging
1979 TayeeChang/keras_transformers

An implementation of the transformer family, such as BERT, ALBERT, RoBERTa, NEZHA, etc.

35
Emerging
1980 alibaba/easydist

Automated Parallelization System and Infrastructure for Multiple Ecosystems

35
Emerging
1981 wxjiao/ParroT

The ParroT framework to enhance and regulate the Translation Abilities...

35
Emerging
1982 Ankur3107/nlp_notebooks

TensorFlow, PyTorch, Hugging Face Transformers, fastai, etc. tutorial Colab notebooks.

35
Emerging
1983 kyegomez/MambaDecoderBlock

MambaDecoderBlock is a novel decoder architecture that replaces traditional...

35
Emerging
1984 Airmomo/transformers-docs-zh

[Continuously updated] Fully Chinese-language Transformers study notes and demo examples, with Jupyter Notebook support; content mainly from 🤗 Hugging...

35
Emerging
1985 EvanZhouDev/llm.pdf

Run LLMs inside a PDF file.

35
Emerging
1986 TideDra/VL-RLHF

An RLHF Infrastructure for Vision-Language Models

35
Emerging
1987 OnlyTerp/turboquant

First open-source implementation of Google TurboQuant (ICLR 2026) --...

35
Emerging
1988 wangcongcong123/ttt

A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+

35
Emerging
1989 astrobleem/Simple-StableLM-Chat

This is a very simple python app that you can use to get up and chatting...

35
Emerging
1990 uakarsh/latr

Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel...

35
Emerging
1991 tranquoctrinh/transformer

This is a PyTorch implementation of the Transformer model in the paper...

35
Emerging
1992 hoof-ai/hoof

"Just hoof it!" - A spotlight like interface to Ollama

35
Emerging
1993 ntt-dkiku/route-explainer

The official implementation of "RouteExplainer: An Explanation Framework for...

35
Emerging
1994 Mya-Mya/CBF-LLM

"CBF-LLM: Safe Control for LLM Alignment"

35
Emerging
1995 BaohaoLiao/RSD

[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and...

35
Emerging
1996 sail-sg/Attention-Sink

[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical...

35
Emerging
1997 BUAADreamer/SPN4CIR

[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning...

35
Emerging
1998 OatmealLiu/FineR

[ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models

35
Emerging
1999 frankluise5220/ComfyUI-Lorahelper

A professional automation toolkit for ComfyUI to prepare LoRA training data...

35
Emerging
2000 CogitoNTNU/course-on-large-language-models

This is a course on how to program with Large Language Models.

35
Emerging