All Transformer Models

7,795 models ranked by quality score

Showing 801–900 of 7,795
# Model Score Tier
801 freshllms/freshqa

Data and code for FreshLLMs (https://arxiv.org/abs/2310.03214)

44
Emerging
802 yuchenlin/LLM-Blender

[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to...

44
Emerging
803 Victorwz/LongMem

Official implementation of our NeurIPS 2023 paper "Augmenting Language...

44
Emerging
804 XXO47OXX/layer-scan

Automated LLM layer duplication config scanner — find the optimal (i,j) for...

44
Emerging
805 SkalskiP/vlms-zero-to-hero

This series will take you on a journey from the fundamentals of NLP and...

44
Emerging
806 joyehuang/minimind-notes

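🚀 [Building an LLM from scratch] A minimalist guide to the principles and practice of training large models. Includes core code and controlled experiments for Transformer, Pretraining, and SFT. | A...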

44
Emerging
807 mdrokz/rust-llama.cpp

llama.cpp Rust bindings

44
Emerging
808 ariya/ask-llm

Interact with any LLM service

44
Emerging
809 Kaushalya/medclip

A multi-modal CLIP model trained on the medical dataset ROCO

44
Emerging
810 sgrvinod/chess-transformers

Teaching transformers to play chess

44
Emerging
811 4AI/LS-LLaMA

A Simple but Powerful SOTA NER Model | Official Code For Label Supervised...

44
Emerging
812 ictnlp/Stream-Omni

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that...

44
Emerging
813 EncrEor/rlm-claude

Recursive Language Models for Claude Code - Infinite memory solution...

44
Emerging
814 vijaydwivedi75/gnn-lspe

Source code for GNN-LSPE (Graph Neural Networks with Learnable Structural...

44
Emerging
815 flipkart-incubator/spark-transformers

Spark-Transformers: Library for exporting Apache Spark MLLIB models to use...

44
Emerging
816 Arunprakash-A/DL-Pytorch-Workshop

Develop DL models using PyTorch and Hugging Face

44
Emerging
817 monologg/KoBERT-KorQuAD

Korean MRC (KorQuAD) with KoBERT

44
Emerging
818 alohays/awesome-visual-representation-learning-with-transformers

Awesome Transformers (self-attention) in Computer Vision

44
Emerging
819 1b5d/llm-api

Run any Large Language Model behind a unified API

44
Emerging
820 mbzuai-oryx/MobiLlama

[ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for...

44
Emerging
821 chengzeyi/ParaAttention

https://wavespeed.ai/ Context parallel attention that accelerates DiT model...

44
Emerging
822 amazon-science/tanl

Structured Prediction as Translation between Augmented Natural Languages

44
Emerging
823 GyanPrakashkushwaha/DataScience

EVERYTHING YOU NEED FOR DATA SCIENCE.

44
Emerging
824 amanvirparhar/weebo

A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2,...

44
Emerging
825 donaldafeith/Pytorch_Merge

Merge LLMs that are split into parts

44
Emerging
826 absadiki/pyllamacpp

Python bindings for llama.cpp

44
Emerging
827 xinzhanguo/hellollm

Pretrain a new LLM

44
Emerging
828 snap-research/EfficientFormer

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

44
Emerging
829 buaacyw/MeshAnythingV2

[ICCV 2025] From anything to mesh like human artists. Official impl. of...

44
Emerging
830 OPTML-Group/Unlearn-Simple

[NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative...

44
Emerging
831 anseryuer/Local_LLM_Deployment_Guide_Chinese

A Chinese-language tutorial on deploying large language models locally

44
Emerging
832 invictus717/MetaTransformer

Meta-Transformer for Unified Multimodal Learning

44
Emerging
833 MIC-DKFZ/MedNeXt

[MICCAI 2023] MedNeXt is a fully ConvNeXt architecture for 3D medical image...

44
Emerging
834 RManLuo/reasoning-on-graphs

Official Implementation of ICLR 2024 paper: "Reasoning on Graphs: Faithful...

44
Emerging
835 dccuchile/beto

BETO - Spanish version of the BERT model

44
Emerging
836 JoaoLages/RATransformers

RATransformers 🐭- Make your transformer (like BERT, RoBERTa, GPT-2 and T5)...

44
Emerging
837 poloclub/llm-landscape

NeurIPS'24 - LLM Safety Landscape

44
Emerging
838 gaussalgo/adaptor

ACL 2022: Adaptor: a library to easily adapt a language model to your own...

44
Emerging
839 yesbhautik/Talk-with-PDF

An interactive AI chatbot for querying and discussing the contents of PDF...

44
Emerging
840 jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese

Cornucopia (聚宝盆):...

44
Emerging
841 virtualramblas/Domain-Specific-Small-Language-Models

Repository for the companion Colab notebook of the Domain-Specific Small...

44
Emerging
842 ckiplab/ckip-transformers

CKIP Transformers

44
Emerging
843 HUST-NingKang-Lab/MGM

MGM (Microbial General Model) as a large-scale pretrained language model...

44
Emerging
844 zhudotexe/fanoutqa

Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering...

44
Emerging
845 git-disl/Vaccine

This is the official code for the paper "Vaccine: Perturbation-aware...

44
Emerging
846 SakanaAI/text-to-lora

Hypernetworks that adapt LLMs for specific benchmark tasks using only...

44
Emerging
847 iaalm/llama-api-server

An OpenAI API-compatible REST server for llama.

44
Emerging
848 jianzhnie/LLamaTuner

Easy and efficient fine-tuning of LLMs. (Supports LLama, LLama2, LLama3, Qwen,...

44
Emerging
849 uclaml/SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

44
Emerging
850 rkansal47/MPGAN

The message passing GAN https://arxiv.org/abs/2106.11535 and generative...

44
Emerging
851 JAMESYJL/ShapeLLM-Omni

[NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding

44
Emerging
852 linjieli222/HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for...

44
Emerging
853 FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

44
Emerging
854 AI-Hypercomputer/jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream)...

44
Emerging
855 sagorbrur/bntransformer

Bengali transformer using transformers

44
Emerging
856 bytedance/effective_transformer

Running BERT without Padding

44
Emerging
857 google-research/long-range-arena

Long Range Arena for Benchmarking Efficient Transformers

44
Emerging
858 xianglin226/Benchmarking-Single-Cell-Perturbation

Single-Cell (Perturbation) Model Library

44
Emerging
859 0hq/WebGPT

Run a GPT model in the browser with WebGPU. An implementation of GPT inference...

44
Emerging
860 kamalkraj/e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

44
Emerging
861 IntelLabs/causality-lab

Causal discovery algorithms and tools for implementing new ones

44
Emerging
862 backprop-ai/backprop

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.

44
Emerging
863 pytorch/torchchat

Run PyTorch LLMs locally on servers, desktop and mobile

44
Emerging
864 salesforce/ETSformer

PyTorch code for ETSformer: Exponential Smoothing Transformers for...

44
Emerging
865 LibreTranslate/Locomotive

Toolkit for training/converting LibreTranslate compatible language models 🚂

44
Emerging
866 spcl/x1

Official Implementation of "Reasoning Language Models: A Blueprint"

44
Emerging
867 hao-ai-lab/Dynasor

[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model...

44
Emerging
868 bhavsarpratik/easy-transformers

Utility functions to work with transformers

44
Emerging
869 gluonfield/enchanted

Enchanted is an iOS and macOS app for chatting with private self-hosted...

44
Emerging
870 thuml/AutoTimes

Official implementation for "AutoTimes: Autoregressive Time Series...

44
Emerging
871 rohan-paul/LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

43
Emerging
872 salesforce/CodeTF

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

43
Emerging
873 kyegomez/SparseAttention

PyTorch implementation of the sparse attention from the paper: "Generating...

43
Emerging
874 Emmi-AI/noether

Deep-learning framework for Engineering AI. Built on transformer building...

43
Emerging
875 InternLM/CapRL

[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image...

43
Emerging
876 Atome-FE/llama-node

Believe in AI democratization. llama for nodejs backed by llama-rs,...

43
Emerging
877 snap-stanford/relgt

Relational Graph Transformer

43
Emerging
878 sinanuozdemir/oreilly-pytorch-dl

Code for Deep Learning for Modern AI

43
Emerging
879 tlkh/t2t-tuner

Convenient Text-to-Text Training for Transformers

43
Emerging
880 oValach/RailSafeNet

Repository of the paper: RailSafeNet: Visual Scene Understanding for Tram Safety

43
Emerging
881 iPieter/RobBERT

A Dutch RoBERTa-based language model

43
Emerging
882 ddzipp/AutoAudit

AutoAudit: the LLM for Cyber Security

43
Emerging
883 ContextLab/llm-stylometry

LLM-based approach for distinguishing the writings of different authors.

43
Emerging
884 elicit/machine-learning-list

A curriculum for learning about foundation models, from scratch to the frontier

43
Emerging
885 JetRunner/BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT...

43
Emerging
886 gitabtion/SoftMaskedBert-PyTorch

🙈 An unofficial implementation of SoftMaskedBert based on huggingface/transformers.

43
Emerging
887 julienkay/com.doji.transformers

A Unity package to run pretrained transformer models with Unity Sentis

43
Emerging
888 ucbrise/graphtrans

Representing Long-Range Context for Graph Neural Networks with Global Attention

43
Emerging
889 bayesgroup/code_transformers

Empirical Study of Transformers for Source Code & A Simple Approach for...

43
Emerging
890 deep-diver/llamaduo

[ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration...

43
Emerging
891 IST-DASLab/marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up...

43
Emerging
892 salcc/QuantumTransformers

Quantum Transformers for High Energy Physics Analysis at the Large Hadron Collider

43
Emerging
893 MozerWang/AMPO

[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents

43
Emerging
894 gupta-abhay/pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

43
Emerging
895 princeton-nlp/SimPO

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

43
Emerging
896 turtlesoupy/this-word-does-not-exist

This Word Does Not Exist

43
Emerging
897 sinanuozdemir/oreilly-huggingface-tour

A Crash Course in Hugging Face

43
Emerging
898 PureBee/purebee

A GPU defined in software. Runs Llama 3.2 1B at 3.6 tok/sec. Zero dependencies.

43
Emerging
899 Kartik-3004/SegFace

[AAAI 25] SegFace: Face Segmentation of Long-tail classes

43
Emerging
900 kevinMEH/keyscan

Keyscan: AI-powered API key scanner for GitHub Gists.

43
Emerging