All Transformer Models

7,795 models ranked by quality score · Page 22 of 78

Showing 2101–2200 of 7,795
# Model Score Tier
2101 dmis-lab/Monet

[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers

35
Emerging
2102 AdrianBZG/LLM-distributed-finetune

Efficiently tune any LLM from HuggingFace using distributed training...

35
Emerging
2103 lucasjinreal/Namo-R1

A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from...

35
Emerging
2104 wehos/awesome-graph-transformer

Papers about graph transformers.

35
Emerging
2105 logic-OT/BobVLM

BobVLM – A 1.5B multimodal model built from scratch and pre-trained on a...

35
Emerging
2106 ExplainableML/Vision_by_Language

[ICLR 2024] Official repository for "Vision-by-Language for Training-Free...

35
Emerging
2107 daviden1013/llm-ie

A comprehensive toolkit that provides building blocks for LLM-based named...

35
Emerging
2108 ExplainableML/WaffleCLIP

Official repository for the ICCV 2023 paper: "Waffling around for...

35
Emerging
2109 YeonwooSung/vision-search

Image search engine

34
Emerging
2110 TencentARC/ST-LLM

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language...

34
Emerging
2111 eliahuhorwitz/Spectral-DeTuning

Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning...

34
Emerging
2112 davide-coccomini/MINTIME-Multi-Identity-size-iNvariant-TIMEsformer-for-Video-Deepfake-Detection

Code for Video Deepfake Detector from "MINTIME: Multi-Identity...

34
Emerging
2113 DestroyerDarkNess/fastvlm-webgpu

Real-time video captioning powered by FastVLM

34
Emerging
2114 EvilFreelancer/rugpt3-custom

Pre-training custom ruGPT3 model on books written by F.M. Dostoevski

34
Emerging
2115 cifkao/context-probing

Black-box language model explanation by context length probing

34
Emerging
2116 DCQN-axiomatics/DCQN-Matrix-Axiomatik-LLM-Protocol

A strict, deterministic LLM protocol for loading, reading and activating the...

34
Emerging
2117 monk1337/auto-ollama

run ollama & gguf easily with a single command

34
Emerging
2118 bloomberg/MixCE-acl2023

Implementation of MixCE method described in ACL 2023 paper by Zhang et al.

34
Emerging
2119 X-iZhang/CCD

📷 CCD: Mitigating Hallucinations in Radiology MLLMs via Clinical Contrastive...

34
Emerging
2120 henrikalbihn/gliner-as-a-service

GLiNER model in a FastAPI microservice.

34
Emerging
2121 pymc-labs/transpailer

LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan.

34
Emerging
2122 nareshis21/Truelarge-RT

Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices....

34
Emerging
2123 MNoorFawi/curlora

The code repository for the CURLoRA research paper. Stable LLM continual...

34
Emerging
2124 PathologyFoundation/plip

Pathology Language and Image Pre-Training (PLIP) is the first vision and...

34
Emerging
2125 Nikityyy/lille

A powerful 130-million-parameter model trained from scratch as part of a...

34
Emerging
2126 rezazad68/transdeeplab

TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical...

34
Emerging
2127 aws-samples/sample-for-multi-modal-document-to-json-with-sagemaker-ai

This open-source project delivers a complete pipeline for converting...

34
Emerging
2128 will-thompson-k/tldr-transformers

The "tl;dr" on a few notable transformer papers (pre-2022).

34
Emerging
2129 Saivineeth147/llm-testlab

Comprehensive Testing Tool for Large Language Models

34
Emerging
2130 arcee-ai/PruneMe

Automated Identification of Redundant Layer Blocks for Pruning in Large...

34
Emerging
2131 SakanaAI/evo-memory

Code to train and evaluate Neural Attention Memory Models to obtain...

34
Emerging
2132 HaoAreYuDong/MachineLearningLM

Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML

34
Emerging
2133 ManashJKonwar/NLP-Transformers

Transformer (BERT, GPT2, etc.) based Training Module for popular NLP tasks

34
Emerging
2134 GithubX-F/DynaMO-RL

Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization...

34
Emerging
2135 IDSIA/modern-srwm

Official repository for the paper "A Modern Self-Referential Weight Matrix...

34
Emerging
2136 tsinghua-fib-lab/ANeurIPS2024_SPV-MIA

[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language...

34
Emerging
2137 SolomonB14D3/knowledge-fidelity

Behavioral auditing & repair toolkit for LLMs. Measures 8 dimensions via...

34
Emerging
2138 amajee11us/TabGLM

[AAAI'25] Tabular Graph-Text Representation Learning with Consistency Minimization

34
Emerging
2139 c00k1ez/plain-transformers

Transformer models implementation for training from scratch.

34
Emerging
2140 abacaj/transformers-docker

Run, build, test transformer models using docker

34
Emerging
2141 pymc-labs/transalchemy

LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan.

34
Emerging
2142 linonetwo/langchain-alpaca

Run Alpaca LLM in LangChain

34
Emerging
2143 gitctrlx/llama.cu

Llama from scratch in CUDA with Flash Attention.

34
Emerging
2144 CLAIRE-Labo/quantile-reward-policy-optimization

Official codebase for "Quantile Reward Policy Optimization: Alignment with...

34
Emerging
2145 CristianCristanchoT/chivito

Implementation of a Llama-based LLM fine-tuned in Spanish using...

34
Emerging
2146 BubbleJoe-BrownU/TransformerHub

This is a repository of transformer-like models, including Transformer, GPT,...

34
Emerging
2147 kuvaus/llama-chat

Simple chat program for LLaMa models

34
Emerging
2148 AkiRusProd/numpy-transformer

A numpy implementation of the Transformer model in "Attention is All You Need"

34
Emerging
2149 Hon-Wong/VoRA

[Fully open] [Encoder-free MLLM] Vision as LoRA

34
Emerging
2150 cxcscmu/Montessori-Instruct

Official repository for Montessori-Instruct: Generate Influential Training...

34
Emerging
2151 vivy-yi/awesome-llm-training-inference

Curated list of LLM training and inference frameworks, tools, and resources....

34
Emerging
2152 an-yongqi/systematic-outliers

[ICLR 2025] Systematic Outliers in Large Language Models.

34
Emerging
2153 shahrukhx01/bert-probe

BERT Probe: A python package for probing attention based robustness to...

34
Emerging
2154 jorgemunozl/Finetunning-Llama-Vision-11b

Inference and fine-tuning of a VLM (Llama Vision 11B) using the Unsloth,...

34
Emerging
2155 rasbt/gradient-accumulation-blog

Finetuning BLOOM on a single GPU using gradient-accumulation

34
Emerging
2156 theodo-group/GenossGPT

One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT...

34
Emerging
2157 RahulSChand/gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports...

34
Emerging
2158 antoninodimaggio/Hugging-Captions

Generate realistic Instagram captions using transformers 🤗

34
Emerging
2159 czg1225/CoDe

[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive...

34
Emerging
2160 itsqyh/Awesome-LMMs-Mechanistic-Interpretability

A curated collection of resources focused on the Mechanistic...

34
Emerging
2161 jmnolte/HCCNet

Early prediction of liver cancer using longitudinal MRI

34
Emerging
2162 fermyon/ai-examples

A collection of serverless apps that show how Fermyon's Serverless AI...

34
Emerging
2163 Ramseths/app-llama2

Generative AI - LLaMA 2 7B & LangChain, to generate stories based on a genre.

34
Emerging
2164 forgi86/sysid-transformers

Code to reproduce the results of the paper In-context learning for...

34
Emerging
2165 chelsea0x3b/llama-dfdx

LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed!

34
Emerging
2166 ausboss/Local-LLM-Langchain

Load local LLMs effortlessly in a Jupyter notebook for testing purposes...

34
Emerging
2167 leliuga/cohere-configurations

Co:Here Inference configurations

34
Emerging
2168 LuluW8071/Text-Sentiment-Analysis

Text Sentiment Analysis with RNNs Models + Additive Attention and Transformers

34
Emerging
2169 taishan1994/LLM-Quantization

Notes and summaries on LLM quantization.

34
Emerging
2170 Zishan-Shao/FlashSVD

Welcome to the FlashSVD, an activation aware inference system for SVD-based...

34
Emerging
2171 iKernels/transformers-lightning

A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses...

34
Emerging
2172 datatrigger/nlp_hugging_face

Text classification with the transformers library from Hugging Face, by...

34
Emerging
2173 xmindflow/MMCFormer

[MIDL 2023] MMCFormer: Missing Modality Compensation Transformer for Brain...

34
Emerging
2174 dougeeai/llama-cpp-python-wheels

Pre-built wheels for llama-cpp-python across platforms and CUDA versions

34
Emerging
2175 damianoduranti/LLMknowextra

LLM-Driven Knowledge Extraction: Results in Temporal and Description Logics...

34
Emerging
2176 jakobtroidl/neuron-shape-reasoning

PyTorch Implementation of Global Neuron Shape Reasoning with Point Affinity...

34
Emerging
2177 RAHB-REALTORS-Association/email-autodrafts

Email Auto-ReplAI is a Python tool that uses AI to automate drafting...

34
Emerging
2178 ZJLAB-AMMI/LLM4Teach

Python code to implement LLM4Teach, a policy distillation approach for...

34
Emerging
2179 NiuTrans/LaMaTE

Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine...

34
Emerging
2180 Pengxin-Guo/FedSA-LoRA

Selective Aggregation for Low-Rank Adaptation in Federated Learning [ICLR 2025]

34
Emerging
2181 Tebmer/Awesome-Knowledge-Distillation-of-LLMs

This repository collects papers for "A Survey on Knowledge Distillation of...

34
Emerging
2182 ModelTC/QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate...

34
Emerging
2183 dhruvdcoder/xlm-core

XLM is a modular, research-friendly framework for developing and comparing...

34
Emerging
2184 5aharsh/collama

Run Ollama LLM models in Google Colab for free

34
Emerging
2185 InternLM/OREAL

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

34
Emerging
2186 tojiboyevf/image_captioning

Deep Learning Final project 2022

34
Emerging
2187 dsindex/iclassifier

reference pytorch code for intent classification

34
Emerging
2188 liuqidong07/LEADER-pytorch

[arXiv'24] The official implementation code of LEADER.

34
Emerging
2189 forgi86/sysid-transformers-transfer

Code of the paper "On the adaptation of in-context learners for system...

34
Emerging
2190 AntonioGr7/pratical-llms

A collection of hands-on notebooks for LLM practitioners

34
Emerging
2191 ksm26/Open-Source-Models-with-Hugging-Face

"Open Source Models with Hugging Face" course empowers you with the skills...

34
Emerging
2192 EasierMTL/chinese-translation-app

Chinese to English Translation Full Stack Web App + Automated Load Testing...

34
Emerging
2193 ictnlp/TruthX

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large...

34
Emerging
2194 guyoung/AIMatrices

AIMatrices is a lightweight, high-performance, scalable, and open source AI...

34
Emerging
2195 GURPREETKAURJETHRA/Ollama-UseCases

This repo collects numerous use cases for the open-source Ollama

34
Emerging
2196 ASSERT-KTH/repairllama

RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program...

34
Emerging
2197 haoliuhl/instructrl

Instruction Following Agents with Multimodal Transformers

34
Emerging
2198 WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

34
Emerging
2199 Infini-AI-Lab/Sequoia

scalable and robust tree-based speculative decoding algorithm

34
Emerging
2200 mechramc/Orion

Local AI runtime for training & running small LLMs directly on Apple Neural...

34
Emerging