All Transformer Models
7,795 models ranked by quality score · Page 17 of 78
| # | Model | Score | Tier |
|---|---|---|---|
| 1601 | WisconsinAIVision/ViP-LLaVA<br>[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary... | | Emerging |
| 1602 | EagleW/Scientific-Inspiration-Machines-Optimized-for-Novelty<br>Official implementation of the ACL 2024: Scientific Inspiration Machines... | | Emerging |
| 1603 | HOLYKEYZ/model-unfetter<br>The production engine for directional ablation. Unalign / remove models... | | Emerging |
| 1604 | NisaarAgharia/Indian-LawyerGPT<br>Fine-Tuning Falcon-7B, LLAMA 2 with QLoRA to create an advanced AI model... | | Emerging |
| 1605 | Yxxxb/VoCo-LLaMA<br>[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of... | | Emerging |
| 1606 | rasbt/pytorch-memory-optim<br>This code repository contains the code used for my "Optimizing Memory Usage... | | Emerging |
| 1607 | RishabSA/interp-refusal-tokens<br>We study whether categorical refusal tokens enable controllable and... | | Emerging |
| 1608 | Jagatmohan46/tiny-recursive-model<br>🚀 Implement the Tiny Recursive Model (TRM) for improved performance in... | | Emerging |
| 1609 | ParCIS/Chimera<br>Chimera: bidirectional pipeline parallelism for efficiently training... | | Emerging |
| 1610 | hscspring/llama.np<br>Inference Llama/Llama2/Llama3 Models in NumPy | | Emerging |
| 1611 | BarCodeReader/SelfReformer<br>[TMM-2023] Official implementation of "Towards Complete and Detail-Preserved... | | Emerging |
| 1612 | The-Swarm-Corporation/Hyena-Y<br>A PyTorch implementation of the Hyena-Y model, a convolution-based... | | Emerging |
| 1613 | gnai-creator/aletheion-llm-v2<br>Decoder-only LLM with integrated epistemic tomography. Knows what it doesn't know. | | Emerging |
| 1614 | readytensor/rt-llm-eng-cert-week3<br>Week 3 of LLM Engineering Certification: Learn to fine-tune large language... | | Emerging |
| 1615 | ictnlp/BayLing<br>"BayLing" (百聆) is a LLaMA-based English/Chinese large language model with enhanced language alignment. It has strong English/Chinese capabilities and, across several multilingual and general-task evaluations, achieves ChatGPT... | | Emerging |
| 1616 | ChanMeng666/interactive-story-generator<br>【Join our constellation of stargazers!⭐️】An interactive AI-powered story... | | Emerging |
| 1617 | dropbox/grallama-panel<br>GraLLAMA panel for LLAMA data | | Emerging |
| 1618 | matlab-deep-learning/transformer-networks-for-time-series-prediction<br>Deep Learning in Quantitative Finance: Transformer Networks for Time Series... | | Emerging |
| 1619 | sshh12/llm_optimize<br>LLM Optimize is a proof-of-concept library for doing LLM (large language... | | Emerging |
| 1620 | thruthseeker/LionLock_FDE_OSS<br>Open source fatigue detection engine for large language models with trust overlay | | Emerging |
| 1621 | VITA-Group/LiGO<br>[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer... | | Emerging |
| 1622 | mtuann/llm-updated-papers<br>Papers related to Large Language Models in all top venues | | Emerging |
| 1623 | ariannamethod/doe<br>DoE Janus Architecture: Democracy of Experts | | Emerging |
| 1624 | flozi00/atra<br>An open source NLP as a service project focused on providing state of the... | | Emerging |
| 1625 | ximinng/LLM4SVG<br>[CVPR 2025] Official implementation for "Empowering LLMs to Understand and... | | Emerging |
| 1626 | GT-RIPL/robo-vln<br>Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics... | | Emerging |
| 1627 | awinml/llama-cpp-python-bindings<br>Run fast LLM Inference using Llama.cpp in Python | | Emerging |
| 1628 | litus-ai/classy<br>classy is a simple-to-use library for building high-performance Machine... | | Emerging |
| 1629 | K024/llm-sharp<br>Language models in C# | | Emerging |
| 1630 | coderonion/awesome-llm-and-aigc<br>🚀🚀🚀A collection of some awesome public projects about Large Language... | | Emerging |
| 1631 | voidism/DoLa<br>Official implementation for the paper "DoLa: Decoding by Contrasting Layers... | | Emerging |
| 1632 | declare-lab/exemplary-empathy<br>This repository contains the source codes of the paper -- Exemplars-guided... | | Emerging |
| 1633 | ImKeTT/AdaVAE<br>[Preprint] AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling... | | Emerging |
| 1634 | TIGER-AI-Lab/VL-Rethinker<br>The official code of "VL-Rethinker: Incentivizing Self-Reflection of... | | Emerging |
| 1635 | HKUNLP/icl-ceil<br>[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. | | Emerging |
| 1636 | mohyunho/NAS_transformer<br>Evolutionary Neural Architecture Search on Transformers for RUL Prediction | | Emerging |
| 1637 | DAMO-NLP-SG/LLM-Multilingual-Knowledge-Boundaries<br>[ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across... | | Emerging |
| 1638 | casinca/LLM-quest<br>Verbose implementations of LLMs architectures, techniques and research... | | Emerging |
| 1639 | CognitiveAISystems/RATE<br>[ICLR 2026] Official implementation of Recurrent Action Transformer with... | | Emerging |
| 1640 | kyegomez/MGQA<br>The open source implementation of the multi grouped query attention by the... | | Emerging |
| 1641 | joelbarmettlerUZH/ConceptFormer<br>Towards Finding the Essence of Everything in Large Language Models | | Emerging |
| 1642 | iil-postech/semantic-attention<br>Official implementation of "Attention-aware semantic communications for... | | Emerging |
| 1643 | pagraf/Seabed-Net<br>Quick start guide for Seabed-Net | | Emerging |
| 1644 | Shanghai-Digital-Brain-Laboratory/BDM-DB1<br>A large-scale multi-modal pre-trained model | | Emerging |
| 1645 | justADeni/intel-npu-llm<br>A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs) | | Emerging |
| 1646 | snapllm/snapllm<br>🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching | | Emerging |
| 1647 | StupidTrees/SplitLLM<br>Split Learning Simulation Framework for LLMs | | Emerging |
| 1648 | nlpkeg/Know-MRI<br>This is an official code for the [ACL 2025 Demo] paper: Know-MRI: A... | | Emerging |
| 1649 | aws-samples/fine-tuning-llm-with-domain-knowledge<br>This repo walks you through how to use transfer learning to fine tune a LLM... | | Emerging |
| 1650 | jhcho99/CoFormer<br>[CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for... | | Emerging |
| 1651 | chensyCN/llm4ea_official<br>[NeurIPS'24] LLM4EA: Entity Alignment with Noisy Annotations from Large... | | Emerging |
| 1652 | tommyip/mamba2-minimal<br>Minimal Mamba-2 implementation in PyTorch | | Emerging |
| 1653 | whunextgen/LLMindCraft<br>Shaping Language Models with Cognitive Insights | | Emerging |
| 1654 | varunshenoy/super-json-mode<br>Low latency JSON generation using LLMs ⚡️ | | Emerging |
| 1655 | all-things-vits/code-samples<br>Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and... | | Emerging |
| 1656 | readme-generator/alreadyme-ai-serving<br>Serving large language model with transformers | | Emerging |
| 1657 | saltudelft/codefill<br>Contains the code and data for our #ICSE2022 paper titled as "CodeFill:... | | Emerging |
| 1658 | Agora-Lab-AI/Atom<br>a suite of finetuned LLMs for atomically precise function calling 🧪 | | Emerging |
| 1659 | ccmdi/geobench<br>GeoGuessr benchmark for language models | | Emerging |
| 1660 | sandseb123/local-lora-cookbook<br>Fine-tune a local LLM on your own app's data in 15 minutes. Runs entirely... | | Emerging |
| 1661 | canjiali/PARADE<br>code and data to facilitate BERT/ELECTRA for document ranking. Details refer... | | Emerging |
| 1662 | VachanVY/Transfusion.torch<br>PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse... | | Emerging |
| 1663 | jqwangai/Medical-LLM<br>A Repository of Medical Large Language Models | | Emerging |
| 1664 | yangjianxin1/Firefly<br>Firefly:... | | Emerging |
| 1665 | knagrecha/saturn<br>Saturn accelerates the training of large-scale deep learning models with a... | | Emerging |
| 1666 | gentaiscool/miners<br>MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual... | | Emerging |
| 1667 | nishad/llm-workshop-notebooks<br>Getting Started with Local LLMs - Workshop Notebooks | | Emerging |
| 1668 | iiis-ai/cumulative-reasoning<br>[TMLR] Cumulative Reasoning With Large Language Models... | | Emerging |
| 1669 | RLado/STB-VMM<br>STB-VMM: Swin Transformer Based Video Motion Magnification (official repository) | | Emerging |
| 1670 | xmindflow/MS-Former<br>[MIDL 2023] MS-Former: Multi-Scale Self-Guided Transformer for Medical Image... | | Emerging |
| 1671 | p-nordmann/eqx-llama<br>LLaMA implementation with Jax and Equinox | | Emerging |
| 1672 | OSU-NLP-Group/AmpleGCG<br>AmpleGCG: Learning a Universal and Transferable Generator of Adversarial... | | Emerging |
| 1673 | Harish25/StudyScreeningLanguageModel<br>Core LLM for M.A.R.S. (Model Assisted Review System). Utilizes fine-tuned... | | Emerging |
| 1674 | xiangking/prompt_uie_torch<br>Medical named entity recognition based on PaddleNLP's open-source extractive UIE (torch implementation) | | Emerging |
| 1675 | dirmacs/lancor<br>A Rust client library for llama.cpp's OpenAI-compatible API server | | Emerging |
| 1676 | fangyuan-ksgk/Mini-LLaVA<br>A minimal implementation of LLaVA-style VLM with interleaved image & text &... | | Emerging |
| 1677 | WANGXinyiLinda/concept-based-demonstration-selection<br>Official code of the paper Large Language Models Are Implicitly Topic Models:... | | Emerging |
| 1678 | locuslab/massive-activations<br>Code accompanying the paper "Massive Activations in Large Language Models" | | Emerging |
| 1679 | SeungyounShin/Llama2-Code-Interpreter<br>Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet | | Emerging |
| 1680 | adalkiran/llama-nuts-and-bolts<br>A holistic way of understanding how Llama and its components run in... | | Emerging |
| 1681 | extreme-bert/extreme-bert<br>ExtremeBERT is a toolkit that accelerates the pretraining of customized... | | Emerging |
| 1682 | AlphaPav/mem-kk-logic<br>On Memorization of Large Language Models in Logical Reasoning | | Emerging |
| 1683 | user1342/Tomato<br>LLM steganography with minimum-entropy coupling - Hiding encrypted messages... | | Emerging |
| 1684 | zjohn77/lightning-mlflow-hf<br>Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow | | Emerging |
| 1685 | muhammad-fiaz/finetune-web-ui<br>Finetune Web UI is a user-interface for training and deploying pre-trained models. | | Emerging |
| 1686 | YoannDev90/AlphaLLM<br>An AI Discord Bot generating text and images, advanced features, full... | | Emerging |
| 1687 | Love-Asuka/Etude-LLM<br>The word "Etude" comes from French, where it originally means "study piece" or "practice piece"; in music it refers specifically to a short, concise composition written to improve performance technique. In this project, "Etude... | | Emerging |
| 1688 | alexa/ramen<br>A software for transferring pre-trained English models to foreign languages | | Emerging |
| 1689 | torchspec-project/TorchSpec<br>A PyTorch native library for training speculative decoding models | | Emerging |
| 1690 | LucknowAI/Lucknow-LLM<br>Collecting data for Building Lucknow's first LLM | | Emerging |
| 1691 | kyegomez/AudioMamba<br>Implementation of the paper: "Audio Mamba: Bidirectional State Space Model... | | Emerging |
| 1692 | potamides/uniformers<br>Token-free Language Modeling with ByGPT5 & Friends! | | Emerging |
| 1693 | gyunggyung/LFM2-KoEn-Tuning<br>Fine-tuning LFM2-1.2B for Korean-English bidirectional translation.... | | Emerging |
| 1694 | nrimsky/LM-exp<br>LLM experiments done during SERI MATS - focusing on activation steering /... | | Emerging |
| 1695 | promptslab/LLMtuner<br>FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text) | | Emerging |
| 1696 | cahlen/conversation-dataset-generator<br>Craft conversational datasets (JSONL format with rich metadata) using LLMs.... | | Emerging |
| 1697 | mhw32/prototransformer-public<br>PyTorch implementation for "ProtoTransformer: A Meta-Learning Approach to... | | Emerging |
| 1698 | HomebrewML/HomebrewNLP-torch<br>A case study of efficient training of large language models using commodity hardware. | | Emerging |
| 1699 | mantasu/cs224n<br>Solutions for CS224n (2022) | | Emerging |
| 1700 | lliai/D2MoE<br>D^2-MoE: Delta Decompression for MoE-based LLMs Compression | | Emerging |