All Transformer Models

7,795 models ranked by quality score

Showing 1601–1700 of 7,795
# Model Score Tier
1601 WisconsinAIVision/ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary...

38
Emerging
1602 EagleW/Scientific-Inspiration-Machines-Optimized-for-Novelty

Official implementation of the ACL 2024: Scientific Inspiration Machines...

38
Emerging
1603 HOLYKEYZ/model-unfetter

The production engine for directional ablation. Unalign / remove models...

38
Emerging
1604 NisaarAgharia/Indian-LawyerGPT

Fine-Tuning Falcon-7B, LLAMA 2 with QLoRA to create an advanced AI model...

38
Emerging
1605 Yxxxb/VoCo-LLaMA

[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of...

38
Emerging
1606 rasbt/pytorch-memory-optim

This code repository contains the code used for my "Optimizing Memory Usage...

38
Emerging
1607 RishabSA/interp-refusal-tokens

We study whether categorical refusal tokens enable controllable and...

38
Emerging
1608 Jagatmohan46/tiny-recursive-model

🚀 Implement the Tiny Recursive Model (TRM) for improved performance in...

38
Emerging
1609 ParCIS/Chimera

Chimera: bidirectional pipeline parallelism for efficiently training...

38
Emerging
1610 hscspring/llama.np

Inference for Llama/Llama2/Llama3 models in NumPy

38
Emerging
1611 BarCodeReader/SelfReformer

[TMM-2023] Official implementation of "Towards Complete and Detail-Preserved...

38
Emerging
1612 The-Swarm-Corporation/Hyena-Y

A PyTorch implementation of the Hyena-Y model, a convolution-based...

38
Emerging
1613 gnai-creator/aletheion-llm-v2

Decoder-only LLM with integrated epistemic tomography. Knows what it doesn't know.

38
Emerging
1614 readytensor/rt-llm-eng-cert-week3

Week 3 of LLM Engineering Certification: Learn to fine-tune large language...

38
Emerging
1615 ictnlp/BayLing

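BayLing (“百聆”) is a LLaMA-based English/Chinese large language model enhanced with language alignment. It has strong English/Chinese capabilities and, across multiple multilingual and general-task benchmarks, achieves ChatGPT...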

38
Emerging
1616 ChanMeng666/interactive-story-generator

【Join our constellation of stargazers!⭐️】An interactive AI-powered story...

38
Emerging
1617 dropbox/grallama-panel

GraLLAMA panel for LLAMA data

38
Emerging
1618 matlab-deep-learning/transformer-networks-for-time-series-prediction

Deep Learning in Quantitative Finance: Transformer Networks for Time Series...

38
Emerging
1619 sshh12/llm_optimize

LLM Optimize is a proof-of-concept library for doing LLM (large language...

38
Emerging
1620 thruthseeker/LionLock_FDE_OSS

Open source fatigue detection engine for large language models with trust overlay

38
Emerging
1621 VITA-Group/LiGO

[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer...

38
Emerging
1622 mtuann/llm-updated-papers

Papers related to Large Language Models in all top venues

38
Emerging
1623 ariannamethod/doe

DoE Janus Architecture: Democracy of Experts

38
Emerging
1624 flozi00/atra

An open source NLP as a service project focused on providing state of the...

37
Emerging
1625 ximinng/LLM4SVG

[CVPR 2025] Official implementation for "Empowering LLMs to Understand and...

37
Emerging
1626 GT-RIPL/robo-vln

Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics...

37
Emerging
1627 awinml/llama-cpp-python-bindings

Run fast LLM Inference using Llama.cpp in Python

37
Emerging
1628 litus-ai/classy

classy is a simple-to-use library for building high-performance Machine...

37
Emerging
1629 K024/llm-sharp

Language models in C#

37
Emerging
1630 coderonion/awesome-llm-and-aigc

🚀🚀🚀A collection of some awesome public projects about Large Language...

37
Emerging
1631 voidism/DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers...

37
Emerging
1632 declare-lab/exemplary-empathy

This repository contains the source codes of the paper -- Exemplars-guided...

37
Emerging
1633 ImKeTT/AdaVAE

[Preprint] AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling...

37
Emerging
1634 TIGER-AI-Lab/VL-Rethinker

The official code of "VL-Rethinker: Incentivizing Self-Reflection of...

37
Emerging
1635 HKUNLP/icl-ceil

[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”.

37
Emerging
1636 mohyunho/NAS_transformer

Evolutionary Neural Architecture Search on Transformers for RUL Prediction

37
Emerging
1637 DAMO-NLP-SG/LLM-Multilingual-Knowledge-Boundaries

[ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across...

37
Emerging
1638 casinca/LLM-quest

Verbose implementations of LLMs architectures, techniques and research...

37
Emerging
1639 CognitiveAISystems/RATE

[ICLR 2026] Official implementation of Recurrent Action Transformer with...

37
Emerging
1640 kyegomez/MGQA

The open source implementation of the multi grouped query attention by the...

37
Emerging
1641 joelbarmettlerUZH/ConceptFormer

Towards Finding the Essence of Everything in Large Language Models

37
Emerging
1642 iil-postech/semantic-attention

Official implementation of "Attention-aware semantic communications for...

37
Emerging
1643 pagraf/Seabed-Net

Quick start guide for Seabed-Net

37
Emerging
1644 Shanghai-Digital-Brain-Laboratory/BDM-DB1

A large-scale multi-modal pre-trained model

37
Emerging
1645 justADeni/intel-npu-llm

A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)

37
Emerging
1646 snapllm/snapllm

🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching

37
Emerging
1647 StupidTrees/SplitLLM

Split Learning Simulation Framework for LLMs

37
Emerging
1648 nlpkeg/Know-MRI

This is an official code for the [ACL 2025 Demo] paper: Know-MRI: A...

37
Emerging
1649 aws-samples/fine-tuning-llm-with-domain-knowledge

This repo walks you through how to use transfer learning to fine tune a LLM...

37
Emerging
1650 jhcho99/CoFormer

[CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for...

37
Emerging
1651 chensyCN/llm4ea_official

[NeurIPS'24] LLM4EA: Entity Alignment with Noisy Annotations from Large...

37
Emerging
1652 tommyip/mamba2-minimal

Minimal Mamba-2 implementation in PyTorch

37
Emerging
1653 whunextgen/LLMindCraft

Shaping Language Models with Cognitive Insights

37
Emerging
1654 varunshenoy/super-json-mode

Low latency JSON generation using LLMs ⚡️

37
Emerging
1655 all-things-vits/code-samples

Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and...

37
Emerging
1656 readme-generator/alreadyme-ai-serving

Serving large language model with transformers

37
Emerging
1657 saltudelft/codefill

Contains the code and data for our #ICSE2022 paper titled as "CodeFill:...

37
Emerging
1658 Agora-Lab-AI/Atom

a suite of finetuned LLMs for atomically precise function calling 🧪

37
Emerging
1659 ccmdi/geobench

GeoGuessr benchmark for language models

37
Emerging
1660 sandseb123/local-lora-cookbook

Fine-tune a local LLM on your own app's data in 15 minutes. Runs entirely...

37
Emerging
1661 canjiali/PARADE

Code and data to facilitate BERT/ELECTRA for document ranking. Details refer...

37
Emerging
1662 VachanVY/Transfusion.torch

PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse...

37
Emerging
1663 jqwangai/Medical-LLM

A Repository of Medical Large Language Models

37
Emerging
1664 yangjianxin1/Firefly

Firefly:...

37
Emerging
1665 knagrecha/saturn

Saturn accelerates the training of large-scale deep learning models with a...

37
Emerging
1666 gentaiscool/miners

MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual...

37
Emerging
1667 nishad/llm-workshop-notebooks

Getting Started with Local LLMs - Workshop Notebooks

37
Emerging
1668 iiis-ai/cumulative-reasoning

[TMLR] Cumulative Reasoning With Large Language Models...

37
Emerging
1669 RLado/STB-VMM

STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)

37
Emerging
1670 xmindflow/MS-Former

[MIDL 2023] MS-Former: Multi-Scale Self-Guided Transformer for Medical Image...

37
Emerging
1671 p-nordmann/eqx-llama

LLaMA implementation with Jax and Equinox

37
Emerging
1672 OSU-NLP-Group/AmpleGCG

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial...

37
Emerging
1673 Harish25/StudyScreeningLanguageModel

Core LLM for M.A.R.S. (Model Assisted Review System). Utilizes fine-tuned...

37
Emerging
1674 xiangking/prompt_uie_torch

Medical named entity recognition based on PaddleNLP's open-source extractive UIE (torch implementation)

37
Emerging
1675 dirmacs/lancor

A Rust client library for llama.cpp's OpenAI-compatible API server

37
Emerging
1676 fangyuan-ksgk/Mini-LLaVA

A minimal implementation of LLaVA-style VLM with interleaved image & text &...

37
Emerging
1677 WANGXinyiLinda/concept-based-demonstration-selection

Official code of the paper Large Language Models Are Implicitly Topic Models:...

37
Emerging
1678 locuslab/massive-activations

Code accompanying the paper "Massive Activations in Large Language Models"

37
Emerging
1679 SeungyounShin/Llama2-Code-Interpreter

Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet

37
Emerging
1680 adalkiran/llama-nuts-and-bolts

A holistic way of understanding how Llama and its components run in...

37
Emerging
1681 extreme-bert/extreme-bert

ExtremeBERT is a toolkit that accelerates the pretraining of customized...

37
Emerging
1682 AlphaPav/mem-kk-logic

On Memorization of Large Language Models in Logical Reasoning

37
Emerging
1683 user1342/Tomato

LLM steganography with minimum-entropy coupling - Hiding encrypted messages...

37
Emerging
1684 zjohn77/lightning-mlflow-hf

Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow

37
Emerging
1685 muhammad-fiaz/finetune-web-ui

Finetune Web UI is a user-interface for training and deploying pre-trained models.

37
Emerging
1686 YoannDev90/AlphaLLM

An AI Discord Bot generating text and images, advanced features, full...

37
Emerging
1687 Love-Asuka/Etude-LLM

"Etude"一词源自法语,原意为"研习曲"或"练习曲",在音乐领域特指为提高演奏技巧而创作的短小精悍的乐曲。在本项目中,"Etude...

37
Emerging
1688 alexa/ramen

A software for transferring pre-trained English models to foreign languages

37
Emerging
1689 torchspec-project/TorchSpec

A PyTorch native library for training speculative decoding models

37
Emerging
1690 LucknowAI/Lucknow-LLM

Collecting data for Building Lucknow's first LLM

37
Emerging
1691 kyegomez/AudioMamba

Implementation of the paper: "Audio Mamba: Bidirectional State Space Model...

37
Emerging
1692 potamides/uniformers

Token-free Language Modeling with ByGPT5 & Friends!

37
Emerging
1693 gyunggyung/LFM2-KoEn-Tuning

Fine-tuning LFM2-1.2B for Korean-English bidirectional translation....

37
Emerging
1694 nrimsky/LM-exp

LLM experiments done during SERI MATS - focusing on activation steering /...

37
Emerging
1695 promptslab/LLMtuner

FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text)

37
Emerging
1696 cahlen/conversation-dataset-generator

Craft conversational datasets (JSONL format with rich metadata) using LLMs....

37
Emerging
1697 mhw32/prototransformer-public

PyTorch implementation for "ProtoTransformer: A Meta-Learning Approach to...

37
Emerging
1698 HomebrewML/HomebrewNLP-torch

A case study of efficient training of large language models using commodity hardware.

37
Emerging
1699 mantasu/cs224n

Solutions for CS224n (2022)

37
Emerging
1700 lliai/D2MoE

D^2-MoE: Delta Decompression for MoE-based LLMs Compression

37
Emerging