graldij/transformer-fusion
Official repository of the "Transformer Fusion with Optimal Transport" paper, published as a conference paper at ICLR 2024.
This project lets machine learning engineers and researchers combine the knowledge of multiple independently trained transformer models into a single, more capable model. Given two or more existing transformer models (e.g., for image classification or natural language processing), it outputs a new 'fused' transformer model that often outperforms its individual parents. This is useful for boosting performance or producing more compact models by merging specialized transformers.
No commits in the last 6 months.
Use this if you need to combine the capabilities of several existing transformer models to achieve better performance or simplify your model deployment.
Not ideal if you are working with non-transformer neural network architectures, as this method is specifically designed for transformers.
Stars
31
Forks
8
Language
Python
License
—
Category
—
Last pushed
Apr 19, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/graldij/transformer-fusion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
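For scripted use, the same endpoint can be called from Python. A minimal sketch, assuming only the URL pattern shown in the curl command above (the JSON field names returned by the API are not documented here, so the fetch itself is left commented):

```python
import json
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def build_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

url = build_url("ml-frameworks", "graldij", "transformer-fusion")
print(url)

# To actually fetch the data (requires network access):
#   import urllib.request
#   payload = json.loads(urllib.request.urlopen(url).read())
```

With a free API key, you would typically pass it as a request header or query parameter; check the API's own documentation for the exact mechanism, as it is not specified on this page.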
Higher-rated alternatives
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...