graldij/transformer-fusion
Official repository of the "Transformer Fusion with Optimal Transport" paper, published as a conference paper at ICLR 2024.
This project lets machine learning engineers and researchers combine the knowledge of multiple independently trained transformer models into a single, more capable model. Given two or more existing transformer models (e.g., for image classification or natural language processing), it outputs a new 'fused' transformer model that often outperforms its individual parents. This is useful for boosting performance or producing more compact models by merging specialized transformers.
No commits in the last 6 months.
Use this if you need to combine the capabilities of several existing transformer models to achieve better performance or simplify your model deployment.
Not ideal if you are working with non-transformer neural network architectures, as this method is specifically designed for transformers.
Stars
31
Forks
8
Language
Python
License
—
Category
—
Last pushed
Apr 19, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/graldij/transformer-fusion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
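For scripted use, the same endpoint can be called from Python. A minimal sketch, assuming only the URL pattern shown in the curl command above (the JSON field names returned by the API are not documented here, so the fetch itself is left commented):

```python
import json
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def build_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

url = build_url("ml-frameworks", "graldij", "transformer-fusion")
print(url)

# To actually fetch the data (requires network access):
#   import urllib.request
#   payload = json.loads(urllib.request.urlopen(url).read())
```

With a free API key, you would typically pass it as a request header or query parameter; check the API's own documentation for the exact mechanism, as it is not specified on this page.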
Higher-rated alternatives
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...