graldij/transformer-fusion

Official repository of the "Transformer Fusion with Optimal Transport" paper, published as a conference paper at ICLR 2024.

Overall score: 32 / 100 (Emerging)

This project helps machine learning engineers and researchers combine the knowledge from multiple independently trained transformer models into a single, more powerful model. You input two or more existing transformer models (like those used for image classification or natural language processing) and it outputs a new, 'fused' transformer model that often outperforms its individual parents. This is useful for improving model performance or creating more compact models by merging different specialized transformers.
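As a rough intuition for what "fusion" means here, the following toy sketch aligns the neurons of one layer to another with a hard optimal-transport plan (a permutation found by the Hungarian algorithm) before averaging. This is only an illustration of OT-based neuron alignment in general; the repository's actual method handles transformer-specific structure (attention heads, residual streams, soft transport plans) and this sketch is not its implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layers(W_a: np.ndarray, W_b: np.ndarray) -> np.ndarray:
    """Toy one-layer fusion: align the neurons (rows) of W_b to those of
    W_a via a hard optimal-transport plan, then average the aligned weights.

    With uniform mass on the neurons, the optimal hard transport plan is a
    permutation, which we find by minimizing total Euclidean cost.
    """
    # cost[i, j] = distance between neuron i of model A and neuron j of model B
    cost = np.linalg.norm(W_a[:, None, :] - W_b[None, :, :], axis=-1)
    _, cols = linear_sum_assignment(cost)
    # Permute model B's neurons into model A's ordering, then average.
    return 0.5 * (W_a + W_b[cols])
```

A sanity check of the idea: if the two layers are identical up to a neuron permutation, naive averaging would blur them, but alignment-then-averaging recovers the shared layer exactly.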

No commits in the last 6 months.

Use this if you need to combine the capabilities of several existing transformer models to achieve better performance or simplify your model deployment.

Not ideal if you are working with non-transformer neural network architectures, as this method is specifically designed for transformers.

deep-learning model-fusion natural-language-processing computer-vision model-compression
No License · Stale (6 months) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 7 / 25
Maturity: 8 / 25
Community: 17 / 25


Stars: 31
Forks: 8
Language: Python
License: None
Last pushed: Apr 19, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/graldij/transformer-fusion"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
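A minimal Python sketch of calling this endpoint from code. The URL pattern is taken directly from the curl example above; the response schema is not documented here, so the helper simply returns the parsed JSON as-is.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-API URL for a repository.

    Pattern inferred from the curl example: {BASE}/{category}/{owner}/{repo}.
    """
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and parse the quality record for one repository.

    No API key is needed for up to 100 requests/day (per the note above).
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, `fetch_quality("ml-frameworks", "graldij", "transformer-fusion")` hits the same URL as the curl command shown above.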