Ekoda/SoftMoE
A Soft Mixture of Experts Vision Transformer, addressing the sparse-MoE limitations highlighted by Puigcerver et al. (2023).
This project implements a Soft Mixture of Experts Vision Transformer for developers working with large-scale deep learning models. Given a model configuration and image data, it produces a trained or fine-tuned vision transformer while avoiding common failure modes of traditional sparse Mixture of Experts (MoE) architectures, such as token dropping and training instability. It is aimed at machine learning engineers and researchers building or experimenting with vision models.
No commits in the last 6 months.
Use this if you are developing computer vision models and want to leverage Mixture of Experts architectures while avoiding the common training and scaling limitations of sparse MoE models.
Not ideal if you are looking for a pre-trained model for immediate use or are not comfortable working with deep learning model implementations at a code level.
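The core idea the description alludes to is that Soft MoE replaces hard, sparse token-to-expert routing with soft assignment: every expert slot receives a softmax-weighted mix of all tokens, and every token receives a softmax-weighted mix of all slot outputs, so no token is ever dropped. A minimal NumPy sketch of that dispatch/combine step follows; all names, shapes, and the toy linear "experts" are illustrative assumptions, not this repository's API.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_layer(X, phi, experts):
    """One Soft MoE layer in the style of Puigcerver et al. (2023); sketch only.

    X:       (n_tokens, d) input tokens
    phi:     (d, n_experts * slots_per_expert) learned slot parameters
    experts: list of callables, each mapping (slots_per_expert, d) -> same shape
    """
    n_experts = len(experts)
    logits = X @ phi                          # (n_tokens, n_slots)
    dispatch = softmax(logits, axis=0)        # per slot: weights over all tokens
    combine = softmax(logits, axis=1)         # per token: weights over all slots
    slots_in = dispatch.T @ X                 # each slot is a soft mix of tokens
    s = slots_in.shape[0] // n_experts
    slots_out = np.concatenate(
        [experts[i](slots_in[i * s:(i + 1) * s]) for i in range(n_experts)],
        axis=0)
    return combine @ slots_out                # every token gets a weighted output

# Toy usage with random linear "experts" (illustrative, not trained weights).
rng = np.random.default_rng(0)
n, d, e, s = 8, 16, 4, 2
X = rng.standard_normal((n, d))
phi = rng.standard_normal((d, e * s))
experts = [lambda z, W=rng.standard_normal((d, d)) / np.sqrt(d): z @ W
           for _ in range(e)]
Y = soft_moe_layer(X, phi, experts)
print(Y.shape)  # (8, 16): one output per input token, none dropped
```

Because both weight matrices come from dense softmaxes, the layer is fully differentiable and avoids the load-balancing losses and capacity limits that make sparse MoE training unstable.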
Stars: 16
Forks: 2
Language: Python
License: —
Category:
Last pushed: Aug 13, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Ekoda/SoftMoE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Westlake-AI/openmixup
CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark
YU1ut/MixMatch-pytorch
Code for "MixMatch - A Holistic Approach to Semi-Supervised Learning"
kamata1729/QATM_pytorch
PyTorch implementation of QATM: Quality-Aware Template Matching for Deep Learning
nttcslab/msm-mae
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
rgeirhos/generalisation-humans-DNNs
Data, code & materials from the paper "Generalisation in humans and deep neural networks" (NeurIPS 2018)