ylsung/VL_adapter

PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022)

Score: 39 / 100 (Emerging)

This project helps machine learning engineers and researchers adapt large pre-trained vision-and-language models to new image-text or video-text tasks. It takes an existing model such as VL-T5 or VL-BART along with your dataset (e.g., VQAv2, MSCOCO, TVQA) and produces a task-specialized model by training small adapter modules inside a frozen backbone, so only a small fraction of the parameters are updated. This is ideal for those working on multimodal AI applications.
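
The core technique is inserting small trainable adapter modules into the transformer layers of a frozen backbone. Below is a minimal sketch of a Houlsby-style bottleneck adapter; the class name and bottleneck size are illustrative assumptions, not the repo's exact code.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, apply a nonlinearity, up-project, add a residual."""

    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)  # d_model -> bottleneck
        self.up = nn.Linear(bottleneck, d_model)    # bottleneck -> d_model
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual keeps the frozen backbone's output intact at initialization.
        return x + self.up(self.act(self.down(x)))

Only the adapter weights (and, typically, layer norms) are trained; the rest of the backbone stays frozen.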

210 stars. No commits in the last 6 months.

Use this if you need to fine-tune large vision-and-language models for new downstream tasks without the computational cost of training all model parameters.
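
Concretely, "fewer parameters to train" means freezing the backbone and re-enabling gradients only for adapter weights. A hedged sketch with a toy model; the name-based filter is an assumption (the repo's training scripts control this via command-line flags):

import torch.nn as nn

# Toy stand-in for a pre-trained backbone with adapter submodules.
model = nn.ModuleDict({
    "backbone": nn.Linear(768, 768),
    "adapter_down": nn.Linear(768, 64),
    "adapter_up": nn.Linear(64, 768),
})

# Train only parameters whose names mark them as adapter weights.
for name, param in model.named_parameters():
    param.requires_grad = "adapter" in name

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,} ({trainable / total:.1%})")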

Not ideal if you are looking for a ready-to-use, off-the-shelf application, or are not comfortable with model training and running scripts.

multimodal-ai vision-language-models transfer-learning natural-language-processing computer-vision
Stale (6 months) · No package · No dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 13 / 25

Stars: 210
Forks: 17
Language: Python
License: MIT
Last pushed: Dec 18, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ylsung/VL_adapter"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
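
For scripted access, the same endpoint can be called from Python. A minimal sketch using the requests library; the response schema is not documented here, so treat specific field names as unknown until you inspect the JSON:

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/ylsung/VL_adapter"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # surface HTTP errors (e.g., rate limiting) early
data = resp.json()
print(data)  # inspect the payload before depending on specific fields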