AILab-CVC/M2PT

[CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

32 / 100 (Emerging)

This project helps AI researchers and practitioners enhance their machine learning models for tasks like image, video, point cloud, or audio recognition. It takes an existing Transformer model trained on one type of data (e.g., images) and improves its performance by integrating insights from a separate Transformer trained on entirely different, unrelated data (e.g., audio). The result is a more robust and accurate model for the original task without needing new task-specific data or incurring extra processing costs during use.
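One way the "no extra processing costs during use" property can arise is if the auxiliary model's weights are folded into the target model's weights once training is done. The sketch below is a minimal illustration of that idea in plain Python. It assumes a simple weighted-sum merge of two same-shaped linear layers. The function names, the toy matrices, and the fixed `scale` value are all hypothetical and are not taken from the M2PT repository.

```python
# Illustrative sketch only: the weighted-sum form and every name here are
# assumptions for explanation, not code from the M2PT repository.

def matvec(weight, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def two_branch_forward(w_target, w_aux, scale, x):
    """Training-time view: target layer output plus a scaled auxiliary layer output."""
    y_target = matvec(w_target, x)
    y_aux = matvec(w_aux, x)
    return [t + scale * a for t, a in zip(y_target, y_aux)]

def merge_weights(w_target, w_aux, scale):
    """Fold the auxiliary weights into the target weights for inference."""
    return [
        [t + scale * a for t, a in zip(row_t, row_a)]
        for row_t, row_a in zip(w_target, w_aux)
    ]

w_target = [[1.0, 2.0], [0.5, -1.0]]  # hypothetical layer from the image model
w_aux = [[0.2, 0.0], [-0.4, 0.6]]     # hypothetical layer from the audio model
scale = 0.5                           # hypothetical; would be learned in practice
x = [1.0, 3.0]

y_train = two_branch_forward(w_target, w_aux, scale, x)
y_infer = matvec(merge_weights(w_target, w_aux, scale), x)

# Both paths give the same output, but the merged path uses a single matrix.
print(y_train, y_infer)
```

Because the merged layer has the same shape as the original target layer, the deployed model in this sketch runs exactly as many multiplications as it did before the auxiliary weights were added.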

101 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer looking to boost the accuracy of your Transformer models for specific tasks like image or audio classification by leveraging knowledge from models trained on other data types.

Not ideal if you are looking for a ready-to-use application rather than a method for improving existing deep learning models, or if your tasks do not involve Transformer architectures.

deep-learning-research computer-vision audio-analysis 3d-point-cloud-processing multimodal-ai
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 7 / 25


Stars: 101
Forks: 5
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AILab-CVC/M2PT"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
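The same request as the curl command above can be made from Python with only the standard library. This is a hedged sketch: the helper names are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown here, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is an inferred layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"

def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality data. Assumes the endpoint returns a JSON body."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

url = build_quality_url("transformers", "AILab-CVC", "M2PT")
print(url)
```

Calling `fetch_quality("transformers", "AILab-CVC", "M2PT")` performs the network request. It is left out of the sketch so the block runs offline and does not count against the daily limit.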