yuhui-zh15/C3

Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)

Score: 18 / 100 · Experimental

This project helps AI researchers and practitioners build applications that process and generate different types of media, such as image-to-text, audio-to-text, or text-to-image. It takes existing pre-trained models and uni-modal data (like a collection of images without their descriptions) and produces models capable of understanding or generating content across modalities. It is aimed at those developing AI systems in computer vision, natural language processing, and audio analysis who struggle to acquire paired multi-modal datasets.

No commits in the last 6 months.

Use this if you need to develop AI models that translate between different data types (like generating text from an image or an image from text) but only have access to large amounts of single-modality data.

Not ideal if you already have extensive paired multi-modal datasets or are working on tasks that don't involve translating between different media types.

Tags: AI model development · cross-modal learning · zero-shot learning · computer vision · natural language processing
No License · Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 7 / 25
Maturity: 8 / 25
Community: 3 / 25

Stars: 34
Forks: 1
Language: Jupyter Notebook
License: None
Last pushed: Oct 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/yuhui-zh15/C3"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
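
For scripted access, here is a minimal Python sketch using only the standard library. It assumes the endpoint returns a JSON object; since the response schema is not documented here, it simply prints whatever top-level fields come back rather than assuming field names.

    import json
    import urllib.request

    # Endpoint taken from the curl example above; no API key needed
    # for up to 100 requests/day, per the note above.
    URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/yuhui-zh15/C3"

    with urllib.request.urlopen(URL, timeout=10) as resp:
        data = json.load(resp)  # assumes a JSON object in the response body

    # The response schema is undocumented here, so dump the top-level
    # fields instead of hard-coding names.
    for key, value in data.items():
        print(f"{key}: {value}")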