young-geng/m3ae_public
Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation
This project implements Multimodal Masked Autoencoders (M3AE) in JAX/Flax for pre-training models that understand both images and text. It takes datasets containing images, text, or paired image-text data and produces powerful pre-trained models, which can then be fine-tuned for specialized downstream tasks such as image classification.
107 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher or engineer looking to train state-of-the-art multimodal AI models efficiently, especially on large datasets using cloud GPUs or TPUs.
Not ideal if you are a business user looking for a ready-to-use application or a developer who needs a simple API for common image/text processing tasks.
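To make the masked-autoencoding idea behind M3AE concrete, here is a conceptual NumPy sketch (an illustration of the general technique, not the repo's actual implementation): image-patch and text-token embeddings are combined into a single sequence, a large fraction of positions is masked, and only the visible positions are fed to the encoder while the masked ones become reconstruction targets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs: 16 image-patch embeddings and 8 text-token embeddings, dim 32.
# (Sizes and the 75% mask ratio are illustrative choices.)
image_patches = rng.normal(size=(16, 32))
text_tokens = rng.normal(size=(8, 32))

# Concatenate both modalities into one joint sequence.
sequence = np.concatenate([image_patches, text_tokens], axis=0)  # shape (24, 32)

# Randomly mask 75% of positions; only visible positions go to the encoder.
mask_ratio = 0.75
n = sequence.shape[0]
n_visible = int(n * (1 - mask_ratio))
perm = rng.permutation(n)
visible_idx = perm[:n_visible]
masked_idx = perm[n_visible:]

encoder_input = sequence[visible_idx]          # what the encoder sees
reconstruction_targets = sequence[masked_idx]  # what the decoder must predict

print(encoder_input.shape, reconstruction_targets.shape)  # (6, 32) (18, 32)
```

During pre-training, the model is optimized to reconstruct the masked positions from the visible ones; because masking most of the sequence shrinks the encoder's input, this style of training scales well to large datasets.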
Stars: 107
Forks: 12
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/young-geng/m3ae_public"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
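The same endpoint can also be called from Python. A minimal standard-library sketch (only the endpoint URL comes from this page; the helper function and response handling are illustrative assumptions, and the response schema is not documented here):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, repo: str) -> str:
    """Build the quality-API URL for a repo in a given category."""
    return f"{BASE}/{category}/{repo}"

url = quality_url("transformers", "young-geng/m3ae_public")
print(url)

# Uncomment to fetch live data (100 requests/day without a key):
# data = json.load(urlopen(url))
# print(data)
```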
Higher-rated alternatives
ThilinaRajapakse/simpletransformers
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling,...
jsksxs360/How-to-use-Transformers
A quick-start tutorial for the Transformers library (in Chinese)
google/deepconsensus
DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences...
Denis2054/Transformers-for-NLP-2nd-Edition
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning,...
abhimishra91/transformers-tutorials
GitHub repo with tutorials on fine-tuning transformers for different NLP tasks