invictus717/MetaTransformer
Meta-Transformer for Unified Multimodal Learning
This project offers a unified approach to analyzing diverse datasets, from financial market data and weather patterns to medical images and social media feeds. It takes in various data types like text, images, videos, audio, and sensor readings, and outputs structured insights for tasks like classification, detection, or segmentation. Traders, climate scientists, medical professionals, and autonomous driving engineers can use this to make sense of complex, multi-source information.
1,654 stars. No commits in the last 6 months.
Use this if you need a single, powerful tool to process and understand information from many different sources and formats, such as text, images, videos, audio, and sensor data.
Not ideal if your work exclusively involves a single, very specific type of data and you don't anticipate integrating information from other modalities.
Stars: 1,654
Forks: 117
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 05, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/invictus717/MetaTransformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
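For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state. How an API key is supplied is not documented here, so the sketch covers only the keyless request.

```python
import json
import urllib.request
from urllib.parse import quote

# Endpoint prefix taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_quality_url(owner: str, repo: str) -> str:
    """Return the quality endpoint URL for a given owner/repo pair (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the path.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Send the keyless GET request and decode the body.

    Assumption: the response body is JSON; the page does not document the schema.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("invictus717", "MetaTransformer")` issues the same GET as the curl command. Each call counts against the 100 requests/day keyless limit, so caching the result locally is worth considering.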
Higher-rated alternatives
dorarad/gansformer
Generative Adversarial Transformers
j-min/VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
rkansal47/MPGAN
The message passing GAN https://arxiv.org/abs/2106.11535 and generative adversarial particle...
Yachay-AI/byt5-geotagging
Confidence and ByT5-based geotagging model predicting coordinates from text alone.
sisinflab/Ducho
Ducho is a Python framework aimed to extract multimodal features used in multimodal...