jianzhnie/MultimodalTookit
Incorporate Image, Text and Tabular Data with HuggingFace Transformers
This toolkit helps you make better predictions or classifications using a mix of data types, like customer reviews (text), product details (numbers), and images. It takes these different kinds of information, processes them, and then outputs a prediction or a category, such as whether a customer will recommend a product or the likelihood of pet adoption. It's for data scientists and machine learning engineers who need to build robust models from diverse datasets.
No commits in the last 6 months.
Use this if you need to build a machine learning model that predicts an outcome or classifies data, and your input data includes a combination of text, images, and traditional numerical or categorical information.
Not ideal if your dataset only contains a single data type (e.g., only text or only tabular numbers) or if you are not working with prediction or classification tasks.
Stars: 12
Forks: —
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 01, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jianzhnie/MultimodalTookit"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
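The same endpoint can also be queried from Python. A minimal sketch using only the standard library; the response schema is not documented on this page, so the JSON is returned as-is rather than assuming any particular fields:

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given GitHub repository."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record as parsed JSON (schema undocumented here)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Network call; counts against the 100 requests/day anonymous limit.
    print(fetch_quality("jianzhnie", "MultimodalTookit"))
```

With an API key, you would presumably pass it as a header or query parameter; the page does not specify the mechanism, so that part is omitted.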
Higher-rated alternatives
dorarad/gansformer
Generative Adversarial Transformers
j-min/VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
invictus717/MetaTransformer
Meta-Transformer for Unified Multimodal Learning
rkansal47/MPGAN
The message passing GAN https://arxiv.org/abs/2106.11535 and generative adversarial particle...
Yachay-AI/byt5-geotagging
Confidence- and ByT5-based geotagging model predicting coordinates from text alone.