RobinDong/tiny_multimodal

Tiny and simple implementation of multimodal models

Score: 20 / 100 (Experimental)

This project helps machine learning engineers and researchers experiment quickly with foundational multimodal models. It takes image-text datasets as input and lets you train, fine-tune, or deploy compact models that understand both images and text. It suits people who develop and optimize AI models and need efficient experimentation.

No commits in the last 6 months.

Use this if you are an AI/ML developer or researcher looking to explore multimodal models on standard consumer-grade GPUs without needing massive computational resources.

Not ideal if you are an end-user needing an out-of-the-box solution for image analysis or text generation, or if you require full-scale, production-ready large multimodal models.

multimodal-ai model-training deep-learning ai-prototyping computer-vision
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25
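
The four category scores listed above add up to the overall 20 / 100 figure. The page does not state the formula, so the plain sum below is an assumption based only on the numbers shown here; a minimal sketch:

```python
# Category scores as listed on this page; each category is out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 4,
    "Maturity": 16,
    "Community": 0,
}

# Assumption: the overall score is the unweighted sum of the four categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")  # prints "20 / 100"
```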


Stars: 8
Forks:
Language: Python
License: MIT
Last pushed: Aug 20, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/RobinDong/tiny_multimodal"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
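
The same request can be made from Python using only the standard library. This is a rough sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repo) is inferred from the single curl URL above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper; layout inferred from the curl example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one repository.

    Assumption: the endpoint returns a JSON body. The schema is not
    described on this page, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "RobinDong", "tiny_multimodal")` requests the same URL as the curl command. Unauthenticated calls count against the 100 requests/day limit, so cache the result if you poll repeatedly.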