salesforce/LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

/ 100

Emerging

LAVIS helps developers and researchers integrate AI models that understand both images and text into their applications. Given an image, it can generate a descriptive caption; given an image and a question, it can answer the question about the visual content. It is designed for AI practitioners building multimodal systems.

11,183 stars. No commits in the last 6 months.

Use this if you are an AI developer or researcher looking to quickly implement and experiment with state-of-the-art vision-language models for tasks like image captioning, visual question answering, or text-to-image generation.

Not ideal if you are an end-user without programming knowledge, as this is a library for building AI solutions, not a ready-to-use application.
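
For the captioning and visual question answering tasks mentioned above, a minimal sketch might look like the following. It assumes `torch`, `Pillow`, and the `lavis` package are installed, and it uses the `load_model_and_preprocess` entry point with the `blip_caption` / `base_coco` and `blip_vqa` / `vqav2` model names shown in the LAVIS README. The helper names `caption_image` and `answer_question` are hypothetical wrappers introduced only for illustration, not part of the library.

```python
"""Sketch: image captioning and VQA with LAVIS.

Assumes torch, Pillow, and lavis are installed. The helper names below
are hypothetical and exist only for this illustration.
"""


def _load_image(image_path):
    # Imported lazily so the module can be imported without Pillow present.
    from PIL import Image

    return Image.open(image_path).convert("RGB")


def caption_image(image_path, device="cpu"):
    """Return the captions generated for a single image."""
    from lavis.models import load_model_and_preprocess

    # Load a BLIP captioning model together with its image preprocessors.
    model, vis_processors, _ = load_model_and_preprocess(
        name="blip_caption", model_type="base_coco", is_eval=True, device=device
    )
    # Preprocess the image and add a batch dimension.
    image = vis_processors["eval"](_load_image(image_path)).unsqueeze(0).to(device)
    return model.generate({"image": image})


def answer_question(image_path, question, device="cpu"):
    """Return the answers generated for a question about a single image."""
    from lavis.models import load_model_and_preprocess

    # Load a BLIP VQA model with both image and text preprocessors.
    model, vis_processors, txt_processors = load_model_and_preprocess(
        name="blip_vqa", model_type="vqav2", is_eval=True, device=device
    )
    image = vis_processors["eval"](_load_image(image_path)).unsqueeze(0).to(device)
    text = txt_processors["eval"](question)
    return model.predict_answers(
        samples={"image": image, "text_input": text}, inference_method="generate"
    )
```

A call such as `caption_image("photo.jpg")` (where `photo.jpg` is a placeholder path) is expected to fetch pretrained weights on first use, so the first run may be slow. Loading the model inside each helper keeps the sketch short; real code would likely load it once and reuse it.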

visual-question-answering image-captioning multimodal-ai-development text-to-image-generation ai-model-benchmarking

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25


Stars: 11,183

Forks: 1,101

Language: Jupyter Notebook

License: BSD-3-Clause

Higher-rated alternatives

rom1504/img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M...

devrimcavusoglu/pybboxes

Light weight toolkit for bounding boxes providing conversion between bounding box types and...

PyRetri/PyRetri

Open source deep learning based unsupervised image retrieval toolbox built on PyTorch🔥

Particle1904/DatasetHelpers

Dataset Helper program to automatically select, re scale and tag Datasets (composed of image and...

haltakov/natural-language-image-search

Search photos on Unsplash using natural language
