salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
LAVIS helps developers and researchers integrate AI models that understand both images and text into their applications. Given an image, it can generate a descriptive caption; given an image and a question, it can answer the question about the visual content. It is designed for AI practitioners building multimodal systems.
11,183 stars. No commits in the last 6 months.
Use this if you are an AI developer or researcher looking to quickly implement and experiment with state-of-the-art vision-language models for tasks like image captioning, visual question answering, or text-to-image generation.
Not ideal if you are an end-user without programming knowledge, as this is a library for building AI solutions, not a ready-to-use application.
Stars: 11,183
Forks: 1,101
Language: Jupyter Notebook
License: BSD-3-Clause
Category:
Last pushed: Nov 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/salesforce/LAVIS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
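The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it builds the URL shown in the curl command above and fetches it. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so adjust the parsing if the response differs.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. The page does not document
    the response format, so this parsing step may need to change.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the actual request.
    print(quality_url("ml-frameworks", "salesforce", "LAVIS"))
```

Because unauthenticated access is limited to 100 requests/day, a script that polls many repositories would likely want to cache responses locally rather than re-fetch them on every run.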
Higher-rated alternatives
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M...
devrimcavusoglu/pybboxes
Lightweight toolkit for bounding boxes providing conversion between bounding box types and...
PyRetri/PyRetri
Open source deep learning based unsupervised image retrieval toolbox built on PyTorch🔥
Particle1904/DatasetHelpers
Dataset Helper program to automatically select, rescale and tag datasets (composed of image and...
haltakov/natural-language-image-search
Search photos on Unsplash using natural language