HyperGAI/HPT
HPT - Open Multimodal LLMs from HyperGAI
This project provides pre-trained AI models that can understand both text and images simultaneously. You input an image and a text prompt, and the model processes them together to answer questions or generate insights about the visual content. It's designed for AI developers or researchers who want to build applications that interpret complex visual scenes combined with linguistic queries.
313 stars. No commits in the last 6 months.
Use this if you are an AI developer who wants to integrate or evaluate multimodal large language models that interpret both visual and textual information, especially for applications targeting devices with limited computing power.
Not ideal if you are an end user without programming experience looking for a ready-to-use application, or if your task involves only text or only images rather than both.
Stars: 313
Forks: 22
Language: Python
License: Apache-2.0
Last pushed: Jun 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/HyperGAI/HPT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
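The curl line above can also be wrapped in a few lines of Python. This is a minimal sketch: the path shape (/quality/{category}/{owner}/{repo}) and a JSON response body are assumptions inferred from the single curl example; the API's actual response fields are not documented here.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repo, mirroring the curl example above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """GET the endpoint and decode the body (assumes the API returns JSON)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

# Example (performs a network request):
#   data = fetch_quality("generative-ai", "HyperGAI", "HPT")
```

Within the free tier, no authentication header is needed; a key, once obtained, would presumably be passed per the API's own docs.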
Higher-rated alternatives
- NVIDIA-NeMo/NeMo: A scalable generative AI framework built for researchers and developers working on Large...
- alexiglad/EBT: PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
- vlm-run/vlmrun-hub: A hub for various industry-specific schemas to be used with VLMs.
- yash9439/Falcon-Local-AI-Model: Explore this GitHub repository housing 3 versions of Falcon code for text generation. Each...
- bastien-muraccioli/svlr: SVLR: Scalable, Training-Free Visual Language Robotics: a modular multi-model framework for...