HyperGAI/HPT
HPT - Open Multimodal LLMs from HyperGAI
This project provides pre-trained AI models that can understand both text and images simultaneously. You input an image and a text prompt, and the model processes them together to answer questions or generate insights about the visual content. It's designed for AI developers or researchers who want to build applications that interpret complex visual scenes combined with linguistic queries.
313 stars. No commits in the last 6 months.
Use this if you are an AI developer who wants to integrate or evaluate multimodal large language models that interpret both visual and textual information, especially for applications targeting devices with limited computing power.
Not ideal if you are an end user without programming experience looking for a ready-to-use application, or if your task involves only text or only images rather than both.
Stars: 313
Forks: 22
Language: Python
License: Apache-2.0
Last pushed: Jun 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/HyperGAI/HPT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
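The curl line above can also be wrapped in a few lines of Python. This is a minimal sketch: the path shape (/quality/{category}/{owner}/{repo}) and a JSON response body are assumptions inferred from the single curl example; the API's actual response fields are not documented here.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repo, mirroring the curl example above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """GET the endpoint and decode the body (assumes the API returns JSON)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

# Example (performs a network request):
#   data = fetch_quality("generative-ai", "HyperGAI", "HPT")
```

Within the free tier, no authentication header is needed; a key, once obtained, would presumably be passed per the API's own docs.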
Higher-rated alternatives
- NVIDIA-NeMo/NeMo: A scalable generative AI framework built for researchers and developers working on Large...
- alexiglad/EBT: PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
- vlm-run/vlmrun-hub: A hub for various industry-specific schemas to be used with VLMs.
- yash9439/Falcon-Local-AI-Model: Explore this GitHub repository housing 3 versions of Falcon code for text generation. Each...
- bastien-muraccioli/svlr: SVLR: Scalable, Training-Free Visual Language Robotics: a modular multi-model framework for...