howard-hou/VisualRWKV

VisualRWKV is a visually enhanced version of the RWKV language model that enables RWKV to handle a variety of visual tasks.

49 / 100

Emerging

VisualRWKV helps developers integrate visual understanding into language models. It takes images and text as input and produces text responses that interpret the visual information, much as a person would describe or analyze a picture. It is aimed at machine learning engineers and researchers working on multimodal AI applications.

244 stars.

Use this if you are a machine learning engineer or researcher building a visual language model and want to explore the RWKV architecture for handling image-based tasks.

Not ideal if you are an end user looking for a ready-to-use application: this is a foundational model for developers to build on.

multimodal-ai computer-vision natural-language-processing large-language-models ai-model-development

No package. No dependents.

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 13 / 25


Stars: 244

Language: Python

License: Apache-2.0

Higher-rated alternatives

thu-pacman/chitu

High-performance inference framework for large language models, focusing on efficiency,...

sophgo/LLM-TPU

Run generative AI models in sophgo BM1684X/BM1688

NotPunchnox/rkllama

Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...

Deep-Spark/DeepSparkHub

DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...

bentoml/llm-inference-handbook

Everything you need to know about LLM inference
