FoundationVision/Liquid
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
Liquid is a unified multimodal model for creative professionals and researchers: it generates new, high-quality images from text descriptions or existing images, and it can also understand and describe visual content. It is designed for anyone who needs to generate diverse visual content or perform deep image analysis without relying on separate tools for different modalities.
Use this if you need a single, powerful system to both generate images from text and analyze existing images to extract information or descriptions.
Not ideal if you primarily need simple image editing or extremely fast, lightweight image generation without complex understanding capabilities.
Stars: 640
Forks: 35
Language: Python
License: MIT
Category:
Last pushed: Nov 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/FoundationVision/Liquid"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
Paranioar/Awesome_Matching_Pretraining_Transfering
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification),...
Yangyi-Chen/Multimodal-AND-Large-Language-Models
Paper list about multimodal and large language models, only used to record papers I read in the...
thuml/AutoTimes
Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"
flixpar/med-ts-llm
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis