ByteVisionLab/TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
This project is aimed at AI developers working on both image understanding and image generation. It takes raw images or text prompts as input and produces image tokens that large multimodal models (LMMs) can use for tasks such as visual question answering and text-to-image generation. It is best suited to machine learning engineers and researchers who build or extend models that interpret and create visual content.
449 stars. No commits in the last 6 months.
Use this if you are developing AI models that need to both understand and generate images, and you want a unified approach to process visual data for these tasks.
Not ideal if you are looking for an end-user application or a tool for image editing, as this is a foundational component for AI model development.
Stars: 449
Forks: 9
Language: Python
License: Apache-2.0
Category:
Last pushed: Aug 08, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/ByteVisionLab/TokenFlow"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
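The same request can be made from Python. This is a minimal sketch using only the standard library, not an official client. It assumes the path follows the pattern seen in the curl example (a category segment, then owner, then repository) and that the endpoint returns JSON; neither is documented on this page. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repository.
    """
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON.

    Assumption: the response body is JSON; its fields are not listed here,
    so the decoded object is returned without further interpretation.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "ByteVisionLab", "TokenFlow")` requests the same URL as the curl command. Keyless access is limited to 100 requests per day, so cache the result if you poll repeatedly.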
Higher-rated alternatives
zai-org/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
zhaorw02/DeepMesh
[ICCV 2025] Official code of DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with...
thu-nics/FrameFusion
[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token...
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering