roboflow/maestro
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.
This tool helps machine learning engineers and researchers quickly customize advanced vision-language models like Florence-2 or PaliGemma 2 for specific tasks. You provide your specialized image and text datasets, and it streamlines the entire process from data loading to training, outputting a fine-tuned model ready for deployment. This is for technical users familiar with AI model training concepts.
2,661 stars.
Use this if you need to adapt powerful existing multimodal AI models to perform new, specialized tasks like custom object detection or extracting specific information from images paired with text.
Not ideal if you are looking for a no-code solution or want to build a multimodal model entirely from scratch rather than fine-tuning an existing one.
Stars: 2,661
Forks: 221
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/roboflow/maestro"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
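The curl request above can also be made from Python. The sketch below uses only the standard library and rests on two assumptions that the page does not confirm: that the URL follows an `owner/repo` pattern for other repositories, and that the endpoint returns a JSON body. The names `quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository.

    Assumption: other repositories follow the same owner/repo pattern
    as the roboflow/maestro URL shown on this page.
    """
    owner_part = urllib.parse.quote(owner, safe="")
    repo_part = urllib.parse.quote(repo, safe="")
    return f"{BASE_URL}/{owner_part}/{repo_part}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body.

    Assumption: the response is JSON; its fields are not documented here,
    so the parsed value is returned without further interpretation.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires network access, counts against the daily limit):
# data = fetch_quality("roboflow", "maestro")
```

Keeping the URL construction separate from the network call makes it possible to check the request target without spending any of the 100 keyless requests per day.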
Related models
unslothai/unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama,...
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
modelscope/ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5,...
oumi-ai/oumi
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training