YYZhang2025/Pali-Gemma
Implements a multi-modal LLM and fine-tunes it with LoRA, depending only on PyTorch and no other "fancy" libraries.
This repo helps you build and fine-tune AI models that understand both images and text. Given images and their corresponding text descriptions, it produces a fine-tuned model capable of processing and generating insights from new multi-modal data. It is aimed at AI practitioners and researchers who want to customize multi-modal large language models.
Use this if you need to adapt an AI model to interpret and reason about information presented in both visual and textual formats for a specific application.
Not ideal if you are looking for a ready-to-use, pre-trained multi-modal AI model without any customization or fine-tuning requirements.
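The core idea behind the LoRA fine-tuning the repo advertises can be sketched in plain Python: instead of updating a full weight matrix W, LoRA trains a low-rank update B @ A of rank r and applies W + (alpha / r) * B @ A at forward time, leaving W frozen. The `lora_forward` helper and the matrix shapes below are illustrative assumptions, not code taken from the repo.

```python
def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, alpha, r):
    """Compute y = (W + (alpha / r) * B @ A) @ x.

    W is the frozen (out, in) base weight; A (r, in) and B (out, r)
    are the small trainable low-rank factors. Hypothetical sketch.
    """
    scale = alpha / r
    BA = matmul(B, A)  # (out, in) low-rank update
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, BA)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_eff]

# Tiny example: identity base weight plus a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]   # (r=1, in=2)
B = [[2.0], [0.0]] # (out=2, r=1)
y = lora_forward([1.0, 1.0], W, A, B, alpha=1.0, r=1)
print(y)  # → [5.0, 1.0]
```

Because only A and B receive gradients, the number of trainable parameters drops from out×in to r×(out+in), which is what makes fine-tuning a large multi-modal model feasible on modest hardware.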
Stars
9
Forks
—
Language
Jupyter Notebook
License
—
Category
—
Last pushed
Feb 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/YYZhang2025/Pali-Gemma"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
adithya-s-k/AI-Engineering.academy
Mastering Applied AI, One Concept at a Time
jax-ml/jax-llm-examples
Minimal yet performant LLM examples in pure JAX
young-geng/scalax
A simple library for scaling up JAX programs
riyanshibohra/TuneKit
Upload your data → Get a fine-tuned SLM. Free.