YYZhang2025/Pali-Gemma

Implements a multi-modality LLM and fine-tunes it using LoRA. Depends only on PyTorch; no other "fancy" libraries.

22 / 100 (Experimental)

This tool helps you build and fine-tune AI models that understand both images and text. You feed it images with corresponding text descriptions, and it produces a fine-tuned model that can process new multi-modal data and generate insights from it. It is aimed at AI practitioners and researchers who want to customize multi-modal large language models.

Use this if you need to adapt an AI model to interpret and reason about both visual and textual information for a specific application.

Not ideal if you want a ready-to-use, pre-trained multi-modal model with no customization or fine-tuning required.
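
For context: LoRA freezes the pretrained weights and trains only a low-rank update added to selected linear layers, which is what keeps the dependency footprint at plain PyTorch. Below is a minimal, illustrative sketch of that idea; the class name LoRALinear and the hyperparameters r and alpha are assumptions for illustration, not this repo's actual code.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Wraps a frozen nn.Linear with a trainable low-rank update:
    # y = W x + (alpha / r) * B (A x), where A is (r x in) and B is (out x r).
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)    # freeze bias too
        self.scale = alpha / r
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Swap in the adapter and train only the LoRA parameters.
layer = nn.Linear(768, 768)
lora = LoRALinear(layer, r=8, alpha=16)
out = lora(torch.randn(2, 768))
n_trainable = sum(p.numel() for p in lora.parameters() if p.requires_grad)
print(out.shape, n_trainable)  # torch.Size([2, 768]) 12288

Training then optimizes only lora_A and lora_B, a small fraction of the full weight count.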

Tags: AI model customization · multi-modal AI · deep learning · fine-tuning · computer vision AI · natural language processing AI
No License · No Package · No Dependents
Maintenance: 10 / 25
Adoption: 5 / 25
Maturity: 7 / 25
Community: 0 / 25

The overall score is the sum of the four sub-scores: 10 + 5 + 7 + 0 = 22.

Stars: 9
Forks:
Language: Jupyter Notebook
License: None
Category: llm-fine-tuning
Last pushed: Feb 12, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/YYZhang2025/Pali-Gemma"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
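
The same request from Python, as a minimal sketch using only the standard library (the response is assumed to be JSON; its field names are not documented here):

import json
from urllib.request import urlopen

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/YYZhang2025/Pali-Gemma"
with urlopen(url, timeout=10) as resp:
    data = json.load(resp)          # response is assumed to be JSON
print(json.dumps(data, indent=2))   # inspect whatever fields the endpoint returns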