Paranioar/Awesome_Matching_Pretraining_Transfering
A paper list covering large multi-modality models (perception, generation, unification), parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, intended for preliminary insight.
This is a curated collection of research papers and tutorials on large multi-modality models, covering their perception, generation, and unification capabilities. It also includes resources on parameter-efficient finetuning and vision-language pretraining. Researchers and AI practitioners can use it to survey current work on models that combine data types such as images and text.
445 stars. No commits in the last 6 months.
Use this if you are a researcher or AI practitioner looking for a comprehensive overview and resources on multi-modal AI models, including efficient training and pretraining techniques.
Not ideal if you are an end-user seeking a ready-to-use AI tool or application for a specific task rather than academic research and model development.
Stars: 445
Forks: 49
Language: not specified
License: MIT
Category:
Last pushed: Sep 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Paranioar/Awesome_Matching_Pretraining_Transfering"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
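The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so check the actual response before relying on it.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body.

    Assumption: the response body is JSON. This page does not document
    the response format.
    """
    url = quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(quality_url("Paranioar", "Awesome_Matching_Pretraining_Transfering"))
```

No key is sent here, so calls count against the 100 requests/day anonymous limit. How a free key should be passed is not described on this page, so it is left out of the sketch.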
Higher-rated alternatives
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
FoundationVision/Liquid
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
Yangyi-Chen/Multimodal-AND-Large-Language-Models
Paper list about multimodal and large language models, only used to record papers I read in the...
thuml/AutoTimes
Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"
flixpar/med-ts-llm
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis