JosefAlbers/VL-JEPA

VL-JEPA (Vision-Language Joint Embedding Predictive Architecture) in MLX

Quality score: 38 / 100 (Emerging)

This project helps machine learning researchers and practitioners understand and experiment with a specific type of AI model called Vision-Language Joint Embedding Predictive Architecture (VL-JEPA). It takes existing Vision-Language Models (like PaliGemma) and reframes them into a JEPA structure, which can lead to more efficient and robust learning. The output is a working example of this architecture, allowing for deeper insight into its mechanics and potential.

Use this if you are an AI researcher or machine learning engineer looking to explore advanced self-supervised learning architectures, specifically VL-JEPA, using the Apple MLX framework.

Not ideal if you are looking for a plug-and-play solution for general image or text analysis, or if you are not familiar with machine learning model architectures and frameworks.
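The core JEPA idea described above, predicting in embedding space rather than reconstructing pixels or tokens, can be sketched in a toy form. This is a conceptual illustration only, not the repository's MLX implementation: the dimensions, the random linear "encoders", and the `jepa_loss` function are all hypothetical stand-ins for the real VLM backbone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the real model uses a VLM backbone (e.g. PaliGemma).
d_in, d_emb = 16, 8

# Stand-in "encoders": random linear projections instead of deep networks.
W_ctx = rng.normal(size=(d_in, d_emb))    # context encoder (e.g. image side)
W_tgt = rng.normal(size=(d_in, d_emb))    # target encoder (e.g. text side)
W_pred = rng.normal(size=(d_emb, d_emb))  # predictor head

def jepa_loss(context, target):
    """Predict the target's embedding from the context's embedding.

    A JEPA trains in embedding space: the predictor maps the context
    embedding toward the target embedding, and the loss is a distance
    between them (here, mean squared error). No raw inputs are
    reconstructed directly.
    """
    z_ctx = context @ W_ctx   # embed the context
    z_tgt = target @ W_tgt    # embed the target (treated as fixed)
    z_pred = z_ctx @ W_pred   # predict the target embedding from context
    return float(np.mean((z_pred - z_tgt) ** 2))

x_img = rng.normal(size=(4, d_in))  # batch of 4 "image" inputs
x_txt = rng.normal(size=(4, d_in))  # batch of 4 "text" inputs
loss = jepa_loss(x_img, x_txt)
print(loss >= 0.0)  # the loss is a non-negative distance
```

Training would then update the predictor (and context encoder) to shrink this distance, which is what can make JEPA-style objectives more efficient than reconstruction-based ones.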

Tags: AI-research, machine-learning-engineering, self-supervised-learning, vision-language-models, neural-network-architectures
No package · No dependents
Maintenance: 6 / 25
Adoption: 9 / 25
Maturity: 13 / 25
Community: 10 / 25


Stars: 76
Forks: 6
Language: Python
License: Apache-2.0
Last pushed: Dec 31, 2025
Commits (30d): 0

Get this data via API:

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/JosefAlbers/VL-JEPA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
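The same endpoint can be called from Python. A minimal sketch, assuming the path pattern generalizes from the curl example above and that the API returns JSON (neither is documented here); the `quality_url` and `fetch_quality` names are my own:

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    # Path pattern inferred from the curl example; may not cover
    # every ecosystem the service supports.
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    # Assumes a JSON response body; no API key is needed at the
    # free 100-requests/day tier.
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)

url = quality_url("transformers", "JosefAlbers", "VL-JEPA")
print(url)
```

Calling `fetch_quality("transformers", "JosefAlbers", "VL-JEPA")` would then return the decoded payload for this repository.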