InternLM/Spatial-SSRL

[CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"

Score: 40 / 100 (Emerging)

This project improves how Large Vision-Language Models (LVLMs) understand the spatial relationships between objects in images and videos. It takes ordinary images or video clips as input and trains the model to describe locations, sizes, and relative positions more accurately, without requiring manually labeled data. It is aimed at researchers and developers who build or evaluate AI vision systems.

Use this if you are a researcher or developer working with LVLMs and need to improve their spatial reasoning without extensive manual data annotation.

Not ideal if you need a plug-and-play application for immediate end-user tasks, as this is a framework for improving underlying model intelligence.

AI model training, computer vision research, spatial reasoning, large vision language models, robotics perception
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 13 / 25
Community 7 / 25

Stars: 116
Forks: 4
Language: Python
License: Apache-2.0
Last pushed: Feb 25, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/InternLM/Spatial-SSRL"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
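
The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the URL path follows a category/owner/repository pattern; both are inferred from the single curl example above and are not documented here. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (hypothetical helper).

    Assumes the path layout is <category>/<owner>/<repo>, as in the
    one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and parse it (hypothetical helper).

    Assumes the response body is JSON; the response schema is not
    described on this page, so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid polling.
    record = fetch_quality("computer-vision", "InternLM", "Spatial-SSRL")
    print(json.dumps(record, indent=2))
```

Because keyless access is capped at 100 requests/day, it is worth caching the result locally rather than calling the endpoint on every run.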