MIV-XJTU/JanusVLN
[ICLR2026] Official implementation for "JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation"
This project is the official implementation of JanusVLN, a vision-language navigation model for AI agents that follow natural-language instructions through complex indoor environments. Given a written command such as "Go past the kitchen and turn left into the living room," the agent plans and executes a path through a simulated 3D scene, maintaining decoupled implicit memories for semantics and spatial layout as it moves (the "dual implicit memory" of the title). It is aimed at researchers and developers working on embodied AI, robotics, and virtual reality.
Use this if you are developing or evaluating AI models that need to understand spatial relationships and follow human-like directions within simulated 3D spaces.
Not ideal if you need a pre-built navigation system for real-world robots or applications outside of research on vision-language navigation.
Stars: 508
Forks: 35
Language: Python
License: —
Category: transformers
Last pushed: Jan 26, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/MIV-XJTU/JanusVLN"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
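For scripted access, the same endpoint can be called from Python. Below is a minimal sketch using only the standard library, assuming the endpoint returns JSON; the field names queried at the end (stars, forks, language) are guesses based on the stats shown on this page, not a documented schema:

import json
import urllib.request

# Endpoint taken from the curl example above.
URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/MIV-XJTU/JanusVLN"

# Fetch and decode the JSON payload (assumed format).
with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

# "stars", "forks", and "language" are assumed field names, not a documented schema.
print(data.get("stars"), data.get("forks"), data.get("language"))

For the keyed tier (1,000 requests/day), the key is presumably attached as a header or query parameter; check the API's documentation for the exact mechanism.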
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice