OpenGVLab/Instruct2Act
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Instruct2Act helps robotics engineers and researchers translate complex, multi-modal instructions (for example, "Put the polka dot block into the green container" combined with a pointing gesture) into precise, sequential actions for robotic arms. Given a natural-language command, visual cues, and the robot's operating environment, it uses a large language model to generate the Python code that executes the command, making manipulation tasks more intuitive to specify; a sketch of the kind of program it generates appears below. It is aimed at practitioners building robotic systems that must perform varied physical tasks.
373 stars. No commits in the last 6 months.
Use this if you need to program robotic arms to perform manipulation tasks based on natural language and visual input, without writing all the low-level code yourself.
Not ideal if your robotic tasks are fixed, repetitive, and don't require dynamic interpretation of diverse, high-level commands.
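The sketch below illustrates, under stated assumptions, the style of program Instruct2Act's LLM emits: segment the scene, ground each phrase of the instruction to an object, then call a motion primitive. Every helper name here (segment_objects, retrieve_by_text, pick_and_place) is a hypothetical stand-in for illustration, not the repository's actual API; the real pipeline grounds perception with models such as SAM and CLIP.

```python
from dataclasses import dataclass

@dataclass
class GroundedObject:
    label: str
    position: tuple  # (x, y) centroid in the workspace image

def segment_objects(rgb_image):
    # Hypothetical stand-in for segmentation (SAM in the real pipeline):
    # return every candidate object in the tabletop scene. We fake two here.
    return [GroundedObject("polka dot block", (120, 85)),
            GroundedObject("green container", (310, 200))]

def retrieve_by_text(candidates, query):
    # Hypothetical stand-in for CLIP-style text-image retrieval: pick the
    # candidate whose crop best matches the query. Here we just match labels.
    return max(candidates, key=lambda c: c.label == query)

def pick_and_place(src_xy, dst_xy):
    # Hypothetical stand-in for the robot's low-level motion primitive.
    print(f"pick at {src_xy} -> place at {dst_xy}")

# "Put the polka dot block into the green container" maps to a program like:
objects = segment_objects(rgb_image=None)
block = retrieve_by_text(objects, "polka dot block")
container = retrieve_by_text(objects, "green container")
pick_and_place(block.position, container.position)
```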
Stars: 373
Forks: 22
Language: Python
License: —
Category: —
Last pushed: Jun 23, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenGVLab/Instruct2Act"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
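If you are already working in Python, the same endpoint can be queried with the `requests` library; this is a minimal sketch assuming the endpoint returns JSON.

```python
import requests

# Same endpoint as the curl command above.
url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/OpenGVLab/Instruct2Act")
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(resp.json())
```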
Higher-rated alternatives
MantisAI/sieves
Plug-and-play document AI with zero-shot models.
xiaoya-li/Instruction-Tuning-Survey
Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`
TencentARC-QQ/TagGPT
TagGPT: Large Language Models are Zero-shot Multimodal Taggers
rafaelpierre/bullet
bullet: A Zero-Shot / Few-Shot Learning, LLM Based, text classification framework
amazon-science/adaptive-in-context-learning
AdaICL: Which Examples to Annotate of In-Context Learning? Towards Effective and Efficient Selection