ImKeTT/ReSee
[EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue (PyTorch implementation)
ReSee helps conversational AI developers build more engaging, knowledgeable chatbots that can "see": the model integrates specific visual information during a conversation. It takes dialogue data paired with fine-grained visual features (such as entity-level images) and produces a chatbot that generates more relevant, visually grounded responses. Developers building advanced dialogue systems can use it to improve their bots' contextual understanding.
No commits in the last 6 months.
Use this if you are a developer building open-domain dialogue systems and want to incorporate detailed visual knowledge to make your chatbot responses richer and more contextually aware.
Not ideal if you need a simple chatbot for basic Q&A without any visual understanding, or if you don't have access to pre-processed visual features for your dialogue data.
Stars: 13
Forks: —
Language: Python
License: —
Category:
Last pushed: Dec 04, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ImKeTT/ReSee"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
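The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the JSON response format is an assumption, since the page does not document it.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository (hypothetical helper).

    Assumption: the endpoint returns a JSON body; the schema is not
    documented on this page, so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("transformers", "ImKeTT", "ReSee"))
```

Keyless access is limited to 100 requests/day, so caching the result locally is worth considering if you poll many repositories.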
Higher-rated alternatives
NVlabs/MambaVision
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
sign-language-translator/sign-language-translator
Python library & framework to build custom translators for the hearing-impaired and translate...
kyegomez/Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
autonomousvision/transfuser
[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving;...
kyegomez/MultiModalMamba
A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance...