UCSC-VLAA/Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
This project provides the official implementation and pre-trained components for making large language models (LLMs) more truthful and ethical by incorporating visual information. It fine-tunes existing LLMs on image-text data to produce enhanced models that understand and process both modalities more responsibly. The intended audience is AI researchers and developers working on safer, more reliable AI systems.
No commits in the last 6 months.
Use this if you are a researcher or developer aiming to improve the trustworthiness and ethical behavior of your language models by adding multimodal capabilities.
Not ideal if you are looking for a ready-to-use, consumer-facing AI application, as this project provides research-focused components for model development.
Stars: 20
Forks: 1
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/UCSC-VLAA/Sight-Beyond-Text"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
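For scripted use, here is a minimal Python sketch of the same call, parsing the JSON response with only the standard library. The URL comes from the curl example above; the response field names (stars, forks, and so on) are assumptions based on the stats shown here, not a documented schema.

import json
import urllib.request

# Endpoint from the curl example above.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/UCSC-VLAA/Sight-Beyond-Text")

with urllib.request.urlopen(URL, timeout=10) as resp:
    record = json.load(resp)  # parse the JSON body

# Field names below are illustrative guesses, not a documented schema.
for key in ("stars", "forks", "language", "license", "last_pushed"):
    print(f"{key}: {record.get(key)}")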
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice