UCSC-VLAA/Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
This project provides the official implementation and pre-trained components for making large language models (LLMs) more truthful and ethical by incorporating visual information. It fine-tunes existing LLMs on image-text data to produce enhanced models that understand and process both modalities more responsibly. The intended audience is AI researchers and developers working on safer, more reliable AI systems.
No commits in the last 6 months.
Use this if you are a researcher or developer aiming to improve the trustworthiness and ethical behavior of your language models by adding multimodal capabilities.
Not ideal if you are looking for a ready-to-use, consumer-facing AI application, as this project provides research-focused components for model development.
Stars: 20
Forks: 1
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/UCSC-VLAA/Sight-Beyond-Text"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
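For scripted use, here is a minimal Python sketch of the same call, parsing the JSON response with only the standard library. The URL comes from the curl example above; the response field names (stars, forks, and so on) are assumptions based on the stats shown here, not a documented schema.

import json
import urllib.request

# Endpoint from the curl example above.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/UCSC-VLAA/Sight-Beyond-Text")

with urllib.request.urlopen(URL, timeout=10) as resp:
    record = json.load(resp)  # parse the JSON body

# Field names below are illustrative guesses, not a documented schema.
for key in ("stars", "forks", "language", "license", "last_pushed"):
    print(f"{key}: {record.get(key)}")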
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice