Shanghai-Digital-Brain-Laboratory/BDM-DB1
A large-scale multi-modal pre-trained model
This project offers a powerful AI model that can understand and generate text, interpret images, and make decisions in complex environments. It takes in various types of information, like natural language instructions, visual data from video games, or problem definitions such as a Traveling Salesperson Problem, and outputs intelligent actions or solutions. It's designed for researchers and engineers exploring advanced AI capabilities across language, vision, and automated decision-making.
134 stars. No commits in the last 6 months.
Use this if you are an AI researcher or engineer working on multi-modal AI and want to experiment with a pre-trained model capable of generalized task performance across text, images, and simulated decision-making.
Not ideal if you are a practitioner looking for a ready-to-use, off-the-shelf solution for a specific business problem without significant AI development expertise.
Stars: 134
Forks: 10
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 07, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Shanghai-Digital-Brain-Laboratory/BDM-DB1"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
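The same request can be issued from Python. The following is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint responds with a JSON body, which this page does not confirm. Only the URL itself is taken from the curl command above.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily quota.
    data = fetch_quality(
        "transformers", "Shanghai-Digital-Brain-Laboratory", "BDM-DB1"
    )
    print(json.dumps(data, indent=2))
```

Because the unauthenticated tier is limited to 100 requests per day, it is worth caching the result locally rather than calling `fetch_quality` in a loop.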
Higher-rated alternatives
dorarad/gansformer
Generative Adversarial Transformers
j-min/VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
invictus717/MetaTransformer
Meta-Transformer for Unified Multimodal Learning
rkansal47/MPGAN
The message passing GAN https://arxiv.org/abs/2106.11535 and generative adversarial particle...
Yachay-AI/byt5-geotagging
Confidence and ByT5-based geotagging model predicting coordinates from text alone.