kyegomez/Fuyu
Implementation of Adept's Fuyu, an all-new multi-modality model, in PyTorch
This project offers a foundational building block for AI developers creating multi-modal applications: it takes raw image data and text sequences, processes them together, and produces an integrated output usable for downstream AI tasks. Developers building systems that must understand both images and text will find it useful.
No commits in the last 6 months. Available on PyPI.
Use this if you are an AI developer looking to integrate a multi-modal model that processes images and text using a transformer decoder architecture.
Not ideal if you are an end-user without programming experience or looking for a ready-to-use application, as this is a developer library.
Stars
24
Forks
3
Language
Python
License
MIT
Category
ml-frameworks
Last pushed
Nov 11, 2024
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/Fuyu"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
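The same request can be made from Python. A minimal sketch, assuming the endpoint above returns a JSON body (its field names are not documented here, so the snippet just pretty-prints whatever comes back):

```python
# Fetch repo quality data from the pt-edge API (hypothetical client sketch).
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL, e.g.
    .../quality/ml-frameworks/kyegomez/Fuyu"""
    return f"{BASE}/{category}/{owner}/{repo}"

if __name__ == "__main__":
    url = quality_url("ml-frameworks", "kyegomez", "Fuyu")
    # No API key needed at the free tier (100 requests/day).
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)  # assumes the response is JSON
    print(json.dumps(data, indent=2))
```

Passing an API key (for the 1,000/day tier) would presumably go in a request header, but the header name is not documented here.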
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch