microsoft/XPretrain

Multi-modality pre-training

41 / 100 (Emerging)

This project provides pre-trained models that help AI developers build systems that understand and generate content spanning video and language, or image and language. The models are trained on large datasets of videos or images paired with text descriptions, and are intended as starting points for specialized tasks. The target audience is AI researchers and machine learning engineers.

510 stars. No commits in the last 6 months.

Use this if you are an AI developer looking to leverage state-of-the-art multi-modal pre-trained models for tasks involving understanding or generating content from combined video and text, or image and text.

Not ideal if you are looking for a ready-to-use application or a tool for general content creation without deep machine learning expertise.

Tags: AI development, video analysis, image analysis, natural language processing, machine learning research
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25


Stars: 510
Forks: 36
Language: Python
License: (not shown)
Last pushed: May 08, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/microsoft/XPretrain"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
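
The curl call above can also be made from a script. The following is a minimal Python sketch using only the standard library. Two things are assumptions rather than documented facts: the URL layout (category, then owner, then repository) is inferred from the single example URL, and the response is assumed to be JSON. The helper names `quality_url` and `fetch_quality` are hypothetical and exist only for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the path layout after it
# (category/owner/repo) is inferred from that one URL and is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for one repository (hypothetical helper)."""
    # Percent-encode each segment so unusual characters cannot alter the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON.

    The response schema is not described on this page, so the decoded
    object is returned as-is without assuming any field names.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("nlp", "microsoft", "XPretrain")` issues the same request as the curl command. Each call counts against the daily request limit, so a script that polls many repositories should cache results.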