whwu95/GPT4Vis

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

Score: 40 / 100 (Emerging)

This project helps researchers and developers understand what GPT-4 and GPT-4V (GPT-4 with Vision) can do for zero-shot classification of images, videos, and 3D point clouds, that is, classification without task-specific training data. It takes raw visual data and produces category labels or descriptions. It is aimed at anyone working on computer vision who wants to apply large language models to zero-shot recognition.

185 stars. No commits in the last 6 months.

Use this if you are a researcher or developer exploring the capabilities of GPT-4 for visual recognition across different data types like images, videos, and point clouds, especially for zero-shot tasks where you lack labeled training data.

Not ideal if you are looking for a ready-to-use, production-grade application or a tool that does not require deep technical understanding of large language models and vision APIs.

computer-vision image-recognition video-analysis 3d-point-cloud-classification zero-shot-learning
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 14 / 25

Stars: 185
Forks: 18
Language: Python
License: MIT
Last pushed: May 22, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/whwu95/GPT4Vis"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
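The same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single curl example above, the response is assumed to be JSON, and the function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category/owner/repo.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("prompt-engineering", "whwu95", "GPT4Vis"))
```

Calling `fetch_quality("prompt-engineering", "whwu95", "GPT4Vis")` performs the request itself. Because the keyless tier is limited to 100 requests/day, caching the result locally is a reasonable choice for repeated lookups.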