OneInterface/realtime-bakllava

llama.cpp with the BakLLaVA model describes what it sees

34 / 100 (Emerging)

This project describes what is happening in an image or a real-time video feed. You give it a picture or a live webcam stream, and it generates natural-language captions of what it "sees". It suits anyone who needs immediate descriptions of visual input, for example in accessibility work or content analysis.

379 stars. No commits in the last 6 months.

Use this if you need a local, real-time solution to generate descriptive text from images or a live webcam feed on an Apple silicon chip.

Not ideal if you need a cloud-based solution, require broad cross-platform support beyond Apple silicon, or need highly specialized object detection.

visual-assistance image-captioning video-description accessibility-tech content-analysis
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 16 / 25
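
The four category scores above add up to the overall 34 / 100 shown at the top of the page. The sketch below only checks that arithmetic; treating the overall score as a plain sum of the categories is an assumption drawn from the numbers on this page, not a documented formula, and the variable names are illustrative.

```python
# Category scores as shown on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 16,
}

# Assumption: the overall score is the plain sum of the four categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")  # prints "34 / 100"
```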


Stars: 379
Forks: 41
Language: Python
License: None
Last pushed: Nov 08, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/OneInterface/realtime-bakllava"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
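
The same request can be made from Python with only the standard library. This is a minimal sketch, not the service's official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category` label for the `transformers` path segment is a guess, and the code assumes the response body is JSON, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo such as 'OneInterface/realtime-bakllava'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Request the endpoint and decode the body.

    Assumption: the body is JSON. The response schema is not documented
    on this page, so the decoded value is returned unchanged.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "OneInterface/realtime-bakllava")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than polling.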