kyegomez/Pegasus
PegasusX: The Future of Multimodal Embeddings 🦄 🦄
This project turns text, images, video, and audio into embeddings: numerical representations that capture the meaning of the data. You supply raw data and get back embeddings that can be used for tasks such as similarity search or classification. It is designed for researchers, data analysts, and machine learning practitioners who need to process and understand mixed-media datasets.
No commits in the last 6 months.
Use this if you need to convert various types of data (text, images, audio, video) into a unified, meaningful numerical format for analysis or machine learning applications.
Not ideal if you are looking for an out-of-the-box application rather than a tool to generate data representations for further model development.
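To make the "searching similar content" use case concrete, here is a minimal sketch of similarity search over embeddings. It does not use the Pegasus API; the vectors, file names, and the helper functions `cosine_similarity` and `most_similar` are hypothetical stand-ins for whatever an embedding model would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, corpus, top_k=2):
    """Return the top_k (name, score) pairs from corpus, ranked by similarity to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional embeddings for items of different media types.
# Real embeddings would come from a model and have hundreds of dimensions.
corpus = {
    "cat_photo.jpg": [0.9, 0.1, 0.0],
    "dog_bark.wav": [0.1, 0.9, 0.1],
    "kitten_clip.mp4": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of a text query

results = most_similar(query, corpus)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because every item lives in the same vector space, a text query can rank images, audio, and video with the same distance function, which is the main appeal of a unified embedding format.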
Stars: 14
Forks: 5
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 16, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/kyegomez/Pegasus"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
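The same request can be sketched in Python using only the standard library. The endpoint URL is taken from the curl command above; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; treated here as fixed.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_quality_url(owner, repo):
    """Build the endpoint URL for a repository, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the data for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the caller should inspect the raw response instead.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("kyegomez", "Pegasus")
# print(data)
```

With the keyless limit of 100 requests/day, a script that checks many repositories would need to cache responses or space out its calls.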
Higher-rated alternatives
ssrajadh/sentrysearch
Semantic search over videos using Gemini Embedding 2.
hayabhay/frogbase
Transform audio-visual content into navigable knowledge.
zilliz-bootcamp/audio_search
This project uses PANNs for audio tagging and sound event detection, and finally gets audio...
tomfalainen/word_spotting
Semantic and Verbatim Word Spotting in Torch
ashvardanian/SwiftSemanticSearch
Real-time on-device text-to-image and image-to-image Semantic Search with video stream camera...