encord-team/ebind

A 5-way embedding model for text, audio, image, video, and 3D point clouds.

38 / 100 (Emerging)

This project helps you compare and relate data across different types, such as text descriptions, images, video clips, audio recordings, and 3D point clouds. It maps these varied inputs into a shared embedding space, so you can measure similarity between, for example, a picture of a dog and an audio recording of a dog barking. This is ideal for researchers and machine learning engineers working with diverse media.
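To make the idea of cross-modal similarity concrete, here is a minimal sketch of how embeddings from different modalities might be compared once they live in the same vector space. It does not use the ebind API: the vectors are made-up placeholders, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical. Cosine similarity is assumed here as the comparison metric; the project itself may use a different one.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder vectors standing in for real model outputs (assumption:
# every modality is embedded into the same 3-dimensional space).
audio_dog_bark = [0.8, 0.2, 0.1]
image_embeddings = {
    "dog_photo": [0.9, 0.1, 0.0],
    "car_photo": [0.0, 0.2, 0.9],
}

ranking = rank_by_similarity(audio_dog_bark, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With real embeddings the vectors would have hundreds of dimensions, but the ranking step stays the same: embed the query in one modality, embed the candidates in another, and sort by similarity.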

Use this if you need to understand the relationships and similarities between different kinds of media data, like matching a product image to its spoken description or finding videos related to a 3D model.

Not ideal if your project only involves a single type of data or if you need to analyze relationships within one specific modality.

multimodal-search cross-modal-retrieval content-understanding media-analysis 3d-data-processing
No Package, No Dependents
Maintenance 6 / 25
Adoption 5 / 25
Maturity 13 / 25
Community 14 / 25

Stars: 11
Forks: 3
Language: Python
License: (not listed)
Last pushed: Nov 13, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/encord-team/ebind"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
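The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL follows a category/owner/repo pattern (inferred from the single example above). The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the path layout beyond that
# one example is an assumption.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the quality endpoint URL for a repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record and parse it as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("embeddings", "encord-team", "ebind")
# print(json.dumps(data, indent=2))
```

Because the unauthenticated limit is 100 requests/day, it is worth caching responses locally rather than refetching on every run.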