itrummer/thalamusdb

ThalamusDB: semantic query processing on multimodal data

Emerging

ThalamusDB helps data analysts and researchers query datasets that combine text, images, and audio, even when the information is not cleanly structured. You supply a database containing descriptions, image paths, and audio file paths, and get answers to questions like "How many cars in pictures are red?" or "Show me all audio clips with someone speaking." It is aimed at anyone who needs to extract insights from diverse, unstructured media using natural language.
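
As a rough illustration of the workflow described above, the sketch below stores descriptions, image paths, and audio paths in a table and counts the images that satisfy a semantic condition. It does not use ThalamusDB's actual API: the `items` table, the `semantic_match` helper, and the `FAKE_LABELS` lookup are hypothetical stand-ins (a real system would ask a multimodal model to inspect each file), chosen only so the example runs offline.

```python
import sqlite3

# Hypothetical stand-in for a multimodal model's view of each image.
# NOT part of ThalamusDB; hand-written labels keep the sketch offline.
FAKE_LABELS = {
    "img/1.jpg": "a red car parked on a street",
    "img/2.jpg": "a blue car on a highway",
    "img/3.jpg": "a red bicycle",
}


def semantic_match(path, condition_words):
    """Return True if every condition word appears in the image's fake label."""
    label_words = FAKE_LABELS.get(path, "").split()
    return all(word in label_words for word in condition_words)


def count_matching(conn, condition_words):
    """Count rows whose image satisfies the (simulated) semantic condition."""
    rows = conn.execute("SELECT image_path FROM items").fetchall()
    return sum(semantic_match(path, condition_words) for (path,) in rows)


# The kind of input the description mentions: text plus media file paths.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (description TEXT, image_path TEXT, audio_path TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("street scene", "img/1.jpg", "audio/1.wav"),
        ("highway scene", "img/2.jpg", "audio/2.wav"),
        ("bike rack", "img/3.jpg", "audio/3.wav"),
    ],
)

# "How many cars in pictures are red?"
red_cars = count_matching(conn, ["red", "car"])
print(red_cars)
```

The point of the sketch is the split between ordinary relational storage (paths and descriptions) and a per-row semantic predicate evaluated on the referenced media; how the real project evaluates such predicates is not shown here.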

114 stars. No commits in the last 6 months.

Use this if you need to ask complex, semantic questions about a database that includes unstructured data like images, audio, and text, and want to use natural language in your queries.

Not ideal if your data is entirely structured in traditional tables and doesn't require semantic understanding of images, audio, or free-form text.

data-analysis multimedia-search unstructured-data-query research-data-mining content-discovery

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 9 / 25

Maturity 15 / 25

Community 9 / 25

Stars 114

Forks

Language Python

License MIT

Higher-rated alternatives

ewok-core/ewok-paper

Elements of World Knowledge! This repository houses data and code needed to replicate our first...

texttron/hyde

HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels

ArslanKAS/Large-Language-Models-with-Semantic-Search

Explore from keyword search to dense retrieval and reranking, which injects the intelligence of...

Ahren09/SciEvo

A longitudinal dataset for academic literature, including papers, metadata, and citation graphs,...

jzhoubu/vsearch

An Extensible Framework for Retrieval-Augmented LLM Applications: Learning Relevance Beyond...
