thustorage/PipeANN

A low-latency, billion-scale, and updatable graph-based vector store on SSD.


Established

Need to find similar items quickly in a very large collection, even billions of entries, without large amounts of expensive memory? PipeANN searches datasets stored on SSDs with low latency. It helps data scientists, machine learning engineers, and researchers run similarity searches: it takes in high-dimensional vector data and returns the closest matching vectors and their identifiers.

102 stars.

Use this if you need high-throughput, low-latency similarity search over millions to billions of vectors that are updated regularly, while keeping expensive in-memory storage to a minimum.

Not ideal if your dataset is very small, or if you need the absolute minimum latency, where even a millisecond matters, and can afford a fully in-memory solution.

vector-search large-scale-indexing similarity-matching machine-learning-infrastructure data-retrieval

No Package, No Dependents

Maintenance 10 / 25

Adoption 9 / 25

Maturity 15 / 25

Community 21 / 25


Stars: 102

Forks:

Language: Jupyter Notebook

License: MIT

Related tools

MariaDB/server

MariaDB server is a community developed fork of MySQL server. Started by core members of the...

AlayaDB-AI/AlayaLite

AlayaLite – A Fast, Flexible Vector Database for Everyone.

infiniflow/infinity

The AI-native database built for LLM applications, providing incredibly fast hybrid search of...

nnethercott/hannoy

Production-ready KV-backed HNSW implementation in Rust using LMDB

dingodb/dingo

A multi-modal vector database that supports upserts and vector queries using unified SQL...

