thustorage/PipeANN

A low-latency, billion-scale, and updatable graph-based vector store on SSD.

Score: 55 / 100 (Established)

PipeANN finds similar items in very large collections, up to billions of entries, without keeping the whole dataset in expensive memory. It searches datasets stored on SSD with low latency: given high-dimensional vector data, it returns the closest matching vectors and their identifiers. It is aimed at data scientists, machine learning engineers, and researchers who need fast similarity search.

102 stars.

Use this if you need high-throughput, low-latency similarity search over millions to billions of vectors that are updated regularly, while keeping expensive in-memory storage to a minimum.

Not ideal if your dataset is very small, or if you need the absolute minimum latency, where even a millisecond matters and the cost of an in-memory solution is no object.

vector-search large-scale-indexing similarity-matching machine-learning-infrastructure data-retrieval
No package, no dependents
Maintenance 10 / 25
Adoption 9 / 25
Maturity 15 / 25
Community 21 / 25
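
As a quick check on how the numbers above fit together, the four category scores add up to the overall 55 / 100. The sketch below only reproduces that arithmetic from the values shown on this page; that the site computes the overall score as a plain sum is an assumption, not something this page states.

```python
# Category scores copied from this page; each is out of 25.
subscores = {
    "Maintenance": 10,
    "Adoption": 9,
    "Maturity": 15,
    "Community": 21,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the plain sum of the four categories.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")  # prints "55 / 100"
```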


Stars: 102
Forks: 37
Language: Jupyter Notebook
License: MIT
Last pushed: Feb 04, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/thustorage/PipeANN"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
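
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL above, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    The layout <base>/<category>/<owner>/<repo> is inferred from the
    one example URL shown on this page.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository (hypothetical helper).

    Assumes the response body is JSON; adjust the decoding if it is not.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(quality_url("vector-db", "thustorage/PipeANN"))
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` on every run.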