ThoughtRiver/lmdb-embeddings

Fast word vectors with little memory usage in Python

Score: 40 / 100 (Emerging)

This tool helps machine learning engineers and data scientists efficiently serve large collections of word vectors. It takes your existing word embedding models (such as GloVe or Word2Vec) and stores them in an LMDB (Lightning Memory-Mapped Database) key-value store on disk. The result is fast access to individual word vectors with much lower memory usage, even for massive models, since a lookup reads only the vector it needs rather than loading the whole model into RAM.
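The idea described above can be sketched with the Python standard library alone: pack each word's vector into bytes and keep it in an on-disk key-value database, so a single lookup never loads the full embedding matrix. This is a minimal illustration, not the library's actual API; `dbm` stands in for LMDB here, and the function names `write_embeddings` and `get_word_vector` are hypothetical.

```python
import dbm
import os
import struct
import tempfile

# Sketch of the technique: word vectors live in an on-disk key-value store
# (stdlib `dbm` standing in for LMDB), so reading one word's vector touches
# only that record instead of deserializing the whole model.

def write_embeddings(path, embeddings):
    """Store each vector as packed float32 bytes keyed by its word."""
    with dbm.open(path, "c") as db:
        for word, vector in embeddings.items():
            db[word.encode("utf-8")] = struct.pack(f"{len(vector)}f", *vector)

def get_word_vector(path, word):
    """Fetch a single word's vector without loading any others."""
    with dbm.open(path, "r") as db:
        raw = db[word.encode("utf-8")]
        return list(struct.unpack(f"{len(raw) // 4}f", raw))

path = os.path.join(tempfile.mkdtemp(), "embeddings")
write_embeddings(path, {"king": [0.5, -1.0, 0.25], "queen": [0.5, -0.75, 0.5]})
print(get_word_vector(path, "queen"))  # [0.5, -0.75, 0.5]
```

The real library uses LMDB's memory-mapped reads, which makes repeated lookups faster than this `dbm` stand-in, but the access pattern is the same.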

416 stars. No commits in the last 6 months.

Use this if you are building applications that need fast lookups of individual word embeddings and want to shrink the memory footprint of large embedding models, especially on systems with limited RAM or where near-instant startup matters.

Not ideal if you work mainly with small embedding models, or if embedding load times and memory consumption are not a bottleneck for you.

Tags: natural-language-processing, machine-learning-engineering, data-science, computational-linguistics, text-analytics
Flags: Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 14 / 25


Stars: 416
Forks: 31
Language: Python
License: GPL-3.0
Last pushed: Jun 26, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ThoughtRiver/lmdb-embeddings"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.