ThoughtRiver/lmdb-embeddings

Fast word vectors with little memory usage in Python

Score: 40 / 100 (Emerging)

This tool helps machine learning engineers and data scientists efficiently serve large collections of word vectors. It takes your existing word embedding models (such as GloVe or Word2Vec) and stores them in an LMDB (Lightning Memory-Mapped Database) key-value store on disk. The result is fast access to individual word vectors with much lower memory usage, even for massive models, since a lookup reads only the vector it needs rather than loading the whole model into RAM.
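The idea described above can be sketched with the Python standard library alone: pack each word's vector into bytes and keep it in an on-disk key-value database, so a single lookup never loads the full embedding matrix. This is a minimal illustration, not the library's actual API; `dbm` stands in for LMDB here, and the function names `write_embeddings` and `get_word_vector` are hypothetical.

```python
import dbm
import os
import struct
import tempfile

# Sketch of the technique: word vectors live in an on-disk key-value store
# (stdlib `dbm` standing in for LMDB), so reading one word's vector touches
# only that record instead of deserializing the whole model.

def write_embeddings(path, embeddings):
    """Store each vector as packed float32 bytes keyed by its word."""
    with dbm.open(path, "c") as db:
        for word, vector in embeddings.items():
            db[word.encode("utf-8")] = struct.pack(f"{len(vector)}f", *vector)

def get_word_vector(path, word):
    """Fetch a single word's vector without loading any others."""
    with dbm.open(path, "r") as db:
        raw = db[word.encode("utf-8")]
        return list(struct.unpack(f"{len(raw) // 4}f", raw))

path = os.path.join(tempfile.mkdtemp(), "embeddings")
write_embeddings(path, {"king": [0.5, -1.0, 0.25], "queen": [0.5, -0.75, 0.5]})
print(get_word_vector(path, "queen"))  # [0.5, -0.75, 0.5]
```

The real library uses LMDB's memory-mapped reads, which makes repeated lookups faster than this `dbm` stand-in, but the access pattern is the same.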

416 stars. No commits in the last 6 months.

Use this if you are building applications that need fast lookups of individual word embeddings and want to shrink the memory footprint of large embedding models, especially on systems with limited RAM or where near-instant startup matters.

Not ideal if you work mainly with small embedding models, or if embedding load times and memory consumption are not a bottleneck for you.

Tags: natural-language-processing, machine-learning-engineering, data-science, computational-linguistics, text-analytics
Flags: Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 14 / 25


Stars: 416
Forks: 31
Language: Python
License: GPL-3.0
Last pushed: Jun 26, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ThoughtRiver/lmdb-embeddings"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.