jfilter/hyperhyper
🧮 Python package to construct word embeddings for small data using PMI and SVD
This tool helps researchers and analysts studying text data by creating meaningful word embeddings even from small corpora. It takes your domain-specific text files as input and outputs a vocabulary of words, each represented by a numerical vector. You would use this if you need to understand the relationships between words in specialized content such as medical journals or legal documents.
No commits in the last 6 months. Available on PyPI.
Use this if you need to analyze the meaning and relationships of words in a specific, often small, collection of text data and want deterministic, reproducible results.
Not ideal if you are working with extremely large, generic text datasets, where pre-trained word embeddings or large-scale training methods like Word2vec are more suitable.
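The general PMI + SVD technique the package is built on can be sketched in plain NumPy. This is a generic illustration of the method, not hyperhyper's actual API: count word co-occurrences, convert the counts to positive PMI, then take a truncated SVD to get dense vectors. The tiny corpus and the window choice (whole sentence) are assumptions for the example.

```python
# Generic sketch of PMI + SVD word embeddings (not hyperhyper's API).
from itertools import combinations

import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts (window = whole sentence, for simplicity).
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for a, b in combinations(sent, 2):
        C[idx[a], idx[b]] += 1
        C[idx[b], idx[a]] += 1

# Positive PMI: max(0, log(p(a,b) / (p(a) * p(b)))).
total = C.sum()
row = C.sum(axis=1, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C * total) / (row @ row.T))
ppmi = np.nan_to_num(np.maximum(pmi, 0.0))

# Truncated SVD: keep the top-k singular directions as embedding dimensions.
k = 2
U, S, _ = np.linalg.svd(ppmi)
embeddings = U[:, :k] * S[:k]  # one k-dimensional vector per vocab word
print(embeddings.shape)  # (len(vocab), k)
```

Because SVD is deterministic, the same corpus always yields the same vectors, which is the "consistent results" advantage over stochastically trained models like Word2vec.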
Stars
18
Forks
—
Language
Python
License
BSD-2-Clause
Category
Last pushed
Oct 25, 2020
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/jfilter/hyperhyper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.