jfilter/hyperhyper
🧮 Python package to construct word embeddings for small data using PMI and SVD
This tool helps researchers and analysts studying text data by creating meaningful word embeddings even from small corpora. It takes your domain-specific text files as input and outputs a vocabulary of words, each represented by a numerical vector. You would use this if you need to understand the relationships between words in specialized content such as medical journals or legal documents.
No commits in the last 6 months. Available on PyPI.
Use this if you need to analyze the meaning and relationships of words in a specific, often small, collection of text data and want deterministic, reproducible results.
Not ideal if you are working with extremely large, generic text datasets, where pre-trained word embeddings or large-scale training methods like Word2vec are more suitable.
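The general PMI + SVD technique the package is built on can be sketched in plain NumPy. This is a generic illustration of the method, not hyperhyper's actual API: count word co-occurrences, convert the counts to positive PMI, then take a truncated SVD to get dense vectors. The tiny corpus and the window choice (whole sentence) are assumptions for the example.

```python
# Generic sketch of PMI + SVD word embeddings (not hyperhyper's API).
from itertools import combinations

import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts (window = whole sentence, for simplicity).
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for a, b in combinations(sent, 2):
        C[idx[a], idx[b]] += 1
        C[idx[b], idx[a]] += 1

# Positive PMI: max(0, log(p(a,b) / (p(a) * p(b)))).
total = C.sum()
row = C.sum(axis=1, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C * total) / (row @ row.T))
ppmi = np.nan_to_num(np.maximum(pmi, 0.0))

# Truncated SVD: keep the top-k singular directions as embedding dimensions.
k = 2
U, S, _ = np.linalg.svd(ppmi)
embeddings = U[:, :k] * S[:k]  # one k-dimensional vector per vocab word
print(embeddings.shape)  # (len(vocab), k)
```

Because SVD is deterministic, the same corpus always yields the same vectors, which is the "consistent results" advantage over stochastically trained models like Word2vec.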
Stars
18
Forks
—
Language
Python
License
BSD-2-Clause
Category
Last pushed
Oct 25, 2020
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/jfilter/hyperhyper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.