LeoMartinezTAMUK/CBOW_NLP_Cosine-Similarity
This project trains word embeddings with the Continuous Bag of Words (CBOW) method for natural language processing tasks. The program reads PDF files, tokenizes the text, trains a Word2Vec model in CBOW mode, and computes the cosine similarity between selected word pairs from the document.
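The tokenizing and similarity steps in the description can be illustrated with a minimal standard-library sketch. This is not the repository's code: the names `tokenize`, `cbow_pairs`, and `cosine_similarity` are hypothetical, the sample sentence is made up, and the PDF reading and Word2Vec training steps are left out. It only shows what a CBOW training example looks like (context words predicting a target word) and how cosine similarity is computed for two vectors.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cbow_pairs(tokens, window=2):
    """Yield (context_words, target_word) pairs, the kind of example CBOW trains on."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip tokens that have no neighbours at all
            yield context, target


def cosine_similarity(u, v):
    """Return the cosine of the angle between u and v; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample text standing in for text extracted from a PDF.
tokens = tokenize("The cat sat on the mat.")
pairs = list(cbow_pairs(tokens, window=1))
```

In a real pipeline the vectors passed to `cosine_similarity` would come from the trained model rather than being written by hand. With gensim, for example, passing `sg=0` to `Word2Vec` selects CBOW; whether the repository uses gensim is an assumption here.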
No commits in the last 6 months.
Stars
—
Forks
—
Language
Python
License
MIT
Last pushed
Oct 08, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/LeoMartinezTAMUK/CBOW_NLP_Cosine-Similarity"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
shibing624/similarities
Similarities: a toolkit for similarity calculation and semantic search....
explosion/sense2vec
🦆 Contextually-keyed word vectors
chakki-works/chakin
Simple downloader for pre-trained word vectors
sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.