Word Embedding Implementations: Embedding Tools

Custom implementations and training of word embedding models (Word2Vec, binary embeddings, etc.) from scratch or on specific datasets. Does NOT include pretrained models, sentence embeddings, or downstream applications of embeddings.

127 word embedding implementation tools are tracked. Six score above 50 (Established tier). The highest-rated is shibing624/text2vec at 65/100 with 4,950 stars.

Get all 127 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=embeddings&subcategory=word-embedding-implementations&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
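The same request can be issued from Python. The sketch below is a minimal, hedged example using only the standard library: it rebuilds the query string from the curl command above and returns whatever JSON the endpoint sends back. The helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not described on this page, the code makes no assumption about field names. Whether a larger `limit` returns all 127 projects is also not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Assemble the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(domain: str, subcategory: str, limit: int = 20, timeout: float = 30.0):
    """Fetch and parse the JSON response.

    The structure of the returned object is an unknown here, so it is
    handed back as-is for the caller to inspect.
    """
    with urlopen(build_url(domain, subcategory, limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects("embeddings", "word-embedding-implementations")` would send the same request as the curl command; unauthenticated calls count against the 100 requests/day allowance.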

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | shibing624/text2vec | text2vec, text to vector.... | 65 | Established |
| 2 | predict-idlab/pyRDF2Vec | 🐍 Python Implementation and Extension of RDF2Vec | 58 | Established |
| 3 | IntuitionEngineeringTeam/chars2vec | Character-based word embeddings model based on RNN for handling real world texts | 56 | Established |
| 4 | IITH-Compilers/IR2Vec | Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings | 56 | Established |
| 5 | ddangelov/Top2Vec | Top2Vec learns jointly embedded topic, document and word vectors. | 56 | Established |
| 6 | natasha/navec | Compact high quality word embeddings for Russian language | 51 | Established |
| 7 | dalinvip/cw2vec | cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information | 49 | Emerging |
| 8 | stephantul/reach | Load embeddings and featurize your sentences. | 48 | Emerging |
| 9 | pnpnpn/dna2vec | dna2vec: Consistent vector representations of variable-length k-mers | 48 | Emerging |
| 10 | jaanli/food2vec | 🍔 | 48 | Emerging |
| 11 | oborchers/Fast_Sentence_Embeddings | Compute Sentence Embeddings Fast! | 46 | Emerging |
| 12 | tca19/dict2vec | Dict2vec is a framework to learn word embeddings using lexical dictionaries. | 46 | Emerging |
| 13 | lgalke/vec4ir | Word Embeddings for Information Retrieval | 46 | Emerging |
| 14 | persiyanov/skip-thought-tf | An implementation of skip-thought vectors in Tensorflow | 45 | Emerging |
| 15 | wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia | 45 | Emerging |
| 16 | bnosac/doc2vec | Distributed Representations of Sentences and Documents | 45 | Emerging |
| 17 | brannondorsey/GloVe-experiments | GloVe word vector embedding experiments (similar to Word2Vec) | 43 | Emerging |
| 18 | CyberZHG/keras-word-char-embd | Concatenate word and character embeddings in Keras | 41 | Emerging |
| 19 | aihpi/workshop-nlp-embeddings | Code for the KISZ-BB Workshop series "Working with embeddings" | 41 | Emerging |
| 20 | clips/dutchembeddings | Repository for the word embeddings experiments described in "Evaluating... | 41 | Emerging |
| 21 | fnielsen/wembedder | Wikidata embedding | 41 | Emerging |
| 22 | jaredwinick/img2vec-keras | Image to dense vector embedding. Clone of... | 41 | Emerging |
| 23 | ThoughtRiver/lmdb-embeddings | Fast word vectors with little memory usage in Python | 40 | Emerging |
| 24 | pommedeterresautee/fastrtext | R wrapper for fastText | 40 | Emerging |
| 25 | bnosac/word2vec | Distributed Representations of Words using word2vec | 39 | Emerging |
| 26 | dwslab/jRDF2Vec | A high-performance Java Implementation of RDF2Vec | 39 | Emerging |
| 27 | md-mq/philo2vec | An implementation of word2vec applied to [stanford philosophy... | 39 | Emerging |
| 28 | vecto-ai/vecto | Doing things with embeddings | 39 | Emerging |
| 29 | joisino/wordtour | Code for "Word Tour: One-dimensional Word Embeddings via the Traveling... | 38 | Emerging |
| 30 | MirunaPislar/emoji2vec | Train emoji embeddings based on emoji descriptions. | 38 | Emerging |
| 31 | midi-ld/midi2vec | MIDI2vec computes embeddings for representing MIDI data in vector space | 38 | Emerging |
| 32 | sismetanin/word2vec-tsne | Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE. | 37 | Emerging |
| 33 | FraLotito/pytorch-continuous-bag-of-words | The Continuous Bag-of-Words model (CBOW) is frequently used in NLP deep... | 37 | Emerging |
| 34 | neuml/staticvectors | 🔢 Work with static vector models | 36 | Emerging |
| 35 | hassyGo/charNgram2vec | Pre-training character n-gram embeddings | 36 | Emerging |
| 36 | PlanTL-GOB-ES/Biomedical-Word-Embeddings-for-Spanish | Biomedical Word embeddings generated from Spanish Biomedical corpora. | 36 | Emerging |
| 37 | stevend94/Feature2Vec | Code used in the paper, Feature2Vec: Distributional semantic modelling of... | 35 | Emerging |
| 38 | zgornel/Glowe.jl | Julia interface to GloVe | 35 | Emerging |
| 39 | Rj7/Unsupervised-morphology-induction-word2vec | Implementation of Unsupervised Morphology Induction Using Word Embeddings | 34 | Emerging |
| 40 | arsena-k/Word2Vec-bias-extraction | How are words loaded with meaning? Repository accompanying research by... | 34 | Emerging |
| 41 | cmasch/word-embeddings-from-scratch | Creating word embeddings from scratch and visualize them on TensorBoard.... | 34 | Emerging |
| 42 | franciszekparma/Word2Vec | From-scratch Word2Vec (skip-gram with negative sampling) fully implemented in PyTorch | 34 | Emerging |
| 43 | zgornel/ConceptnetNumberbatch.jl | Julia API for ConceptNetNumberbatch | 33 | Emerging |
| 44 | noobiegz/cw2vec | Implementation of the cw2vec model | 33 | Emerging |
| 45 | zhaojishun/GenderBiasPapers | Must-read Papers on Gender Bias. | 33 | Emerging |
| 46 | NURx2/pycode2vec | The tool for getting embeddings of Python 3 code chunks | 33 | Emerging |
| 47 | marmarelis/QDiffusion.jl | Leveraging the full dimensionality of single-cell transcriptomics (among... | 32 | Emerging |
| 48 | maxent-ai/lda2vec | Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this... | 32 | Emerging |
| 49 | brannondorsey/ChessEmbeddings | GloVe vector embeddings of chess moves | 32 | Emerging |
| 50 | roopalgarg/brand_embedding | Generate word embeddings for commercial brand names to study similarity between them. | 32 | Emerging |
| 51 | Santosh-Gupta/Research2Vec | Representing research papers as vectors / latent representations. | 32 | Emerging |
| 52 | ninalx/table2vec-lideng | Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval | 31 | Emerging |
| 53 | zgornel/EmbeddingsAnalysis.jl | A package for embeddings processing | 31 | Emerging |
| 54 | Rajspeaks/Deep-Learning-Approach-to-Bengali-Word-Embedding-using-BengaliWord2Vec-from-BNLP | Bengali word embedding using BengaliWord2Vec from BNLP. A mini project under... | 31 | Emerging |
| 55 | Kirili4ik/code2vec | code2vec for Python 3 made for NL2ML project | 30 | Emerging |
| 56 | chanind/word2vec-gender-bias-explorer | A tool to show gender bias in words based on NLP word embeddings from Google News | 30 | Emerging |
| 57 | w-zm/python-sentence2vec | This tool provides some implementations of sentence to vector. (sentence2vec) | 30 | Emerging |
| 58 | dr-irani/Quantifying-Bias-Contextualized-Embeddings | Semester project for Machine Learning: Deep Learning, Spring 2020 | 30 | Emerging |
| 59 | sshh12/Voice-Vector | A one-shot siamese approach to generating voice embeddings. | 30 | Emerging |
| 60 | MahmoudAbdelRahman/build2Vec | Building representation in the vector space | 29 | Experimental |
| 61 | BotCenter/spanishWordEmbeddings | Spanish Word Embeddings computed from large corpora and different sizes... | 29 | Experimental |
| 62 | CharlesGaydon/Dater-to-Vec | Collaborative filtering in dating. A NLP-based user embedding approach... | 29 | Experimental |
| 63 | jolivaresc/fastText-vecmap | bilingual word embeddings mapping using fastText | 29 | Experimental |
| 64 | brangerbriz/midi-glove | Create MIDI note vector embeddings using GloVe (Global Vectors for Word... | 28 | Experimental |
| 65 | ksalama/data2cooc2emb2ann | Learning embeddings from item co-occurrence statistics, and building an... | 28 | Experimental |
| 66 | ChristophAlt/pytorch-starspace | PyTorch implementation of StarSpace as described in "StarSpace: Embed All... | 28 | Experimental |
| 67 | ccmaymay/word2vec | word2vec, commented | 28 | Experimental |
| 68 | MartinoMensio/it_vectors_wiki_spacy | Word embeddings for Italian language, spacy2 prebuilt model | 28 | Experimental |
| 69 | warchildmd/game2vec | TensorFlow implementation of word2vec applied on... | 27 | Experimental |
| 70 | danielcieslinski/curve2vec | Python package for generating vector embeddings of curves | 27 | Experimental |
| 71 | cr0wley-zz/Embeddings | A study on the ingenious concept of word2vec. The repository contains a... | 27 | Experimental |
| 72 | Abhinavexists/Vectorlake | Trying to build embedding from Scratch | 26 | Experimental |
| 73 | eifuentes/skipgrammar | A framework for representing sequences as embeddings. | 26 | Experimental |
| 74 | YannDubs/RAW-Embedings | Novel word embeddings based on a simple and intuitive rolling average. Still... | 24 | Experimental |
| 75 | SmartData-Polito/darkvec | This repo contains the codes and the notebooks used for the paper "DarkVec:... | 24 | Experimental |
| 76 | rosasalberto/image2vec | Building applications on top of Image Embeddings. Recommendation Engine,... | 23 | Experimental |
| 77 | menon92/Bangla-Word2Vec | Bangla word2vec using skipgram approach | 22 | Experimental |
| 78 | worldbeater/code-vecs | Code for the methods and algorithms described in the paper "Analysis of... | 22 | Experimental |
| 79 | vsoch/wikipedia-equations | word2vec embeddings for statistics and math equations from Wikipedia | 22 | Experimental |
| 80 | srijansood/debias-word-embeddings | Tackling Gender and Race bias in Word Embeddings | 22 | Experimental |
| 81 | danaugrs/binary-word-embeddings | Generates binary word embeddings by analyzing Wikipedia | 21 | Experimental |
| 82 | mayankkejriwal/Geonames-embeddings | Embeddings for all geonames populated locations with population greater than 0 | 21 | Experimental |
| 83 | chr1sbest/word2vec_explorer | Interactive REPL for exploring word2vec word embeddings - demonstrates the... | 21 | Experimental |
| 84 | dkaslovsky/Not-Word2Vec | This is not word2vec | 21 | Experimental |
| 85 | nadinejackson1/word-embeddings-visualization | Visualizing word embeddings generated by GloVe and Word2Vec models using the... | 20 | Experimental |
| 86 | nocotan/skipgram_cpp | Skipgram with Hierarchical Softmax | 20 | Experimental |
| 87 | maxi-w/image-vectors | Embed images easily | 20 | Experimental |
| 88 | Hellisotherpeople/debate2vec | Word-vectors created from a large corpus of competitive debate evidence | 19 | Experimental |
| 89 | CentreForDigitalHumanities/Word2VecElastic | Collect sentences from ElasticSearch, preprocess and train diachronic Word2Vec models | 19 | Experimental |
| 90 | boyanangelov/species2vec | Species vector representations | 19 | Experimental |
| 91 | BotCenter/spanish-sent2vec | Spanish Sentence Embeddings computed from large corpora using sent2vec. | 19 | Experimental |
| 92 | AnasMohammad4321/Word2Vec-Pytorch | Implementation of Word2Vec for learning word embeddings using the Amazon... | 18 | Experimental |
| 93 | Koziev/word_embedders | Character-level autoencoder models for words | 18 | Experimental |
| 94 | AaruranLog/Analogies | Analogy solver using Google's pretrained word vectors | 17 | Experimental |
| 95 | gumblex/lmtvec | Low Memory Text Vector | 17 | Experimental |
| 96 | LoicGrobol/fasttextlt | A pure Python FastText interface, to ensure that FastText model stay usable... | 17 | Experimental |
| 97 | Marwolaeth/EmbeddingsTools.jl | Extra tools for working with word embeddings, such as those in... | 17 | Experimental |
| 98 | japgarrido/Word2Vec-Embedding-Analysis | This project focuses on implementing and analyzing word embeddings using the... | 17 | Experimental |
| 99 | mantzaris/TextSpace.jl | A Julia package for text embeddings and related NLP transformations | 15 | Experimental |
| 100 | ben300694/word-embeddings | Repository for the seminar "Word Embedding Spaces", Master CS and Master AI... | 15 | Experimental |
| 101 | hammi03/word2vec-numpy | Skip-gram Word2Vec with Negative Sampling in pure NumPy, trained on text8 | 14 | Experimental |
| 102 | undeluro/word2vec | Implementation of the core word2vec training loop in pure numpy. | 14 | Experimental |
| 103 | nisaharan/vector-embeddings-workshop-vavuniya | Intro to word embeddings & semantic search – workshop at University of Vavuniya | 13 | Experimental |
| 104 | mahb97/Wake2vec | Controlled style shift to Joyce via embedding surgery and Wake lexicon | 13 | Experimental |
| 105 | acd17sk/Word2vec-CBOW-Negative-Sampling | This project provides a pure NumPy implementation of the Word2vec Continuous... | 13 | Experimental |
| 106 | remeinium/Uganna_Siyabasa | A FastText Embedding model trained on Sinhala language. | 13 | Experimental |
| 107 | lapis-zero09/compare_word_embedding | Semantic representation learning | 13 | Experimental |
| 108 | danjohnvelasco/Filipino-Word-Embeddings | This repository contains download links to pretrained static word embeddings... | 13 | Experimental |
| 109 | vamsivallepu/Telugu-W2V | Word-2-Vec embeddings trained on Telugu text corpus | 13 | Experimental |
| 110 | Ashly1991/word2vec-tf2 | Word2Vec Skipgram with negative sampling in TensorFlow 2. Self-supervised... | 13 | Experimental |
| 111 | rtlee9/state-of-the-union | Paragraph vector analysis of state of the union addresses | 13 | Experimental |
| 112 | rtadijar/wiki2vec | A Wikipedia Article Embedding | 11 | Experimental |
| 113 | diixo/fasttextCC | fastText v0.9.3 (C++ port) | 11 | Experimental |
| 114 | claudiu1989/Synonyms-detection | Experiments with word2vec embeddings for synonyms detection, for the... | 11 | Experimental |
| 115 | varunvasudeva1/wiki-kb | A tool to easily create Wikipedia-based embedding-ready knowledge bases for... | 11 | Experimental |
| 116 | claudiu1989/Words-embeddings-for-Romanian-language | Code for generating word2vec embeddings for the Romanian language, and some... | 11 | Experimental |
| 117 | AlexisTercero55/word2vec_TMB | Word2Vec C model by Tomas Mikolov from svn Google's repo. | 11 | Experimental |
| 118 | ManasBarman229/Assamese-GloVe-Embedding-Model | Assamese GloVe Embedding Model: Pre-trained models generated from a large... | 11 | Experimental |
| 119 | sindre0830/Word-Vectors | This repository implements different architectures for training word embeddings. | 11 | Experimental |
| 120 | arsena-k/Exploring_WordEmbeddings | Intro to Word Embeddings and Applications | 10 | Experimental |
| 121 | hi-primus/doc2use | A library to generate embeddings for Javascript and Python code, so you can... | 10 | Experimental |
| 122 | FranzDiebold/embeddings-talk | Embeddings Talk "The Wonderful World of Embeddings" | 10 | Experimental |
| 123 | iafarhan/skipgram-word2vec | Iteration based method to learn Word Vectors. Word2vec is a method whose... | 10 | Experimental |
| 124 | EsterHlav/Contextualized-Word-Vectors-CoVe-Learned-in-Translation | CoVe embedding training from-scratch using biLSTM with attention with... | 10 | Experimental |
| 125 | nphdang/Trans2Vec | Learning embeddings for transactions via frequent itemsets, Word2Vec, and Doc2Vec | 10 | Experimental |
| 126 | aditeyabaral/word2vec-c | Finding Distributed Representations of Words using C - Implementation of... | 10 | Experimental |
| 127 | ArevikKH/Armenian-Word2Vec | Train a custom Word2Vec model on Armenian news articles using the... | 10 | Experimental |
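
The tier labels in the listing above appear to follow score bands. The sketch below encodes the bands as they can be read off this page only: every listed score of 51 or more is Established, 30 to 49 is Emerging, and 29 or less is Experimental. The exact cut-offs are an assumption (no project here scores exactly 50, so treating 50 as Emerging relies on the "above 50" wording), the function name `tier_for_score` is hypothetical, and the scoring service may define further tiers that do not occur in this category.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 score to a tier label.

    Thresholds are inferred from the listing on this page, not from
    any published definition, so treat them as assumptions.
    """
    if score > 50:       # "score above 50 (Established tier)"
        return "Established"
    if score >= 30:      # lowest Emerging score in the listing is 30
        return "Emerging"
    return "Experimental"  # highest Experimental score in the listing is 29
```

For example, `tier_for_score(65)` gives "Established" for shibing624/text2vec, matching its row in the listing.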