noobiegz/cw2vec
Implementation of the cw2vec model
cw2vec helps practitioners working with Chinese text build better semantic search, recommendation, and text analysis systems. It takes a large corpus of Chinese text and produces a numerical vector (an embedding) for each word, so that related terms can be grouped by both meaning and character structure. It is aimed at data scientists and NLP engineers who work with Chinese text.
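As a rough illustration of what "character structure" can mean here, the sketch below builds stroke n-grams for a word, which is the kind of sub-character feature the cw2vec model is generally described as using. It is not taken from this repository: the `STROKES` table, the `stroke_ngrams` function, and the 3-to-12 n-gram range are assumptions made for the example, and the table covers only two characters.

```python
from typing import Dict, List

# Stroke classes (assumed encoding): 1 horizontal, 2 vertical,
# 3 left-falling, 4 right-falling, 5 turning.
# Hypothetical two-character table, for illustration only; a real
# system would need a full character-to-stroke dictionary.
STROKES: Dict[str, str] = {
    "大": "134",  # horizontal, left-falling, right-falling
    "人": "34",   # left-falling, right-falling
}


def stroke_ngrams(word: str, n_min: int = 3, n_max: int = 12) -> List[str]:
    """Return all stroke n-grams of a word, shortest n first.

    Raises KeyError if a character is missing from STROKES.
    """
    # Concatenate the stroke codes of every character in the word.
    seq = "".join(STROKES[ch] for ch in word)
    grams: List[str] = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the stroke sequence.
        for i in range(len(seq) - n + 1):
            grams.append(seq[i:i + n])
    return grams


print(stroke_ngrams("大人"))
# → ['134', '343', '434', '1343', '3434', '13434']
```

In a model of this kind, each n-gram would get its own vector and a word would be scored against its context through those vectors, so words that share stroke patterns can share information even when they are rare.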
No commits in the last 6 months.
Use this if you need word embeddings trained specifically for Chinese text, especially when traditional word-level methods fall short because they ignore the internal structure of Chinese characters.
Not ideal if you primarily work with English or other non-Chinese languages, or if you need the absolute fastest training time and are not concerned with leveraging character-level stroke information for Chinese.
Stars: 29
Forks: 15
Language: Python
License: not listed
Category:
Last pushed: Jul 20, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/noobiegz/cw2vec"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
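The same request can be made from Python using only the standard library. This is a minimal sketch: `build_url` and `fetch_quality` are hypothetical helper names, the split of the path into a category segment and a repository segment is inferred from the single URL above, and the response is assumed (not confirmed) to be JSON, with a fallback to raw text.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is
# assumed to be "<category>/<owner>/<repo>".
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository such as 'noobiegz/cw2vec'."""
    # Keep the slash between owner and repository name unescaped.
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint; return parsed JSON if possible, else raw text."""
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The response format is an assumption, so fall back to text.
        return body


# Example call (performs a network request):
# data = fetch_quality("embeddings", "noobiegz/cw2vec")
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` repeatedly.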
Higher-rated alternatives
shibing624/text2vec
text2vec, text to vector....
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings
ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.