hiDaDeng/Chinese-Pretrained-Word-Embeddings
A roundup of Chinese text analysis tools, corpora, and pre-trained model resources.
This project offers a collection of pre-trained Chinese word embedding models (GloVe and Word2Vec) trained on diverse Chinese text sources such as government reports, court judgments, stock market annual reports, and consumer reviews. The models are built from raw Chinese text corpora and published as downloadable word embeddings. Professionals in social science, market research, and legal analysis can use them to explore semantic relationships and context within their specialized Chinese texts.
144 stars. No commits in the last 6 months.
Use this if you need pre-trained word embeddings for Chinese text analysis in fields like social science, legal research, or market analysis, and want models specifically trained on relevant Chinese corpora.
Not ideal if your primary text data is in a language other than Chinese or if you require highly specialized embeddings for domains not covered by the existing corpora.
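To illustrate the kind of semantic-relationship lookup that word embeddings support, here is a minimal sketch using only the Python standard library. It assumes a model exported in the common plain-text word2vec layout (a `count dim` header followed by one `word v1 ... vN` row per word); whether this project's downloads use that layout is an assumption, and the function names and toy vectors are hypothetical and not taken from the project's models.

```python
import io
import math
from typing import Dict, List, TextIO, Tuple


def load_text_vectors(handle: TextIO) -> Dict[str, List[float]]:
    """Parse the plain-text word2vec layout: 'count dim' header, then 'word v1 ... vN' rows."""
    header = handle.readline().split()
    dim = int(header[1])
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed rows
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(vectors: Dict[str, List[float]], word: str, topn: int = 3) -> List[Tuple[str, float]]:
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Made-up 2-dimensional toy vectors, for illustration only.
SAMPLE = "3 2\n法院 1.0 0.0\n判决 0.9 0.1\n股票 0.0 1.0\n"
toy = load_text_vectors(io.StringIO(SAMPLE))
print(most_similar(toy, "法院", topn=2))
```

For real models, a library such as gensim is usually a better fit than hand-rolled parsing; the sketch only shows the underlying idea of ranking words by vector similarity.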
Stars: 144
Forks: 30
Language: n/a
License: MIT
Category:
Last pushed: Sep 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hiDaDeng/Chinese-Pretrained-Word-Embeddings"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
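The same request can be made from a script. The sketch below rebuilds the URL from the curl example with Python's standard library. It assumes the path follows a `category/owner/repo` pattern, which is inferred from that single URL rather than from any API documentation; the function names are hypothetical, and the response body is returned as raw text because its schema is not described here.

```python
import urllib.parse
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Join the path segments (assumed order: category/owner/repo), percent-encoding each."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Perform the GET request and return the raw response body as text (not parsed)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("nlp", "hiDaDeng", "Chinese-Pretrained-Word-Embeddings"))
```

Keeping URL construction separate from the network call makes it easy to check the address without spending any of the daily request quota.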
Higher-rated alternatives
NateScarlet/holiday-cn
📅🇨🇳 Chinese statutory public holiday data, scraped automatically every day from State Council announcements
sagorbrur/bnlp
BNLP is a natural language processing toolkit for Bengali Language.
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
esbatmop/MNBVC
MNBVC(Massive Never-ending BT Vast Chinese...
houbb/sensitive-word
👮♂️ The sensitive word tool for Java. (Sensitive words / prohibited words / illegal words / profanity. A high-performance Java... built on the DFA algorithm...