hiDaDeng/Chinese-Pretrained-Word-Embeddings

A roundup of resources for Chinese text analysis: tools, corpora, and pre-trained models.

/ 100

Emerging

This project offers a collection of pre-trained Chinese word embedding models (GloVe and Word2Vec) derived from diverse Chinese text sources such as government reports, court judgments, stock market annual reports, and consumer reviews. The models are trained on raw Chinese text corpora and provided as downloadable word embedding files. Professionals in social sciences, market research, and legal analysis can use these models to explore semantic relationships and context within their specialized Chinese texts.
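To make "semantic relationships" concrete, here is a minimal sketch of how word embeddings are usually queried: each word maps to a vector, and cosine similarity ranks other words by closeness. The vectors and the helper names (`toy_vectors`, `cosine_similarity`, `most_similar`) are invented for illustration and do not come from this project's models. If a downloaded model happens to be in the word2vec text format (an assumption, since the file format is not stated here), gensim's `KeyedVectors.load_word2vec_format` could load it in place of the toy dictionary.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real pre-trained embeddings usually have far more dimensions.
toy_vectors = {
    "法院": [0.9, 0.1, 0.0],  # court
    "判决": [0.8, 0.2, 0.1],  # judgment
    "股票": [0.1, 0.9, 0.2],  # stock
    "年报": [0.2, 0.8, 0.3],  # annual report
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`."""
    query = vectors[word]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


# With these toy vectors, "判决" (judgment) ranks closest to "法院" (court).
print(most_similar("法院", toy_vectors))
```

With embeddings trained on a domain corpus such as court judgments, the same kind of query would surface the terms that co-occur with a word in that domain.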

144 stars. No commits in the last 6 months.

Use this if you need pre-trained word embeddings for Chinese text analysis in fields like social science, legal research, or market analysis, and want models specifically trained on relevant Chinese corpora.

Not ideal if your primary text data is in a language other than Chinese or if you require highly specialized embeddings for domains not covered by the existing corpora.

Chinese-text-analysis social-sciences-research legal-document-analysis market-research public-opinion-analysis

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25

Stars

144

Forks

Language

—

License

MIT

Higher-rated alternatives

NateScarlet/holiday-cn

📅🇨🇳 Chinese statutory holiday data, scraped automatically every day from State Council announcements

sagorbrur/bnlp

BNLP is a natural language processing toolkit for Bengali Language.

brightmart/nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

esbatmop/MNBVC

MNBVC(Massive Never-ending BT Vast Chinese...

houbb/sensitive-word

👮‍♂️ The sensitive word tool for Java. (Sensitive words / banned words / illegal words / profanity. A high-performance, DFA-based java...
