wangyuxinwhy/uniem

unified embedding model

/ 100

Emerging

This project converts Chinese text into embeddings: numerical vectors that capture the meaning of the text and underpin many natural language processing tasks. It takes raw Chinese text as input and produces high-quality vectors as output. It is designed for AI developers, researchers, and data scientists building Chinese-language applications.

876 stars. No commits in the last 6 months.

Use this if you need to generate high-quality, general-purpose embeddings for Chinese text, or if you want to fine-tune existing embedding models with your own data for specific Chinese language tasks.

Not ideal if you primarily work with non-Chinese languages or if you need a solution for tasks beyond text embedding, classification, or retrieval.
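To make the retrieval and semantic search use case concrete, here is a minimal sketch of ranking Chinese sentences by cosine similarity between embedding vectors. It does not use uniem's API: `toy_embed`, `cosine`, and `search` are hypothetical names, and `toy_embed` is only a stand-in (hashed character bigrams) so the example runs without downloading a model. In practice you would pass a real embedding model's encode function as `embed`.

```python
import math
import zlib


def toy_embed(text, dim=256):
    """Hypothetical stand-in for a real embedding model.

    Hashes character bigrams into a fixed-size count vector and
    L2-normalises it. It captures surface overlap only, not meaning.
    """
    vec = [0.0] * dim
    for i in range(len(text) - 1):
        bucket = zlib.crc32(text[i:i + 2].encode("utf-8")) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, corpus, embed=toy_embed):
    """Return (score, document) pairs sorted from most to least similar."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


corpus = ["今天天气很好", "我喜欢吃苹果", "机器学习很有趣"]
results = search("今天天气怎么样", corpus)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

The ranking logic stays the same when `toy_embed` is swapped for a real model; only the quality of the vectors changes.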

Chinese NLP, text classification, information retrieval, semantic search, natural language understanding

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 17 / 25


Stars

876

Forks

Language

Python

License

Apache-2.0

Featured in

Embeddings Are Easier Than Whatever You're Doing Instead

You're Shipping AI You Can't Measure

Higher-rated alternatives

embeddings-benchmark/mteb

MTEB: Massive Text Embedding Benchmark

harmonydata/harmony

The Harmony Python library: a research tool for psychologists to harmonise data and...

yannvgn/laserembeddings

LASER multilingual sentence embeddings as a pip package

embeddings-benchmark/results

Data for the MTEB leaderboard

Hironsan/awesome-embedding-models

A curated list of awesome embedding models tutorials, projects and communities.
