wangyuxinwhy/uniem
unified embedding model
This project converts Chinese text into embeddings: numerical vectors that capture the meaning of the text and underpin many natural language processing tasks. It takes raw Chinese text as input and returns one vector per text. It is aimed at AI developers, researchers, and data scientists building Chinese-language applications.
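To make "vectors that capture the meaning of the text" concrete, here is a minimal sketch of how embeddings are typically compared. The three-dimensional vectors are hand-made stand-ins rather than output from this project, and the helper names `cosine_similarity` and `most_similar` are hypothetical; a real embedding model would produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors written by hand for illustration only (an assumption);
# they are not produced by uniem or by any real model.
toy_vectors = {
    "今天天气很好": [0.9, 0.1, 0.0],   # "The weather is nice today"
    "今天天气不错": [0.8, 0.2, 0.1],   # "The weather is pretty good today"
    "我喜欢编程": [0.0, 0.2, 0.9],     # "I like programming"
}


def most_similar(query, vectors):
    """Return the other sentence whose vector is closest to the query's."""
    candidates = [key for key in vectors if key != query]
    return max(
        candidates,
        key=lambda key: cosine_similarity(vectors[query], vectors[key]),
    )
```

With these toy values, the two weather sentences point in nearly the same direction, so a retrieval step based on cosine similarity would pair them and rank the programming sentence far lower.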
876 stars. No commits in the last 6 months.
Use this if you need to generate high-quality, general-purpose embeddings for Chinese text, or if you want to fine-tune existing embedding models with your own data for specific Chinese language tasks.
Not ideal if you primarily work with non-Chinese languages or if you need a solution for tasks beyond text embedding, classification, or retrieval.
Stars: 876
Forks: 72
Language: Python
License: Apache-2.0
Last pushed: Sep 01, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/wangyuxinwhy/uniem"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
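For scripted use, the same request can be made from Python with only the standard library. This is a sketch under assumptions: the page shows a single URL, so the `owner/repo` path pattern in `build_url` is inferred from it, the response is assumed to be JSON, and the function names are hypothetical. No response fields are shown because the page does not document them.

```python
import json
import urllib.request

# Base path taken from the curl example; treating the last two path
# segments as owner and repository is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner, repo):
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the record for one repository and parse it as JSON.

    Assumes the endpoint returns a JSON body; raises urllib.error.URLError
    on network failures and ValueError if the body is not valid JSON.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("wangyuxinwhy", "uniem")` issues the same GET request as the curl command; keep the stated 100 requests/day limit in mind if you loop over many repositories without a key.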
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.