iflytek/cino

CINO: Pre-trained Language Models for Chinese Minority (少数民族语言预训练模型)

Score: 44 / 100 (Emerging)

This project offers pre-trained language models designed for text in Chinese minority languages and dialects. The models take raw text in languages such as Tibetan, Mongolian, Uyghur, Kazakh, Korean, Zhuang, and Cantonese and produce learned representations of it, which can then be used for downstream tasks such as text classification. It is aimed at researchers, linguists, and content managers who need to analyze or process large volumes of text in these languages.
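The text-in, representation-out flow described above can be sketched as follows. This is a minimal illustration, not the project's documented usage: it assumes the checkpoints are published on the Hugging Face Hub under an ID such as `hfl/cino-base-v2` and load through the generic `transformers` Auto classes, neither of which is stated on this page. The helper name `embed` and the mean-pooling step are illustrative choices.

```python
# Assumption: the Hub ID below is not given on this page; substitute the
# checkpoint name from the project's own README if it differs.
MODEL_ID = "hfl/cino-base-v2"


def embed(texts, model_id=MODEL_ID):
    """Return one mean-pooled sentence vector per input text (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy deps.
    # Requires: pip install torch transformers sentencepiece
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    # Tokenize the batch, padding shorter texts to the longest one.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

    # Average token vectors, ignoring padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)


if __name__ == "__main__":
    # Short Tibetan and Uyghur strings used purely as sample input.
    vectors = embed(["བོད་ཡིག", "ئۇيغۇرچە"])
    print(vectors.shape)
```

The resulting vectors could feed a lightweight classifier. For a task like text classification, fine-tuning the full model on labeled data is the more common route.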

262 stars. No commits in the last 6 months.

Use this if you need to build applications or conduct research that accurately understands and processes text in Chinese minority languages like Tibetan, Mongolian, Uyghur, or Cantonese.

Not ideal if your primary focus is on processing standard Mandarin Chinese or other widely spoken global languages, as other models may be more efficient or comprehensive for those languages.

minority-language-processing text-analysis language-understanding linguistics-research content-categorization
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 16 / 25


Stars: 262
Forks: 32
Language: Python
License: Apache-2.0
Last pushed: Jul 15, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/iflytek/cino"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
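The same request can be made from Python using only the standard library. This is a hedged sketch: the `category/owner/repo` path layout is inferred from the single curl example above, and the response is assumed to be JSON, which this page does not state. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the endpoint returns a JSON body."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid polling in a loop.
    record = fetch_quality("transformers", "iflytek", "cino")
    print(json.dumps(record, indent=2, ensure_ascii=False))
```

Because the response schema is not documented here, the sketch prints the whole record and does not read specific fields.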