SkyworkAI/Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.

/ 100

Emerging

This project offers a collection of pre-trained AI models that can understand and generate text, perform complex mathematical calculations, engage in conversations, and even interpret images. It takes raw data, text, or images as input and produces high-quality, relevant outputs. These models are ideal for data scientists, machine learning engineers, and researchers looking to build or enhance AI-powered applications, especially those requiring strong Chinese and English language capabilities.

1,491 stars. No commits in the last 6 months.

Use this if you need powerful, open-source language models with exceptional performance in multilingual understanding, creative writing, or advanced mathematics.

Not ideal if you are looking for a plug-and-play end-user application rather than foundational AI models and datasets for development.

natural-language-processing machine-learning-engineering AI-research multilingual-content-creation mathematical-problem-solving

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25

Stars: 1,491

Forks: 145

Language: Python

License: —

Higher-rated alternatives

mikahama/uralicNLP

An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also...

gia-uh/lingo

A Python library for context engineering.

shamspias/lexsublm-lite

A laptop‑friendly toolkit for context‑aware single‑word paraphrasing and lexical‑substitution...

AragonerUA/SampoNLP

A corpus-free toolkit for morphological lexicon creation and tokenizer evaluation using...

jiangnanboy/llm_corpus_quality

Chinese corpus cleaning and quality evaluation for large-model pre-training
