SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.
This project offers a collection of pre-trained models that understand and generate text, work through complex mathematical problems, hold conversations, and interpret images. The models take text or images as input. They suit data scientists, machine learning engineers, and researchers who build or extend AI-powered applications, especially ones that need strong Chinese and English language capabilities.
1,491 stars. No commits in the last 6 months.
Use this if you need powerful, open-source language models with exceptional performance in multilingual understanding, creative writing, or advanced mathematics.
Not ideal if you are looking for a plug-and-play end-user application rather than foundational AI models and datasets for development.
Stars: 1,491
Forks: 145
Language: Python
License: not specified
Category:
Last pushed: Mar 07, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/SkyworkAI/Skywork"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
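For scripted use, the same request can be made from Python. The sketch below is a minimal example using only the standard library. It assumes the URL path follows a `category/owner/repo` pattern (inferred from the single URL shown above, where `nlp` is taken to be the category) and that the response body is JSON; neither is confirmed by this page. The helper names `quality_url` and `fetch_quality` are hypothetical. No API key is sent, because the page does not describe how a key is passed.

```python
import json
import urllib.parse
import urllib.request

# Base of the endpoint shown in the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON. The response schema is not
    documented here, so the decoded value is returned unmodified.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("nlp", "SkyworkAI", "Skywork")` would issue the same request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result rather than calling the endpoint repeatedly.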
Higher-rated alternatives
mikahama/uralicNLP
An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also...
gia-uh/lingo
A Python library for context engineering.
shamspias/lexsublm-lite
A laptop‑friendly toolkit for context‑aware single‑word paraphrasing and lexical‑substitution...
AragonerUA/SampoNLP
A corpus-free toolkit for morphological lexicon creation and tokenizer evaluation using...
jiangnanboy/llm_corpus_quality
Chinese corpus cleaning and quality evaluation for large-model pre-training