open-chinese/poetry-collection
Chinese Poetry Collection (《诗歌总集》): the most comprehensive and systematic Chinese poetry dataset to date, with unified data modeling.
This project offers the most comprehensive and systematically organized collection of Chinese poetry, covering poems, lyrics, and prose from the Classic of Poetry through the Qing Dynasty. It takes raw, unorganized poetry texts and standardizes them into a unified JSON format that is easy to use for analysis. It is aimed at researchers, educators, and language enthusiasts who work with classical Chinese literature.
Use this if you need a high-quality, systematically structured dataset of Chinese classical poetry for research, educational purposes, or computational analysis.
Not ideal if you are looking for modern Chinese poetry or need extensive scholarly annotations and contextual information beyond basic appreciation and background.
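As a rough illustration of what working with a unified JSON format could look like, here is a minimal sketch. The field names (`title`, `author`, `dynasty`, `content`) and the helper functions are hypothetical and are not taken from the repository, so check the actual schema before relying on them. The two sample records are short excerpts of well-known poems, included only to make the sketch runnable.

```python
import json
from collections import Counter

# Hypothetical record layout: the real dataset's field names may differ.
SAMPLE_JSON = """
[
  {"title": "静夜思", "author": "李白", "dynasty": "唐",
   "content": "床前明月光，疑是地上霜。举头望明月，低头思故乡。"},
  {"title": "关雎", "author": "佚名", "dynasty": "先秦",
   "content": "关关雎鸠，在河之洲。窈窕淑女，君子好逑。"}
]
"""


def poems_per_dynasty(records):
    """Count how many records belong to each dynasty."""
    return Counter(record["dynasty"] for record in records)


def character_count(text):
    """Count characters in a poem, skipping punctuation and whitespace."""
    return sum(1 for ch in text if ch.isalpha())


records = json.loads(SAMPLE_JSON)
dynasty_counts = poems_per_dynasty(records)
first_poem_length = character_count(records[0]["content"])
```

Because every record shares one shape, corpus-level statistics reduce to a single pass over the list, which is the practical benefit of a standardized format.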
Stars: 37
Forks: 7
Language: Python
License: MIT
Category:
Last pushed: Jan 06, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/open-chinese/poetry-collection"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
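The same request can be made from Python using only the standard library. This is a minimal sketch: the URL pattern is copied from the curl command above, while the helper names `build_url` and `fetch_quality` are illustrative, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner, repo):
    """Build the endpoint URL for a given GitHub owner/repo pair."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON; the response schema is not
    documented on this page, so the result is returned unmodified.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily limit):
# data = fetch_quality("open-chinese", "poetry-collection")
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

With the keyless limit of 100 requests/day, it is worth caching responses locally when checking many repositories.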
Higher-rated alternatives
monarch-initiative/ontogpt
LLM-based ontological extraction tools, including SPIRES
weAIDB/awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
AXYZdong/AMchat
AM (Advanced Mathematics) Chat is a large language model that integrates advanced mathematical...
skywalker023/sodaverse
🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with...
Y-Research-SBU/TimeSeriesScientist
Official Repository for TimeSeriesScientist