selimfirat/bilkent-turkish-writings-dataset
A compiled dataset of Turkish writings from courses that promote creativity, content, composition, grammar, spelling, and punctuation.
This is a collection of over 9,000 Turkish creative writing samples from university students, gathered from courses focused on developing composition, grammar, spelling, and punctuation. It provides raw text entries, often with instructor feedback, to help researchers and educators analyze Turkish language development and creative expression. The dataset is ideal for linguists, educational researchers, or anyone studying Turkish natural language processing.
No commits in the last 6 months.
Use this if you need a large, categorized dataset of real-world Turkish student writings for linguistic analysis, educational research, or developing AI models for Turkish text.
Not ideal if you need informal Turkish text, conversational data, or a dataset for commercial use, as it's specifically for academic purposes.
Stars: 54
Forks: 4
Language: Python
License: —
Category:
Last pushed: May 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/selimfirat/bilkent-turkish-writings-dataset"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
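The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the three path segments after `quality/` are a category, a repository owner, and a repository name (inferred from the curl URL above), and that the endpoint returns JSON. The response format is not documented here, so the code falls back to raw text. The function names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (performs a network request)."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    # Assumption: the body is JSON; return the raw text if it is not.
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (uncomment to send the request; counts against the daily limit):
# print(fetch_quality("nlp", "selimfirat", "bilkent-turkish-writings-dataset"))
```

With no key, each call counts against the 100 requests/day limit, so caching the result locally is worth considering for repeated lookups.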
Higher-rated alternatives
hltcoe/turkle
Django-based clone of Amazon's Mechanical Turk service running in your local environment.
emres/turkish-deasciifier
Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs
brolin59/trnlp
Natural language processing tools for Turkish
ooguz/turkce-kufur-karaliste
A blacklist for Turkish
ahmetaa/zemberek-nlp
NLP tools for Turkish.