kateryna-bobrovnyk/ukr-twi-corpus

A corpus of Ukrainian Twitter texts + instructions for downloading and filtering texts.


Experimental

This project provides a large collection of Ukrainian Twitter posts, along with tools for collecting more. It is designed for researchers and analysts who study social media trends, language use, or public sentiment in Ukrainian online discussions. You get a pre-built dataset of Ukrainian tweets and can use the provided scripts to expand and refine your own collections.

No commits in the last 6 months.

Use this if you are a linguist, social scientist, or data analyst studying Ukrainian language, social media, or public opinion.

Not ideal if you need real-time data or require a corpus for languages other than Ukrainian.

social-media-research linguistics text-analysis public-opinion Ukrainian-studies

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 14 / 25


Language: Jupyter Notebook

License: none

Higher-rated alternatives

Helsinki-NLP/OpusFilter

OpusFilter - Parallel corpus processing toolkit

natasha/corus

Links to Russian corpora + Python functions for loading and parsing

darija-open-dataset/dataset

darija <-> english dataset

omicsNLP/Auto-CORPus

Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London...

SergeyShk/ruTS

A library for extracting statistics from Russian-language texts.
