YRL-AIDA/RuTaBERT

RuTaBERT is a framework for column type and property annotation, based on fine-tuning a pre-trained language model (e.g., BERT) on a large-scale corpus of Russian-language tables.

29 / 100 Experimental

This project helps data professionals, data scientists, and analysts working with large volumes of Russian-language tables. It automatically identifies the type or category of data within each column (e.g., city, date, product ID). You input a CSV file containing your Russian table data, and it outputs labels for each column, telling you what kind of information it holds. This saves significant manual effort in data preparation and understanding.
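To make the CSV-in, labels-out workflow concrete, here is a minimal sketch of the preparation step only: reading a CSV and grouping cell values by column, which is the unit a column type annotator labels. This is not RuTaBERT's API. The helper names `columns_from_csv` and `serialize_column`, the `max_values` cap, and the space-joined serialization are all assumptions made for illustration; the project's actual input handling may differ.

```python
import csv
import io


def columns_from_csv(text, max_values=5):
    """Group the first few non-empty cell values of a CSV by column header.

    Hypothetical helper for illustration; not part of RuTaBERT.
    Assumes the first row is a header with unique column names.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, cell in zip(header, row):
            cell = cell.strip()
            # Keep only a small sample of non-empty values per column.
            if cell and len(columns[name]) < max_values:
                columns[name].append(cell)
    return columns


def serialize_column(values, sep=" "):
    """Join a column's sampled values into one string for a text model (assumed format)."""
    return sep.join(values)


# Tiny Russian-language table: a city column and a date column.
sample = "город,дата\nМосква,2024-01-05\nКазань,2024-02-11\n"
cols = columns_from_csv(sample)
```

Each serialized column string would then be passed to whatever fine-tuned model produces the label (for example "city" or "date"); that step depends on the project's own code and is left out here.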

No commits in the last 6 months.

Use this if you need to categorize the columns of many Russian-language tables quickly, without manual inspection.

Not ideal if your tables are primarily in languages other than Russian, or if you require column type annotation for highly specialized, non-standard data types not typically found in general knowledge bases.

data-classification table-analysis data-tagging semantic-annotation Russian-data-processing
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 9 / 25

Stars 7
Forks 1
Language Python
License MIT
Last pushed Mar 27, 2025
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/YRL-AIDA/RuTaBERT"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
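The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern is inferred from the single curl example above (with `transformers` treated as a category path segment), and the response is assumed to be JSON, which the page does not state. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is an inferred pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner, repo, category="transformers"):
    """Build the endpoint URL for one repository (pattern assumed from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner, repo, category="transformers", timeout=10):
    """Fetch the quality record; assumes the endpoint returns a JSON body."""
    with urllib.request.urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network request, counted against the daily limit):
# data = fetch_quality("YRL-AIDA", "RuTaBERT")
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

Keeping the URL builder separate from the network call makes the pattern easy to check without spending any of the 100 keyless requests per day.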