YRL-AIDA/RuTaBERT

RuTaBERT is a framework for solving column type and property annotation problems based on fine-tuning a pre-trained language model (e.g., BERT) using a large-scale corpus of Russian-language tables.

Experimental

This project helps data professionals, data scientists, and analysts working with large volumes of Russian-language tables. It automatically identifies the type or category of data within each column (e.g., city, date, product ID). You input a CSV file containing your Russian table data, and it outputs labels for each column, telling you what kind of information it holds. This saves significant manual effort in data preparation and understanding.
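
The CSV-in, labels-out flow described above can be sketched roughly as follows. This is a minimal illustration of the input and output shapes only, not RuTaBERT's actual interface: `read_columns`, `annotate_columns`, and `placeholder_classifier` are hypothetical names introduced here, and the rule-based classifier merely stands in for a fine-tuned model.

```python
import csv
import io
from typing import Callable


def read_columns(csv_text: str) -> dict[str, list[str]]:
    """Parse CSV text and map each header name to that column's cell values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns: dict[str, list[str]] = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            # Short rows yield None for missing cells; store them as empty strings.
            columns[name].append(row.get(name) or "")
    return columns


def placeholder_classifier(values: list[str]) -> str:
    """Hypothetical stand-in for a fine-tuned model (NOT RuTaBERT's API)."""
    non_empty = [v for v in values if v]
    if non_empty and all(v.isdigit() for v in non_empty):
        return "number"
    return "text"


def annotate_columns(
    csv_text: str,
    classify: Callable[[list[str]], str] = placeholder_classifier,
) -> dict[str, str]:
    """Return one label per column, keyed by the column's header."""
    return {name: classify(values) for name, values in read_columns(csv_text).items()}


# A tiny Russian-language table: city names and population counts.
sample = "город,население\nМосква,13010112\nКазань,1308660\n"
labels = annotate_columns(sample)
print(labels)
```

In a real setup, the `classify` argument would be replaced by a call into the trained model; the surrounding column extraction would stay much the same.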

No commits in the last 6 months.

Use this if you need to categorize the columns of many Russian-language tables quickly and accurately, without inspecting them manually.

Not ideal if your tables are primarily in languages other than Russian, or if you require column type annotation for highly specialized, non-standard data types not typically found in general knowledge bases.

data-classification table-analysis data-tagging semantic-annotation Russian-data-processing

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 9 / 25

Language

Python

License

MIT

Higher-rated alternatives

Tongjilibo/bert4torch

An elegant PyTorch implementation of transformers

nyu-mll/jiant

jiant is an NLP toolkit

lonePatient/TorchBlocks

A PyTorch-based toolkit for natural language processing

monologg/JointBERT

PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"

grammarly/gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite"...
