natasha/razdel

Rule-based token and sentence segmentation for the Russian language

56 / 100 (Established)

This tool splits Russian text into tokens (individual words and punctuation marks) and splits longer texts into separate sentences. You provide raw Russian text, and it returns the constituent parts in order. It suits linguists, researchers, and data analysts who process large volumes of Russian-language content.

279 stars. Used by 4 other packages. No commits in the last 6 months. Available on PyPI.

Use this if you need to accurately split Russian news articles, fiction, or similar formal texts into words and sentences for further analysis.

Not ideal if your Russian text comes from social media, scientific papers, or legal documents, as its rules are optimized for news and fiction.

Russian-language-processing text-analysis linguistics data-preparation NLP
Stale 6m
Maintenance 0 / 25
Adoption 14 / 25
Maturity 25 / 25
Community 17 / 25
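
The four category scores listed above add up to the overall 56 / 100. A quick check, assuming the overall score is a plain sum of the categories (the page itself does not state the formula):

```python
# Category scores as listed on this page, each out of 25.
scores = {
    "Maintenance": 0,
    "Adoption": 14,
    "Maturity": 25,
    "Community": 17,
}

# Assumption: the overall score is the unweighted sum of the four categories.
total = sum(scores.values())
maximum = 25 * len(scores)

print(f"{total} / {maximum}")
```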

How are scores calculated?

Stars: 279
Forks: 34
Language: Python
License: MIT
Last pushed: Jul 24, 2023
Commits (30d): 0
Reverse dependents: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/natasha/razdel"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
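
The same request can be made from Python with the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repository) is inferred from the single URL shown above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment order is inferred from the example."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request and counts against the daily limit):
#   data = fetch_quality("nlp", "natasha", "razdel")
#   print(data)
```

The URL builder is kept separate from the network call so it can be checked offline, and no field names are read from the result because the response schema is not documented here.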