NickZaitsev/ru-normalizr
ru-normalizr is the best open-source normalizer for Russian text. It converts numbers, dates, times, abbreviations, Roman numerals, symbols, and Latin-script words into Russian letters for use in TTS and NLP.
This tool helps transform written Russian text containing numbers, dates, times, abbreviations, Roman numerals, symbols, and Latin characters into fully spelled-out Russian words. It's designed for professionals working with speech synthesis (TTS) or natural language processing (NLP) who need text to be pronounced correctly or analyzed consistently. The input is raw Russian text, and the output is the same text with all non-standard elements converted to their spoken or canonical Russian word forms.
Available on PyPI.
Use this if you need to prepare Russian text for a text-to-speech system or for linguistic analysis, ensuring that numbers, acronyms, and foreign words are correctly rendered into spoken Russian.
Not ideal if you need a tool that handles stress marks for all Russian words, as it only adds them during Latin-to-Cyrillic conversion.
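To make the idea of "spelling out" concrete, here is a minimal toy sketch of one small piece of the task: replacing standalone integers from 0 to 999 with Russian cardinal words. It is an illustration only and does not use or reproduce ru-normalizr's API or implementation. The names `spell_number` and `normalize_numbers` are hypothetical, and the sketch ignores case, gender agreement, ordinals, dates, and everything else a real normalizer has to handle.

```python
# Toy illustration only: NOT ru-normalizr's API or implementation.
# Spells out standalone integers 0..999 in the nominative masculine form.
import re

ONES = ["ноль", "один", "два", "три", "четыре",
        "пять", "шесть", "семь", "восемь", "девять"]
TEENS = ["десять", "одиннадцать", "двенадцать", "тринадцать", "четырнадцать",
         "пятнадцать", "шестнадцать", "семнадцать", "восемнадцать", "девятнадцать"]
TENS = ["", "", "двадцать", "тридцать", "сорок",
        "пятьдесят", "шестьдесят", "семьдесят", "восемьдесят", "девяносто"]
HUNDREDS = ["", "сто", "двести", "триста", "четыреста",
            "пятьсот", "шестьсот", "семьсот", "восемьсот", "девятьсот"]


def spell_number(n: int) -> str:
    """Return the Russian cardinal for 0 <= n <= 999 (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n == 0:
        return ONES[0]
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words.append(HUNDREDS[hundreds])
    if 10 <= rest <= 19:
        # 10..19 have their own single-word forms
        words.append(TEENS[rest - 10])
    else:
        tens, ones = divmod(rest, 10)
        if tens:
            words.append(TENS[tens])
        if ones:
            words.append(ONES[ones])
    return " ".join(words)


def normalize_numbers(text: str) -> str:
    """Replace digit runs of length 1..3 with words; longer runs are left as-is."""
    pattern = r"(?<![0-9])[0-9]{1,3}(?![0-9])"
    return re.sub(pattern, lambda m: spell_number(int(m.group())), text)


print(normalize_numbers("В зале 25 мест"))
```

Even this narrow case shows why a dedicated library is useful: the correct word form of a Russian number depends on the grammatical context around it, which a lookup table like this one cannot capture.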
Stars: 8
Forks: 1
Language: Python
License: MIT
Category
Last pushed: Mar 16, 2026
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/NickZaitsev/ru-normalizr"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
speechio/chinese_text_normalization
Chinese text normalization for speech processing
gladiaio/normalization
A lightweight library for normalizing speech transcripts before computing WER
34j/mecab-text-cleaner
Simple Python package (CLI/Python API) for getting Japanese readings (yomigana) and accents using MeCab.
repodiac/german_transliterate
Python module to clean and transliterate (i.e. normalize) German text including abbreviations,...
google-research-datasets/TextNormalizationCoveringGrammars
Covering grammars for English and Russian text normalization