Text Normalization Engines Voice AI Tools

Tools for normalizing written text into spoken forms across languages, handling numbers, dates, abbreviations, and special characters for TTS and speech processing. Does NOT include general text-to-speech synthesis, speech recognition, or audio processing.

17 text normalization engine tools are tracked. One scores above 50 (Established tier). The highest-rated is speechio/chinese_text_normalization at 51/100 with 722 stars.

Get all 17 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=text-normalization-engines&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
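The same request can be made from a script. The following is a minimal Python sketch that rebuilds the URL from the curl command above and decodes the response as JSON. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch returns the decoded payload without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai", subcategory="text-normalization-engines", limit=20):
    """Assemble the query URL (hypothetical helper, not part of any official client)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned as-is rather than indexed by assumed field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to change `limit` or `subcategory` and to stay within the daily request limit while experimenting.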

# | Tool | Description | Score | Tier
1 | speechio/chinese_text_normalization | Chinese text normalization for speech processing | 51 | Established
2 | NickZaitsev/ru-normalizr | ru-normalizr: the best open-source normalizer for Russian text. Converts... | 45 | Emerging
3 | gladiaio/normalization | A lightweight library for normalizing speech transcripts before computing WER | 43 | Emerging
4 | 34j/mecab-text-cleaner | Simple Python package (CLI/Python API) for getting Japanese readings... | 42 | Emerging
5 | repodiac/german_transliterate | Python module to clean and transliterate (i.e. normalize) German text... | 42 | Emerging
6 | google-research-datasets/TextNormalizationCoveringGrammars | Covering grammars for English and Russian text normalization | 39 | Emerging
7 | ducnt18121997/Viet-Text-Normalization | A Python library for text normalization, specifically designed for... | 36 | Emerging
8 | ScottishFold007/TTSAudioNormalizer | TTSAudioNormalizer is a specialized tool for TTS data production,... | 32 | Emerging
9 | tomaarsen/TTSTextNormalization | Convert English text from written expressions into spoken forms | 32 | Emerging
10 | stefantaubert/english-text-normalization | Command-line interface (CLI) and library to normalize English texts. | 31 | Emerging
11 | Agash/TTSTextNormalization | Modern .NET10 / C#14 library to normalize text (emojis, currency, numbers,... | 29 | Experimental
12 | NetherQuartz/TextForSpeechNormalizer | A Python library to accentuate Russian text | 28 | Experimental
13 | cewarman/NTPU_online_text_normalization | An online text normalization tool for Chinese-English mixed text-to-speech system | 25 | Experimental
14 | seanghay/khmertagger | KhmerTagger: Inverse Text Normalization for Khmer Automatic Speech Recognition | 22 | Experimental
15 | Amir79Naziri/TextNormalization_Project | Implementing text normalization for Farsi (Persian) language. | 22 | Experimental
16 | rafalposwiata/text-normalization | Repository for text normalization research. | 22 | Experimental
17 | bmwasaru/kiswahili-speech-normalization | Kiswahili text normalization utilities for speech datasets (ASR/TTS) | 14 | Experimental