CUNY-CL/wikipron
Massively multilingual pronunciation mining
WikiPron helps linguists and researchers easily collect pronunciation data from Wiktionary. It takes a language's ISO code and desired dialects, then outputs a list of words with their corresponding phonetic transcriptions in the International Phonetic Alphabet (IPA). This tool is for anyone building language resources or conducting linguistic analysis.
363 stars. Available on PyPI.
Use this if you need to gather extensive word-pronunciation pairs for many languages to support research, build language learning tools, or develop speech technologies.
Not ideal if you need human-verified pronunciation data or are working with languages not extensively covered in Wiktionary.
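As a rough illustration of working with the word and pronunciation pairs described above, the sketch below groups transcriptions by word. It assumes the output has been saved as tab-separated lines (word, then IPA transcription with space-separated segments). The function name `load_pronunciations` and the sample rows are hypothetical, written for illustration rather than taken from actual scraped data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def load_pronunciations(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group IPA transcriptions by word from tab-separated lines.

    Assumes each non-empty line is "<word>\t<transcription>".
    A word may appear on several lines, once per pronunciation.
    """
    lexicon: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, pron = line.split("\t", 1)
        lexicon[word].append(pron)
    return dict(lexicon)


# Hypothetical sample rows in the assumed format (French).
sample = [
    "chat\tʃ a\n",
    "eau\to\n",
    "fils\tf i s\n",
    "fils\tf i l\n",
]

lexicon = load_pronunciations(sample)
print(lexicon["fils"])  # ['f i s', 'f i l']
```

Keeping a list per word preserves homographs with more than one pronunciation, which a plain word-to-string dictionary would silently overwrite.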
Stars: 363
Forks: 77
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 03, 2026
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/CUNY-CL/wikipron"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
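The same request can be sketched in Python with only the standard library. The path layout (`quality/<category>/<owner>/<repo>`) is inferred from the single URL above, and the JSON response format is an assumption, since the response is not documented here. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the endpoint returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = quality_url("nlp", "CUNY-CL", "wikipron")
print(url)
# To perform the request (counts against the daily limit):
# data = fetch_quality("nlp", "CUNY-CL", "wikipron")
```

The network call is left commented out so that running the sketch does not use up any of the 100 unauthenticated requests per day.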
Related tools
isaacus-dev/semchunk
A fast, lightweight and easy-to-use Python library for splitting text into semantically...
chatopera/Synonyms
Chinese synonyms: a toolkit for chatbots and intelligent question answering
jacksonllee/pylangacq
Language Acquisition Research Tools
goodmami/wn
A modern, interlingual wordnet interface for Python
UCREL/pymusas
Python Multilingual Ucrel Semantic Analysis System