stdlib-js/datasets-cmudict
The Carnegie Mellon Pronouncing Dictionary (CMUdict).
This dataset is a machine-readable pronouncing dictionary for North American English. It maps English words and punctuation marks to phonetic spellings in ARPAbet notation and includes information about the individual phonetic sounds. Speech recognition engineers, linguists, and computational phoneticians can use it to build applications or conduct research on spoken language.
Use this if you need a machine-readable dictionary of English word pronunciations and phonetic details for North American English.
Not ideal if you need a pronunciation dictionary for languages other than North American English or require a different phonetic transcription system.
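As a rough illustration of the word-to-ARPAbet mapping described above, the sketch below parses a few hard-coded lines in CMUdict's plain-text layout and looks up a word. It does not use the package's own API: the sample lines and the helper names `parseEntries`, `lookup`, and `stripStress` are assumptions made for this example only.

```javascript
'use strict';

// Hard-coded sample lines in CMUdict's plain-text layout (word, whitespace, phones).
// Illustrative only: nothing here is loaded from the package itself.
var SAMPLE = [
    'HELLO  HH AH0 L OW1',
    'HELLO(1)  HH EH0 L OW1',
    'WORLD  W ER1 L D'
].join( '\n' );

// Parse lines into a map: WORD -> array of pronunciations (each an array of phones).
function parseEntries( text ) {
    var dict = Object.create( null );
    text.split( '\n' ).forEach( function onLine( line ) {
        // An optional "(n)" suffix marks an alternate pronunciation of the same word.
        var m = /^(\S+?)(?:\(\d+\))?\s+(.+)$/.exec( line.trim() );
        if ( m === null ) {
            return; // skip blank or malformed lines
        }
        var word = m[ 1 ];
        if ( dict[ word ] === void 0 ) {
            dict[ word ] = [];
        }
        dict[ word ].push( m[ 2 ].split( /\s+/ ) );
    });
    return dict;
}

// Case-insensitive lookup (hypothetical helper); returns [] for unknown words.
function lookup( dict, word ) {
    return dict[ word.toUpperCase() ] || [];
}

// Drop the trailing stress digit (0, 1, or 2) that vowel phones carry in CMUdict.
function stripStress( phones ) {
    return phones.map( function strip( p ) {
        return p.replace( /\d$/, '' );
    });
}

var dict = parseEntries( SAMPLE );
console.log( lookup( dict, 'hello' ) );
console.log( stripStress( lookup( dict, 'world' )[ 0 ] ).join( ' ' ) );
```

Keeping every pronunciation in an array per word means alternate readings of the same spelling are preserved rather than overwritten.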
Stars: 16
Forks: 1
Language: JavaScript
License: not listed
Category: not listed
Last pushed: Mar 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/stdlib-js/datasets-cmudict"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
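The same request can be sketched in JavaScript. This is a minimal example under stated assumptions: the path is taken to be category/owner/repo (inferred from the single URL shown above), the response is read as raw text because its format is not described here, and `buildUrl` and `fetchQuality` are hypothetical helper names. How an API key would be supplied is not shown on this page, so the sketch omits it.

```javascript
'use strict';

var BASE = 'https://pt-edge.onrender.com/api/v1/quality';

// Assumption: path segments are category/owner/repo, inferred from the one URL shown.
function buildUrl( category, owner, repo ) {
    return [ BASE, category, owner, repo ].map( function enc( s, i ) {
        // Leave the base URL as-is; percent-encode each path segment.
        return ( i === 0 ) ? s : encodeURIComponent( s );
    }).join( '/' );
}

// Fetch the record as raw text; the response format is not documented on this page.
// Requires a runtime with a global fetch (for example Node.js 18 or later).
async function fetchQuality( category, owner, repo ) {
    var res = await fetch( buildUrl( category, owner, repo ) );
    if ( !res.ok ) {
        throw new Error( 'Request failed with status ' + res.status );
    }
    return res.text();
}

// Usage (not run here, to avoid spending the daily request quota):
// fetchQuality( 'nlp', 'stdlib-js', 'datasets-cmudict' ).then( console.log );
```

Checking `res.ok` before reading the body makes it easier to notice when a request is rejected, for instance after the daily limit is reached.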
Higher-rated alternatives
Alir3z4/python-stop-words
Get list of common stop words in various languages in Python
hklemp/dotnet-stop-words
Get list of common stop words in various languages in dotnet
skupriienko/Ukrainian-Stopwords
A list of ~2000 Ukrainian stopwords (with numbers)
igorbrigadir/stopwords
Default English stopword lists from many different sources
stdlib-js/datasets-savoy-stopwords-fr
A list of French stop words.