eklem/stopword-trainer

A module for creating stopword lists for any language, based on a set of documents.

41 / 100 (Emerging)

This tool helps anyone working with text data create customized lists of stopwords: common words such as 'the', 'a', or 'is' that are often removed before analysis. You provide a collection of documents, and it generates a stopword list tailored to that specific content or language. This is useful for data scientists, linguists, or anyone preparing text for machine learning, search, or content analysis.
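To make the idea of a corpus-derived stopword list concrete, here is a minimal sketch. It does not use stopword-trainer's actual API, which this page does not describe; the functions `tokenize` and `buildStopwordList` and the options `minDocRatio` and `maxWords` are hypothetical names chosen for illustration. The sketch assumes one simple heuristic: a word is a stopword candidate if it appears in a large share of the documents.

```javascript
// Illustrative sketch only: NOT the stopword-trainer API.
// All function and option names here are hypothetical.

// Split text into lowercase word tokens (Unicode letters only).
function tokenize(text) {
  return text.toLowerCase().match(/\p{L}+/gu) || [];
}

// Return words that occur in at least `minDocRatio` of the documents,
// ordered by document frequency (ties broken alphabetically).
function buildStopwordList(documents, { minDocRatio = 0.8, maxWords = 50 } = {}) {
  const docFrequency = new Map();
  for (const doc of documents) {
    // Use a Set so each word counts at most once per document.
    for (const term of new Set(tokenize(doc))) {
      docFrequency.set(term, (docFrequency.get(term) || 0) + 1);
    }
  }
  return [...docFrequency.entries()]
    .filter(([, count]) => count / documents.length >= minDocRatio)
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, maxWords)
    .map(([term]) => term);
}

const docs = [
  "The cat sat on the mat.",
  "The dog chased the cat.",
  "A bird is in the tree.",
];
const stopwords = buildStopwordList(docs, { minDocRatio: 0.6 });
console.log(stopwords); // [ 'the', 'cat' ]
```

In this tiny corpus 'cat' lands on the list alongside 'the' because it appears in two of the three documents. That is the point of deriving the list from your own content: words that carry little information in one domain may not appear in any generic list.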

Available on npm.

Use this if you need to build highly relevant stopword lists for a specific domain, language, or evolving content, rather than relying on generic, pre-defined lists.

Not ideal if you simply need a standard, off-the-shelf stopword list for a common language without any customization.

natural-language-processing text-analysis information-retrieval content-analysis linguistics
No Dependents
Maintenance 10 / 25
Adoption 6 / 25
Maturity 25 / 25
Community 0 / 25


Stars: 15
Forks:
Language: JavaScript
License: MIT
Last pushed: Feb 28, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/eklem/stopword-trainer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
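The same request can be made from JavaScript. The sketch below assumes Node.js 18 or later, where `fetch` is available globally. The helper names `buildQualityUrl` and `fetchQuality` are hypothetical, and so is the reading of the path segments as category, owner, and repository. The response format is not documented on this page, so the body is returned as raw text rather than parsed.

```javascript
// Sketch only: helper names are hypothetical, and the response format
// is not documented here, so the body is returned as plain text.
const BASE_URL = "https://pt-edge.onrender.com/api/v1/quality";

// Build the endpoint URL from its path segments
// (assumed to be category / owner / repository).
function buildQualityUrl(category, owner, repo) {
  const segments = [category, owner, repo].map(encodeURIComponent);
  return [BASE_URL, ...segments].join("/");
}

// Fetch the quality data; requires a runtime with global fetch (Node 18+).
async function fetchQuality(category, owner, repo) {
  const response = await fetch(buildQualityUrl(category, owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Usage (performs a network request, counted against the daily limit):
// fetchQuality("nlp", "eklem", "stopword-trainer").then(console.log);
```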