houbb/sensitive-word
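👮♂️ The sensitive word tool for Java. (Sensitive words / prohibited words / illegal words / profanity. A high-performance Java sensitive-word filtering framework based on the DFA algorithm, with built-in support for classifying and grading words by tag. Please do not post content involving politics, advertising, marketing, circumvention of internet restrictions, or violations of national laws and regulations. A high-performance sensitive-word detection and filtering component that also provides Traditional/Simplified Chinese conversion, full-width/half-width conversion, Chinese-to-pinyin conversion, fuzzy search, and other features.)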
This tool helps businesses and platforms automatically detect and filter out inappropriate or forbidden language from user-generated content, comments, or documents. It takes raw text as input and outputs either a confirmation that no sensitive words were found, a list of identified sensitive words, or a 'cleaned' version of the text with sensitive words replaced. Content moderators, community managers, and platform administrators can use this to maintain a safe and compliant online environment.
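The three outcomes described above (a yes/no check, a list of matched words, and a cleaned version of the text) can be illustrated with a minimal trie-based DFA sketch in Java. This is not the library's code or its API: the class `SensitiveWordSketch`, its methods, and the sample word list are hypothetical names chosen for illustration, and the sketch assumes a leftmost-longest matching rule and a fixed mask character.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the sensitive-word library's API or implementation. */
class SensitiveWordSketch {
    /** One automaton state: outgoing transitions plus an accepting flag. */
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean endOfWord;
    }

    private final Node root = new Node();

    /** Builds the automaton (a trie) from the word list. */
    SensitiveWordSketch(List<String> words) {
        for (String word : words) {
            if (word.isEmpty()) continue;
            Node node = root;
            for (char c : word.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.endOfWord = true;
        }
    }

    /** Returns the length of the longest word starting at {@code start}, or 0 if none. */
    private int longestMatchAt(String text, int start) {
        Node node = root;
        int best = 0;
        for (int i = start; i < text.length(); i++) {
            node = node.next.get(text.charAt(i));
            if (node == null) break;
            if (node.endOfWord) best = i - start + 1;
        }
        return best;
    }

    /** True if any listed word occurs anywhere in the text. */
    boolean contains(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (longestMatchAt(text, i) > 0) return true;
        }
        return false;
    }

    /** Lists matched words, scanning left to right and preferring the longest match. */
    List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int len = longestMatchAt(text, i);
            if (len > 0) {
                found.add(text.substring(i, i + len));
                i += len;
            } else {
                i++;
            }
        }
        return found;
    }

    /** Returns the text with every matched word overwritten by the mask character. */
    String replace(String text, char mask) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int len = longestMatchAt(text, i);
            if (len > 0) {
                for (int k = 0; k < len; k++) out.append(mask);
                i += len;
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SensitiveWordSketch filter = new SensitiveWordSketch(List.of("spam", "scam", "scammer"));
        String text = "No spam, no scammers.";
        System.out.println(filter.contains(text));      // true
        System.out.println(filter.findAll(text));       // [spam, scammer]
        System.out.println(filter.replace(text, '*'));  // No ****, no *******s.
    }
}
```

Each character of the input moves the automaton along at most one transition, so a scan from a given position stops as soon as no listed word can continue. The sketch works on Java `char` values and does no normalization; features the description mentions, such as full-width/half-width or Traditional/Simplified conversion, would have to be applied to the text before matching.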
5,738 stars. Actively maintained with 2 commits in the last 30 days.
Use this if you need a high-performance solution to automatically identify and manage sensitive, illegal, or offensive words in a large volume of text, supporting custom replacement rules and word categories.
Not ideal if your primary need is general text analysis or natural language processing tasks beyond simply identifying and filtering predefined sensitive terms.
Stars: 5,738
Forks: 781
Language: Java
License: Apache-2.0
Category:
Last pushed: Dec 24, 2025
Commits (30d): 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/houbb/sensitive-word"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
NateScarlet/holiday-cn
📅🇨🇳 Chinese statutory public holiday data, automatically scraped daily from State Council announcements
sagorbrur/bnlp
BNLP is a natural language processing toolkit for Bengali Language.
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
esbatmop/MNBVC
MNBVC(Massive Never-ending BT Vast Chinese...
thunlp/THUOCL
THUOCL (THU Open Chinese Lexicon), a Chinese word lexicon