thoughtbot/top_secret

Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.


Emerging

This helps operations engineers, data privacy officers, and anyone managing customer interactions automatically remove sensitive personal data from free text before it is sent to external tools such as chatbots or AI models. You provide raw text that might contain credit card numbers, email addresses, phone numbers, or names, and it returns a version of that text with the sensitive details replaced by placeholders. This supports compliance and protects user privacy when interacting with third-party services.
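The placeholder-replacement idea described above can be sketched in plain Ruby. This is a hypothetical illustration, not the top_secret API: the `redact` method, the `PATTERNS` table, and the regular expressions are assumptions made for this example. It also covers only pattern-shaped data (emails, card numbers, phone numbers); detecting names needs a named-entity recognition model, which a regex sketch cannot replace.

```ruby
# Hypothetical sketch only: `redact` and PATTERNS are illustrative names,
# not part of the top_secret gem.
PATTERNS = {
  "EMAIL"       => /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/,
  # 13 to 16 digits, optionally separated by single spaces or hyphens.
  "CREDIT_CARD" => /\b\d(?:[ -]?\d){12,15}\b/,
  # Simple NNN-NNN-NNNN or NNN.NNN.NNNN shape; real phone formats vary widely.
  "PHONE"       => /\b\d{3}[-.]\d{3}[-.]\d{4}\b/
}.freeze

# Returns [redacted_text, mapping], where mapping lets the caller restore
# the original values after the external service responds.
def redact(text)
  mapping = {}
  output = text.dup

  PATTERNS.each do |label, pattern|
    counter = 0
    seen = {} # reuse one placeholder for repeated occurrences of a value

    output = output.gsub(pattern) do |match|
      seen[match] ||= begin
        counter += 1
        placeholder = "[#{label}_#{counter}]"
        mapping[placeholder] = match
        placeholder
      end
    end
  end

  [output, mapping]
end

redacted, mapping = redact(
  "Email ralph@example.com or call 555-123-4567. Card: 4242 4242 4242 4242."
)
puts redacted
# prints "Email [EMAIL_1] or call [PHONE_1]. Card: [CREDIT_CARD_1]."
```

Returning the mapping alongside the redacted text is a design choice for this sketch: it keeps the sensitive values inside your system while still allowing placeholders in a response to be swapped back afterwards.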

327 stars.

Use this if you need to automatically sanitize free-form text inputs to protect personally identifiable information (PII) before it leaves your system.

Not ideal if you need to redact structured data, or if you process massive real-time data streams where model loading time is a critical concern.

data-privacy text-sanitization compliance personally-identifiable-information natural-language-processing

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 8 / 25


Stars: 327

Forks:

Language: Ruby

License: MIT

Higher-rated alternatives

DataFog/datafog-python

Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines...

vmenger/deduce

Deduce: de-identification method for Dutch medical text

aphp/eds-pseudo

EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports

seanpedrick-case/doc_redaction

Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface....

martincjespersen/DaAnonymization

Simple customizable pipeline tool for anonymizing Danish text.
