seanpedrick-case/doc_redaction
Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface. Demo: https://huggingface.co/spaces/seanpedrickcase/document_redaction or try it with VLMs: https://huggingface.co/spaces/seanpedrickcase/document_redaction_vlm
This tool helps legal, HR, or administrative professionals safely share sensitive documents by redacting personally identifiable information (PII). You input PDFs, Word documents, or Excel/CSV files, and it outputs a version with names, addresses, and other private data removed. It's designed for anyone who needs to anonymize documents before wider distribution or archival.
Use this if you need a graphical interface to quickly and accurately remove sensitive information from various document types for privacy or compliance.
Not ideal if you require 100% automated, error-free redaction without any human review, as all outputs still need a final check.
Stars: 42
Forks: 8
Language: Python
License: —
Category
Last pushed: Mar 18, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/seanpedrick-case/doc_redaction"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
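The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which the listing does not confirm, and the helper names `build_url` and `fetch_quality` are hypothetical, introduced here for illustration.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner/repo pair (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository without an API key.

    Assumption: the response body is JSON. The response schema is not
    documented here, so the parsed value is returned without interpretation.
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# print(json.dumps(fetch_quality("seanpedrick-case", "doc_redaction"), indent=2))
```

Keyless access is limited to 100 requests/day, so callers that poll this endpoint may want to cache the result locally.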
Higher-rated alternatives
DataFog/datafog-python
Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines...
vmenger/deduce
Deduce: de-identification method for Dutch medical text
aphp/eds-pseudo
EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports
martincjespersen/DaAnonymization
Simple customizable pipeline tool for anonymizing Danish text.
thoughtbot/top_secret
Filter sensitive information from free text before sending it to external services or APIs, such...