martincjespersen/DaAnonymization
Simple customizable pipeline tool for anonymizing Danish text.
This tool helps you quickly remove sensitive personal information from Danish text documents. You provide a collection of texts, and it automatically identifies and replaces names, locations, organizations, CPR numbers, phone numbers, and email addresses with placeholders. It's designed for anyone working with Danish textual data who needs to comply with privacy regulations or protect personal data.
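To make the placeholder idea concrete, here is a minimal sketch of pattern-based masking for the three rule-friendly categories (CPR numbers, phone numbers, email addresses). It is not DaAnonymization's code or API. The names `PATTERNS` and `mask_text`, the regular expressions, and the `[LABEL]` placeholder format are all assumptions made for illustration. Names, locations, and organizations cannot be caught this way and would typically need a trained entity-recognition model.

```python
import re

# Hypothetical patterns for illustration only; not taken from DaAnonymization.
# Order matters: CPR numbers are masked before the looser phone pattern runs.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    # CPR: six digits (date), optional hyphen, four digits.
    ("CPR", re.compile(r"\b\d{6}-?\d{4}\b")),
    # Danish phone: optional +45 prefix, then eight digits in optional pairs.
    ("PHONE", re.compile(r"(?<!\d)(?:\+45 ?)?\d{2} ?\d{2} ?\d{2} ?\d{2}(?!\d)")),
]


def mask_text(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Kontakt Jens på jens@example.dk eller 12 34 56 78. CPR: 010190-1234."
print(mask_text(sample))
```

Note that the name "Jens" is left untouched in this sketch, which is exactly the gap that a model-based step is meant to cover.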
No commits in the last 6 months. Available on PyPI.
Use this if you need to anonymize Danish texts for privacy, data sharing, or research purposes, ensuring personal details are masked.
Not ideal if you need a guarantee that every piece of sensitive information has been removed: the tool relies on predictive models, which have limitations.
Stars: 11
Forks: 4
Language: Python
License: Apache-2.0
Last pushed: Sep 19, 2024
Commits (30d): 0
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/martincjespersen/DaAnonymization"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
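The same request can be sketched in Python using only the standard library. The endpoint path is copied from the curl command above. The function names `build_quality_url` and `fetch_quality`, the `category` parameter, and the timeout are assumptions for illustration. The response format is not documented here, so the sketch returns the raw body as text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "nlp") -> str:
    """Build the quality endpoint URL for a repository (hypothetical helper)."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "nlp") -> str:
    """Fetch the raw response body; the format is not assumed here."""
    with urlopen(build_quality_url(owner, repo, category), timeout=10) as resp:
        return resp.read().decode("utf-8")


print(build_quality_url("martincjespersen", "DaAnonymization"))
```

Calling `fetch_quality("martincjespersen", "DaAnonymization")` performs the actual network request, which counts against the daily limit.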
Higher-rated alternatives
DataFog/datafog-python
Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines...
vmenger/deduce
Deduce: de-identification method for Dutch medical text
aphp/eds-pseudo
EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports
seanpedrick-case/doc_redaction
Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface....
thoughtbot/top_secret
Filter sensitive information from free text before sending it to external services or APIs, such...