gandersen101/spaczz

Fuzzy matching and more functionality for spaCy.

Score: 51 / 100 (Established)

This tool helps developers working with natural language processing to identify specific words or phrases in text, even if there are slight misspellings or variations. It takes raw text as input and uses predefined patterns to find and extract matching phrases, along with a score indicating how closely they match. It's designed for Python developers who build applications that process and understand human language.
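To make the idea of scored fuzzy matching concrete, here is a minimal sketch that uses only Python's standard-library `difflib`. It is not spaczz's API or its algorithm: the function name `fuzzy_find`, the `min_ratio` threshold, and the whitespace tokenization are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_find(text, pattern, min_ratio=0.75):
    """Slide a window the size of `pattern` over `text` and score each window.

    Hypothetical helper for illustration only (not part of spaczz).
    Returns a list of (phrase, start_token, end_token, score) tuples,
    where score is a 0-100 similarity rounded to the nearest integer.
    """
    tokens = text.split()  # naive whitespace tokenization (an assumption)
    width = len(pattern.split())
    matches = []
    for start in range(len(tokens) - width + 1):
        window = " ".join(tokens[start:start + width])
        # Case-insensitive character-level similarity in [0.0, 1.0]
        ratio = SequenceMatcher(None, window.lower(), pattern.lower()).ratio()
        if ratio >= min_ratio:
            matches.append((window, start, start + width, round(ratio * 100)))
    return matches


text = "Grint Anderson created spaczz in Nashv1le"
for pattern in ("Grant Andersen", "Nashville"):
    for phrase, start, end, score in fuzzy_find(text, pattern):
        print(pattern, "->", phrase, (start, end), score)
```

Each misspelled phrase is reported with its token span and a similarity score, which mirrors the input and output described above. Scoring every window against every pattern also shows why fuzzy matching costs more than exact string lookup.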

258 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to reliably find and extract text patterns in documents where perfect spelling or exact phrasing cannot be guaranteed, such as user-generated content or scanned documents.

Not ideal if you require extremely high performance for very large datasets, as the fuzzy matching process can be computationally intensive compared to exact string matching.

natural-language-processing text-extraction information-retrieval data-cleaning text-mining
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 16 / 25


Stars: 258
Forks: 30
Language: Python
License: MIT
Last pushed: Jul 06, 2024
Commits (30d): 0
Dependencies: 7

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/gandersen101/spaczz"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
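The same request can be made from Python with the standard library. This is a sketch under assumptions: the path layout (category, then owner, then repository) is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical. How an API key would be supplied is not stated here, so the sketch uses the keyless tier only.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the segment meanings are an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("nlp", "gandersen101", "spaczz")` would issue the same GET request as the curl command. The response fields are not documented here, so inspect the returned object before relying on any particular key.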