gandersen101/spaczz

Fuzzy matching and more functionality for spaCy.

/ 100

Established

This tool helps developers working with natural language processing identify specific words or phrases in text, even when they contain slight misspellings or variations. It takes raw text as input, applies predefined patterns to find and extract matching phrases, and returns a score indicating how closely each one matches. It is designed for Python developers who build applications that process and understand human language.

258 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to reliably find and extract text patterns in documents where perfect spelling or exact phrasing cannot be guaranteed, such as user-generated content or scanned documents.

Not ideal if you require extremely high performance for very large datasets, as the fuzzy matching process can be computationally intensive compared to exact string matching.
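To make the description above concrete, here is a minimal sketch of fuzzy phrase matching with a similarity score. It is not spaczz's implementation or API: it uses only `difflib` from the Python standard library, and the names `fuzzy_find` and `min_score` are hypothetical, chosen for illustration. It also hints at the cost noted above, since every token window has to be scored against the pattern.

```python
from difflib import SequenceMatcher


def fuzzy_find(text, phrase, min_score=80):
    """Return (matched_text, start_token, end_token, score) tuples.

    Hypothetical helper: slides a window with the same number of
    whitespace-separated tokens as `phrase` across `text` and scores
    each window from 0 to 100 against the phrase.
    """
    tokens = text.split()
    width = len(phrase.split())
    target = phrase.lower()
    matches = []
    for start in range(len(tokens) - width + 1):
        window = " ".join(tokens[start:start + width])
        # ratio() is a similarity in [0, 1]; scale it to 0-100.
        ratio = SequenceMatcher(None, window.lower(), target).ratio()
        score = round(100 * ratio)
        if score >= min_score:
            matches.append((window, start, start + width, score))
    return matches


# The misspelled "Grint" should still match the pattern "Grant Andersen".
results = fuzzy_find("The author is Grint Andersen of Nashville", "Grant Andersen")
for matched, start, end, score in results:
    print(matched, start, end, score)
```

An exact string search would miss the misspelled name here, while the scored comparison still reports it, at the price of one similarity computation per window.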

natural-language-processing text-extraction information-retrieval data-cleaning text-mining

Stale 6m

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 16 / 25

Stars

258

Forks

Language

Python

License

MIT

Related tools

nltk/nltk

NLTK Source

explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

stanfordnlp/stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many...

flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
