TimSchopf/KeyphraseVectorizers
Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a document-keyphrase matrix.
This tool helps you analyze collections of text documents to identify important multi-word phrases. It takes your raw text documents as input and outputs a structured table showing which keyphrases appear in each document, along with their frequencies. Anyone needing to understand the core topics across many texts, such as researchers analyzing papers or marketers reviewing customer feedback, would find this useful.
267 stars. No commits in the last 6 months.
Use this if you need to automatically extract grammatically correct, multi-word keyphrases from documents and quantify their presence.
Not ideal if you're looking for single keywords or if your main goal is simply to count individual words.
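The idea described above (match part-of-speech patterns, then count the matched phrases per document) can be illustrated with a small standard-library sketch. It is not the library's implementation: the hand-written `TOY_POS` lexicon, the two function names, and the sample documents are all hypothetical stand-ins, and a real part-of-speech tagger would replace the lexicon. The pattern used here is "zero or more adjectives followed by one or more nouns".

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_POS = {
    "deep": "ADJ", "neural": "ADJ", "large": "ADJ",
    "networks": "NOUN", "language": "NOUN", "models": "NOUN", "data": "NOUN",
}
PUNCTUATION = ".,;:!?"


def extract_keyphrases(text):
    """Return phrases matching: zero or more adjectives, then one or more nouns."""
    phrases, adjectives, nouns = [], [], []

    def flush():
        # A candidate only counts as a keyphrase if it contains a noun.
        if nouns:
            phrases.append(" ".join(adjectives + nouns))
        adjectives.clear()
        nouns.clear()

    for token in text.lower().split():
        word = token.strip(PUNCTUATION)
        tag = TOY_POS.get(word)
        if tag == "ADJ":
            if nouns:  # an adjective after nouns starts a new phrase
                flush()
            adjectives.append(word)
        elif tag == "NOUN":
            nouns.append(word)
        else:
            flush()  # any other word ends the current candidate
        if token and token[-1] in PUNCTUATION:
            flush()  # trailing punctuation also ends the candidate
    flush()
    return phrases


def document_keyphrase_matrix(docs):
    """Return (sorted keyphrase vocabulary, per-document count rows)."""
    counts = [Counter(extract_keyphrases(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[phrase] for phrase in vocabulary] for doc_counts in counts]
    return vocabulary, matrix


docs = [
    "Deep neural networks learn from large data.",
    "Large language models are neural networks.",
]
vocabulary, matrix = document_keyphrase_matrix(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of the matrix corresponds to one document and each column to one keyphrase, which is the "structured table" the description refers to. Single adjectives with no following noun are dropped, which is why the sketch yields multi-word phrases rather than isolated modifiers.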
Stars: 267
Forks: 38
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Nov 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/TimSchopf/KeyphraseVectorizers"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
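The same request can be sketched in Python with only the standard library. Two things here are assumptions rather than documented behavior: that the path generalizes to a category/owner/repository pattern (inferred from the single URL above), and that the endpoint returns a JSON body. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path pattern is inferred from one example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and decode it, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("nlp", "TimSchopf", "KeyphraseVectorizers"))
```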
Higher-rated alternatives
chakki-works/seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Hironsan/anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.
jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
hamelsmu/ktext
Utilities for preprocessing text for deep learning with Keras
asahi417/tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An...