michaelmml/NLP-Information-Extraction

Automated PDF and text processing with spaCy and NLTK; information extraction from text based on grammatical structure; deployed on extracted raw search data

/ 100

Experimental

This tool helps researchers, analysts, and business intelligence professionals automatically process large volumes of text, such as company transcripts, patent documents, or news articles. It takes raw text or PDFs as input and extracts key information: topics, keywords, named entities (for example, company names), and significant phrases. The output helps you grasp content quickly, identify trends, and summarize lengthy documents without manual review.
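
To make the kind of extraction described above concrete, here is a minimal sketch using only the Python standard library. It is not the repository's code: the project itself is described as using spaCy and NLTK, whereas this stand-in uses plain word counts for keywords and a regular expression for runs of capitalized words as a crude proxy for named entities. The function names `top_keywords` and `candidate_entities` and the small `STOPWORDS` set are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real pipeline would
# more likely use the stopword lists shipped with NLTK or spaCy).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "in", "on", "to", "for", "with",
    "is", "are", "was", "were", "its", "by", "that", "this", "it", "as",
    "at", "from",
}


def top_keywords(text, n=5):
    """Return the n most frequent non-stopword tokens, lowercased."""
    # Tokens are runs of two or more letters (apostrophes and hyphens allowed).
    tokens = re.findall(r"[A-Za-z][A-Za-z'-]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]


def candidate_entities(text):
    """Return runs of two or more capitalized words, sorted and deduplicated.

    This is only a rough heuristic; it does not perform real
    named entity recognition.
    """
    pattern = r"\b(?:[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)\b"
    return sorted(set(re.findall(pattern, text)))


if __name__ == "__main__":
    sample = (
        "Acme Robotics filed a patent for battery cooling. "
        "The patent describes battery packs. "
        "Acme Robotics expects battery revenue to grow."
    )
    print(top_keywords(sample, 4))
    print(candidate_entities(sample))
```

A statistical or grammar-aware pipeline would replace both heuristics, but the input and output shape (raw text in, keywords and entity candidates out) is the same.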

No commits in the last 6 months.

Use this if you need to quickly extract structured insights and key information from large unstructured text datasets like financial reports, legal documents, or industry news.

Not ideal if you need to perform sentiment analysis, question-answering, or generate new text rather than extract existing information.

market-research patent-analysis business-intelligence financial-analysis competitive-intelligence

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 5 / 25

Language

Jupyter Notebook

License

MIT

Higher-rated alternatives

ziqizhang/jate

JATE - Just Automatic Term Extraction (in Python)

mcs07/ChemDataExtractor

Automatically extract chemical information from scientific documents

brucewlee/lftk

[BEA @ ACL 2023] General-purpose tool for linguistic features extraction; Tested on readability...

mmmaurer/elfen

A Python package to efficiently extract linguistic features for text/NLP datasets

strangetom/ingredient-parser

A tool to parse recipe ingredients into structured data
