google/langextract

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

69 / 100 (Established)

This tool helps researchers, analysts, and other non-technical professionals quickly pull specific, structured facts from large amounts of unstructured text, such as clinical notes, reports, or literary works. You provide raw text and define the information you are looking for (e.g., characters, medications, relationships), and it returns an organized list of the extracted details, each with its exact location in the original document, along with an interactive visualization. It suits anyone who needs to systematically find and verify specific data points across many documents without reviewing them by hand.
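To make "exact location in the original document" concrete, here is a minimal sketch of source grounding using only the Python standard library. It is not langextract's API or implementation: the `GroundedExtraction` class, the `ground` function, and the sample note are all hypothetical, and the candidate strings stand in for what an LLM would return. The idea is simply that each extracted string is kept only with the character span where it appears in the source, so it can be checked against its context.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GroundedExtraction:
    """Hypothetical record: an extracted value plus where it sits in the source."""
    label: str            # category of the extraction, e.g. "medication"
    text: str             # the extracted string itself
    start: Optional[int]  # character offset where the match begins, or None
    end: Optional[int]    # character offset just past the match, or None


def ground(source: str, candidates: List[Tuple[str, str]]) -> List[GroundedExtraction]:
    """Attach character offsets to (label, text) candidates by exact string search.

    Candidates that do not occur verbatim in the source get start=end=None,
    which flags them as unverifiable rather than silently accepting them.
    """
    results = []
    for label, text in candidates:
        pos = source.find(text)  # first exact occurrence, or -1 if absent
        if pos == -1:
            results.append(GroundedExtraction(label, text, None, None))
        else:
            results.append(GroundedExtraction(label, text, pos, pos + len(text)))
    return results


# Made-up clinical note and made-up candidates (standing in for model output).
note = "Patient was given 250 mg of amoxicillin twice daily."
candidates = [
    ("dosage", "250 mg"),
    ("medication", "amoxicillin"),
    ("medication", "ibuprofen"),  # not in the note, so it cannot be grounded
]
grounded = ground(note, candidates)

for item in grounded:
    print(item)
```

Slicing the source with a stored span, as in `note[item.start:item.end]`, returns the extracted text again, which is what makes each extraction traceable. A real tool would also need to handle repeated matches and inexact wording, which this sketch ignores.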

34,668 stars. Actively maintained with 11 commits in the last 30 days. Available on PyPI.

Use this if you need to extract specific types of information from large volumes of text documents and want to ensure the extracted data is directly traceable back to its source.

Not ideal if your task requires summarizing or generating new text rather than strictly extracting existing facts, or if you don't need to verify extractions against their original context.

information-extraction clinical-data-analysis document-processing research-analysis qualitative-data
Maintenance 17 / 25
Adoption 10 / 25
Maturity 24 / 25
Community 18 / 25


Stars: 34,668
Forks: 2,330
Language: Python
License: Apache-2.0
Last pushed: Feb 25, 2026
Commits (30d): 11
Dependencies: 17

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/google/langextract"
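The same request can be made from Python with the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path segments after `/quality/` follow a category/owner/repository pattern (inferred from the single URL above). The function names `quality_url` and `fetch_quality` are illustrative.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is an inferred pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a /<category>/<owner>/<repo> layout."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and parse it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "google", "langextract")
# print(json.dumps(data, indent=2))
```

Because the response schema is not documented here, the sketch returns the parsed payload as-is rather than picking out named fields.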

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.