HazyResearch/fonduer

A knowledge base construction engine for richly formatted data

Score: 57 / 100 (Established)

This tool helps you automatically extract specific pieces of information and relationships from complex documents like hardware datasheets or scientific papers. You feed it your richly formatted documents, and it outputs a structured knowledge base containing the facts and connections you're looking for. It's ideal for researchers, engineers, or data managers who need to systematically organize data from diverse, non-standard document types.

412 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to build a structured database of facts and relationships from a large collection of richly formatted documents that combine tables, lists, and free text.

Not ideal if your data is already highly structured or if you only need to extract information from plain text without complex formatting.

data-extraction technical-document-analysis knowledge-management information-retrieval research-data-organization
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 22 / 25
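
The four category scores listed above appear to add up to the overall score. A minimal sketch of that arithmetic, assuming the total is a plain unweighted sum (the page does not state the formula, and the variable names are illustrative):

```python
# Category scores as listed on this page; each is out of 25.
subscores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 25,
    "Community": 22,
}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")  # 57 / 100
```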

Stars: 412
Forks: 77
Language: Python
License: MIT
Last pushed: Jun 23, 2021
Commits (30d): 0
Dependencies: 17

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/HazyResearch/fonduer"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
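
The same request can be made from Python. The sketch below uses only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `ml-frameworks` path segment is copied from the curl command above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (assumed, not confirmed, to be JSON)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "HazyResearch", "fonduer")` performs a real network request and counts against the daily request limit, so cache the result if you poll more than occasionally.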