velocitybolt/open-extract
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
This tool helps AI agent developers extract specific pieces of information from large documents or websites. You tell it what data you need using a predefined structure (a schema), and it returns that data in an easy-to-use JSON or Markdown format. This is ideal for those building AI-powered applications that need to understand and process unstructured text.
185 stars.
Use this if you are building AI agents or automated workflows and need to consistently pull structured data like financial metrics, customer feedback categories, or legal terms from various documents and web pages without writing complex parsing logic.
Not ideal if you are looking for a simple document search tool or a general-purpose AI chatbot; its primary function is structured data extraction for agentic AI workflows.
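To make the schema-driven idea above concrete, here is a minimal sketch in plain Python. It is not open-extract's API: the `FinancialMetrics` schema and the `coerce`, `to_json`, and `to_markdown` helpers are hypothetical names invented for illustration. The sketch assumes some upstream step has already produced a raw dictionary of values, and it only shows the last stage: fitting those values to a predefined structure and rendering them as JSON or Markdown.

```python
import json
from dataclasses import dataclass, asdict, fields


@dataclass
class FinancialMetrics:
    # Hypothetical schema; the field names are illustrative only.
    company: str
    revenue_usd: float
    fiscal_year: int


def coerce(schema_cls, raw: dict):
    """Build a schema instance from a raw dict, casting each value to its declared type."""
    values = {}
    for f in fields(schema_cls):
        if f.name not in raw:
            raise KeyError(f"missing field: {f.name}")
        # f.type is the annotated class (str, float, int) for this schema.
        values[f.name] = f.type(raw[f.name])
    return schema_cls(**values)


def to_json(record) -> str:
    """Render a schema instance as a JSON object."""
    return json.dumps(asdict(record), indent=2)


def to_markdown(record) -> str:
    """Render a schema instance as a Markdown bullet list, one field per line."""
    return "\n".join(f"- **{k}**: {v}" for k, v in asdict(record).items())


if __name__ == "__main__":
    # Raw values as they might arrive from an extraction step (all strings).
    raw = {"company": "Acme", "revenue_usd": "1200000.5", "fiscal_year": "2025"}
    record = coerce(FinancialMetrics, raw)
    print(to_json(record))
    print(to_markdown(record))
```

Declaring the schema up front means a missing or malformed field fails loudly at the boundary instead of surfacing later in the agent's workflow.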
Stars: 185
Forks: 21
Language: Python
License: MIT
Category:
Last pushed: Jan 05, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/velocitybolt/open-extract"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
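The same request can be made from Python using only the standard library. This is a minimal sketch: the endpoint path comes from the curl command above, while the helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes (the page does not say) that the endpoint returns a JSON body and that keyless requests need no extra headers.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given GitHub owner and repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for one repository.

    Assumes the response body is JSON; the response schema is not
    documented on this page, so the parsed result is returned as-is.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    data = fetch_quality("velocitybolt", "open-extract")
    print(json.dumps(data, indent=2))
```

With a 100 requests/day keyless limit, it is worth caching responses locally rather than calling the endpoint on every run.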
Higher-rated alternatives
kreuzberg-dev/kreuzberg
A polyglot document intelligence framework with a Rust core. Extract text, metadata, and...
PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR...
yfedoseev/pdf_oxide
The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown...
opendataloader-project/opendataloader-pdf
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
AKSarav/pdfstract
PDFStract - The Extraction and Chunking Layer in Your RAG Pipeline - Available as CLI - WEBUI - API