katanaml/sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM
Sparrow helps businesses and individuals convert documents such as invoices, receipts, bank statements, and forms into structured data. You supply an image or PDF, and it returns the extracted information as clean, queryable JSON. It suits anyone who regularly processes paper or digital documents and needs to pull out specific pieces of information quickly.
5,129 stars. Actively maintained with 15 commits in the last 30 days.
Use this if you need to automate the extraction of specific data points from a high volume of diverse documents, such as financial statements or forms, into a structured digital format.
Not ideal if you only occasionally process a few simple documents by hand, or if your primary need is general text summarization rather than precise data extraction.
Stars: 5,129
Forks: 511
Language: Python
License: GPL-3.0
Last pushed: Mar 12, 2026
Commits (30d): 15
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/katanaml/sparrow"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
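The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. The base URL and the `rag` path segment are copied from the curl example above. The helper names `build_url` and `fetch_quality` are hypothetical, and the response is assumed to be a JSON object whose schema is not described on this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl example; the path layout is inferred from it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> dict:
    """Send a GET request and decode the body as JSON.

    Assumption: the endpoint returns a JSON object. Its fields are not
    documented here, so the decoded result is returned unchanged.
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("katanaml", "sparrow")` issues the same request as the curl command and counts against the daily request limit, so a script that polls many repositories may want to cache results.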
Related tools
WangRongsheng/awesome-LLM-resources
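🧑‍🚀 The world's best summary of LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models) | Summary of the...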
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.
LazyAGI/LazyLLM
Easiest and laziest way for building multi-agent LLMs applications.
luhengshiwo/LLMForEverybody
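LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring and autumn campus recruiting seasons, so you can talk confidently with interviewers.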
PacktPublishing/LLM-Engineers-Handbook
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS...