oidlabs-com/Lexoid
Multimodal document parser for high-quality data understanding and extraction
This tool uses AI models to extract clean, structured text from documents such as PDFs and web pages, including ones with complex layouts. You supply a document and get back text ready for analysis or further processing. It suits anyone who regularly pulls information from large volumes of documents, such as researchers, legal professionals, or data analysts.
Use this if you need to reliably convert diverse documents (PDFs, web pages) into well-structured text, especially when dealing with complex layouts or multimodal content.
Not ideal if you only need basic text extraction from simple, text-only files or if you require fine-grained control over layout preservation for visual reproduction.
Stars: 96
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/oidlabs-com/Lexoid"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
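The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the endpoint returns JSON, which the page does not state. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating the `nlp` path segment as a parameter is an assumption based only on the URL in the curl example.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, domain="nlp"):
    """Build the endpoint URL for one repository.

    `domain` defaults to the "nlp" segment seen in the curl example;
    whether other values are accepted is an assumption.
    """
    # Percent-encode each path segment so unusual names cannot break the URL.
    parts = [quote(part, safe="") for part in (domain, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, domain="nlp", timeout=10.0):
    """Fetch the record for one repository and parse it as JSON.

    Assumes a JSON response body; the response schema is not documented here,
    so the parsed object is returned without further interpretation.
    """
    request = Request(
        build_quality_url(owner, repo, domain),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("oidlabs-com", "Lexoid")
```

Keeping URL construction separate from the network call makes the former easy to check offline, and unauthenticated callers should stay within the 100 requests/day limit noted above.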
Higher-rated alternatives
google/langextract
A Python library for extracting structured information from unstructured text using LLMs with...
Extralit/extralit
Fast and accurate systemic data extraction with LLM assistance
Keyvanhardani/german-ocr
German-OCR is specifically trained to extract text from German documents including invoices,...
xingbow/SciDaEx
Structured data extraction from research literature
parsee-ai/parsee-core
Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized on PDFs and...