oidlabs-com/Lexoid

Multimodal document parser for high quality data understanding and extraction

Emerging

This tool extracts clean, structured text from documents such as PDFs and web pages, including ones with complex layouts, using AI-based parsing. You input a document and get back text ready for analysis or further processing. It suits anyone who regularly pulls information from large volumes of documents, such as researchers, legal professionals, and data analysts.

Use this if you need to reliably convert diverse documents (PDFs, web pages) into well-structured text, especially when dealing with complex layouts or multimodal content.

Not ideal if you only need basic text extraction from simple, text-only files or if you require fine-grained control over layout preservation for visual reproduction.

document-processing data-extraction content-analysis information-retrieval research-automation

No Package, No Dependents

Maintenance 10 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 13 / 25

Language: Python

License: Apache-2.0

Higher-rated alternatives

google/langextract

A Python library for extracting structured information from unstructured text using LLMs with...

Extralit/extralit

Fast and accurate systemic data extraction with LLM assistance

Keyvanhardani/german-ocr

German-OCR is specifically trained to extract text from German documents including invoices,...

xingbow/SciDaEx

Structured data extraction from research literature

parsee-ai/parsee-core

Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized on PDFs and...
