lakinduboteju/langchain-pymupdf4llm
An integration package connecting PyMuPDF4LLM to LangChain
This project helps you turn complex PDF documents into structured Markdown text, making it easy to use their content with AI tools. You input a PDF file, and it outputs a clean Markdown version, complete with formatted text, tables, and even descriptions of images. This is ideal for anyone who needs to extract information from PDFs for use in large language models or AI-powered content generation.
Available on PyPI.
Use this if you need to reliably convert the content of PDF documents, including complex layouts and images, into a clean Markdown format for AI processing or knowledge bases.
Not ideal if your primary goal is simply viewing PDFs or if you do not plan to integrate the extracted text with AI models or advanced text processing.
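As a rough illustration of the PDF-to-Markdown flow described above, here is a minimal sketch. It assumes the package is installed from PyPI as `langchain-pymupdf4llm` and exposes a `PyMuPDF4LLMLoader` class with the usual LangChain `load()` method. The helper names `pdf_to_markdown` and `join_pages` and the file name `report.pdf` are hypothetical and used only for illustration.

```python
def join_pages(docs, separator="\n\n"):
    """Concatenate the Markdown text of LangChain-style documents.

    Each item only needs a `page_content` attribute.
    """
    return separator.join(doc.page_content for doc in docs)


def pdf_to_markdown(pdf_path):
    """Load a PDF and return its content as a single Markdown string.

    Assumption: `pip install langchain-pymupdf4llm` has been run and the
    package exposes `PyMuPDF4LLMLoader`. The import is kept inside the
    function so this sketch can be imported without the package present.
    """
    from langchain_pymupdf4llm import PyMuPDF4LLMLoader

    loader = PyMuPDF4LLMLoader(pdf_path)  # default loader settings
    docs = loader.load()                  # list of LangChain Document objects
    return join_pages(docs)


if __name__ == "__main__":
    # "report.pdf" is a placeholder path, not a file shipped with the project.
    print(pdf_to_markdown("report.pdf")[:500])
```

Joining the documents into one string is only one option; keeping them separate preserves the per-document metadata, which may be more useful when feeding a retrieval pipeline.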
Stars: 16
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Nov 23, 2025
Commits (30d): 0
Dependencies: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/lakinduboteju/langchain-pymupdf4llm"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
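The same request can be sketched in Python using only the standard library. This assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred from the single URL shown here and not from any published API reference. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the trailing segments are assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the listing data for a repository (no API key).

    Assumption: the response body is JSON. Adjust the parsing if the
    service returns something else.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("llm-tools", "lakinduboteju", "langchain-pymupdf4llm")
    print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, caching the response locally is a sensible default for anything that runs on a schedule.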
Higher-rated alternatives
eellak/glossAPI
Greek Dataset Production from PDF+
pymupdf/langchain-pymupdf4llm
An integration package connecting PyMuPDF4LLM to LangChain
KalyanM45/DocGenius-Revolutionizing-PDFs-with-AI
This is a Python application that allows you to load a PDF and ask questions about it using...
mozilla-ai/structured-qa
Blueprint by Mozilla.ai for answering questions about structured documents
alejandro-ao/langchain-ask-pdf
An AI-app that allows you to upload a PDF and ask questions about it. It uses OpenAI's LLMs to...