KylinMountain/markify
Convert files into Markdown so that RAG pipelines and LLMs can understand them. Built on markitdown and MinerU, the latter providing high-quality PDF parsing.
Markify helps you convert various files like PDFs, Word documents, images, and even web pages into clean Markdown format. This makes it much easier for large language models (LLMs) and retrieval-augmented generation (RAG) systems to understand and process your content. It takes your existing documents and outputs a structured Markdown version, ideal for data engineers or AI trainers preparing datasets for advanced AI applications.
133 stars. No commits in the last 6 months.
Use this if you need to standardize diverse document types into Markdown to improve the performance of your AI models or knowledge bases.
Not ideal if you primarily need to convert documents for simple viewing or editing in a traditional word processor, as the output is specifically structured for machine readability.
Stars: 133
Forks: 16
Language: Python
License: not listed
Category: not listed
Last pushed: Mar 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/KylinMountain/markify"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
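The same request can be made from Python, the language of the listed repository. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and two things are assumptions not confirmed by this page: that the `rag` path segment is a category slot, and that the endpoint returns a JSON body.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "rag") -> str:
    """Build a URL in the same shape as the curl example.

    Assumption: the segment after /quality/ ("rag" in the example) is a
    category name, followed by the repository owner and name.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "rag", timeout: float = 10.0):
    """Fetch and decode the response for one repository.

    Assumption: the endpoint answers with a JSON body; this page does not
    document the response format.
    """
    url = build_quality_url(owner, repo, category)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("KylinMountain", "markify")
```

Keeping URL construction separate from the network call makes it possible to check the URL without spending any of the 100 keyless requests per day.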
Higher-rated alternatives
any4ai/AnyCrawl
AnyCrawl: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...
kreuzberg-dev/html-to-markdown
High performance and CommonMark compliant HTML to Markdown converter. Maintained by the...
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping,...
paulpierre/markdown-crawler
A multithreaded web crawler that recursively crawls a website and creates a markdown file...