wisupai/e2m
E2M converts a range of file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It is easy to install, ships dedicated parsers and converters, and supports custom configuration. E2M aims to be an all-in-one, flexible, open-source solution.
This tool helps data scientists, AI engineers, and content managers prepare diverse content for AI models. It takes documents such as PDFs, Word files, web pages, and audio recordings, extracts their content, and converts it into structured Markdown. The resulting Markdown is intended as clean input for training or fine-tuning AI models and for tasks like retrieval-augmented generation (RAG).
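As a rough illustration of the first step in such a pipeline, the sketch below sorts an input path or URL into a broad family before any conversion happens. This is not E2M's API: the function name `classify_source` and the grouping into "document", "web", and "audio" are assumptions made for this example. Only the extensions themselves come from the supported-format list above.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions come from the supported-format list above. The grouping into
# families is an assumption for illustration, not E2M's internal layout.
FAMILIES = {
    "document": {"doc", "docx", "epub", "pdf", "ppt", "pptx"},
    "web": {"html", "htm"},
    "audio": {"mp3", "m4a"},
}


def classify_source(source: str) -> Optional[str]:
    """Return the input family for a path or URL, or None if unsupported."""
    # Anything with an http(s) scheme is treated as a web page.
    if urlparse(source).scheme in ("http", "https"):
        return "web"
    # Otherwise decide by file extension, ignoring case.
    suffix = PurePosixPath(source).suffix.lower().lstrip(".")
    for family, extensions in FAMILIES.items():
        if suffix in extensions:
            return family
    return None
```

A caller could use the returned family to pick a parser, and treat `None` as a signal to skip or report the file.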
1,276 stars. No commits in the last 6 months.
Use this if you need to standardize and prepare a wide array of unstructured data, including documents, web content, and audio, into a clean Markdown format suitable for AI model training or RAG applications.
Not ideal if you only need to view or edit documents in their original format, or if your primary goal is simple, manual conversion for human readability without AI integration.
Stars: 1,276
Forks: 72
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Sep 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wisupai/e2m"
The API is open to everyone: 100 requests/day without a key, or 1,000/day with a free key.
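The same request can be made from Python with only the standard library. The sketch below makes two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL pattern holds for other `owner/repo` pairs. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.request import urlopen

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (pattern assumed to generalize)."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("wisupai", "e2m")` issues the same GET request as the curl command. Since the unauthenticated limit is 100 requests/day, it is worth caching results when querying many repositories.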
Higher-rated alternatives
microsoft/markitdown
Python tool for converting files and office documents to Markdown.
doocs/md
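✍ WeChat Markdown Editor | A highly minimal WeChat Markdown editor: supports Markdown syntax, custom theme styles, content management, multiple image hosts, an AI assistant, and other features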
AIDotNet/OpenDeepWiki
OpenDeepWiki is the open-source version of the DeepWiki project, aiming to provide a powerful...
hyperfield/ai-file-sorter
Cross-platform desktop application for content-aware file organization and renaming. Supports...
drl990114/MarkFlowy
The AI Markdown Editor