malvads/mojo

Non-sucking, cross-platform, extremely fast C++ crawler that converts entire websites into LLM-readable data

Overall score: 32 / 100 (Emerging)

Mojo helps AI practitioners and data engineers quickly gather vast amounts of web content to train their AI models or build knowledge bases. It takes a list of website URLs and automatically converts them into clean, structured Markdown files, ready for ingestion by large language models (LLMs) or retrieval-augmented generation (RAG) systems. This is ideal for anyone who needs high-quality, organized web data for AI applications.

Use this if you need to rapidly collect and clean large datasets from websites for AI model training or to power AI-driven knowledge bases.

Not ideal if you're looking for a general-purpose web scraper for personal use or for extracting highly specific data points from just a few pages.

Tags: AI-data-ingestion, LLM-data-preparation, web-content-gathering, knowledge-base-building, RAG-system-development
No Package, No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 11 / 25
Community 6 / 25


Stars: 12
Forks: 1
Language: C++
License: MIT
Last pushed: Feb 04, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/malvads/mojo"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
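The curl request above can also be issued from a script. The following is a minimal Python sketch, not an official client. It assumes the three trailing path segments mean category (`rag`), owner (`malvads`), and repository (`mojo`), and that the endpoint returns JSON; neither is documented on this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL following the pattern of the curl example.

    Assumption: the path is <category>/<owner>/<repo>. The page only shows
    one concrete URL, so this reading of the segments is a guess.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data without an API key (keyless tier).

    Assumption: the body is JSON. If it does not parse, the raw text is
    returned instead so the caller can inspect it.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality("rag", "malvads", "mojo")` would send the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching results rather than polling.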