kostadindev/knowledge-base-builder
Python package that constructs a structured markdown knowledge base from external sources such as PDFs, websites, and GitHub repos with LLM summarization. Ideal for RAG, search-friendly LLM contexts (/llms.txt), and chatbots.
This tool helps you quickly gather and organize information from many different places like websites, PDFs, GitHub repositories, and even YouTube videos. It takes all that varied input and creates a single, structured Markdown document or a specialized context file, making it easy to build domain-specific chatbots or prepare information for advanced search systems. Marketing analysts, researchers, or operations managers can use this to create comprehensive knowledge bases from their disparate data sources.
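The consolidation step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual API: `SourceDocument`, `truncate_summary`, and `build_markdown_kb` are hypothetical names, the text is assumed to be already extracted from each source, and a plain truncation function stands in for the LLM summarization the package performs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class SourceDocument:
    """Hypothetical container for one already-extracted source."""
    kind: str      # e.g. "pdf", "website", "github", "youtube"
    location: str  # file path or URL the text came from
    text: str      # raw extracted text


def truncate_summary(text: str, limit: int = 200) -> str:
    """Stand-in for an LLM summarizer: collapse whitespace, then truncate."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= limit:
        return collapsed
    return collapsed[:limit].rstrip() + "..."


def build_markdown_kb(
    documents: Iterable[SourceDocument],
    summarize: Callable[[str], str] = truncate_summary,
    title: str = "Knowledge Base",
) -> str:
    """Merge per-source summaries into one Markdown document, grouped by source kind."""
    grouped: Dict[str, List[SourceDocument]] = {}
    for doc in documents:
        grouped.setdefault(doc.kind, []).append(doc)

    lines = [f"# {title}", ""]
    for kind in sorted(grouped):  # stable, alphabetical section order
        lines += [f"## {kind}", ""]
        for doc in grouped[kind]:
            lines += [f"### {doc.location}", "", summarize(doc.text), ""]
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    docs = [
        SourceDocument("website", "https://example.com", "Example   landing page text."),
        SourceDocument("pdf", "notes.pdf", "Meeting notes about the product roadmap."),
    ]
    print(build_markdown_kb(docs))
```

Passing the summarizer as a callable keeps the merge logic independent of any particular LLM provider; a real summarization function could be swapped in without changing how the Markdown is assembled.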
No commits in the last 6 months. Available on PyPI.
Use this if you need to consolidate information from multiple, diverse sources into one organized, easy-to-use document for building chatbots, enhancing search, or generally making sense of large content collections.
Not ideal if you need to perform deep, analytical querying on structured datasets or if your primary goal is real-time data streaming and processing.
Stars: 8
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Jun 16, 2025
Commits (30d): 0
Dependencies: 16
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/kostadindev/knowledge-base-builder"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
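The same request can be made from Python with the standard library. This is a sketch under assumptions: the endpoint is assumed to return JSON (the response schema is not documented here), the URL pattern is assumed to generalize to other `owner/repo` pairs in the same way as the curl example, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (pattern assumed from the curl example)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; assumes the body is JSON, which is not confirmed here."""
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid calling this in a loop.
    data = fetch_quality("kostadindev", "knowledge-base-builder")
    print(json.dumps(data, indent=2))
```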
Higher-rated alternatives
ItzCrazyKns/Vane
Vane is an AI-powered answering engine.
ConardLi/easy-dataset
A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval
xuwei95/ezdata
A data processing and task scheduling system built with Python and large language models (LLMs)...
ModelEngine-Group/DataMate
DataMate is an enterprise-level data processing platform designed for model fine-tuning and RAG...
DS4SD/deepsearch-toolkit
Interact with the Deep Search platform for new knowledge explorations and discoveries