iamarunbrahma/rag-ingest

RAG-Ingest: A tool for converting PDFs to Markdown and indexing them for Retrieval-Augmented Generation (RAG).

27 / 100 (Experimental)

This tool helps researchers, analysts, and content managers transform complex PDF documents into structured Markdown. It extracts text, images, tables, and code blocks while preserving layout. The output is then organized and indexed so that specific information in your documents can be found with natural language queries.

No commits in the last 6 months.

Use this if you need to extract detailed content from many PDFs and make that information readily searchable and usable for AI-powered applications, like building a custom chatbot that answers questions from your reports.

Not ideal if you only need simple text extraction or if your documents are primarily scanned images without selectable text.

document-management research-analysis information-retrieval content-extraction data-preparation
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 6 / 25


Stars: 13
Forks: 1
Language: Python
License: MIT
Last pushed: Nov 22, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/iamarunbrahma/rag-ingest"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
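
The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint returns JSON and that the path follows the pattern `/quality/<category>/<owner>/<repo>` seen in the curl example. Neither assumption is confirmed by this page, and the function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL from its three path segments.

    The category/owner/repo layout is inferred from the single curl
    example; it is an assumption, not a documented contract.
    """
    segments = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(segments)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it, assuming a JSON body."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("rag", "iamarunbrahma", "rag-ingest")` requests the same URL as the curl command. Since unauthenticated use is limited to 100 requests/day, it may be worth caching the result rather than calling it repeatedly.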