catastropiyush/RAG-dataset-gen
Retrieval-augmented generation (RAG) for building datasets from scientific literature. Contains the notebooks used to create the datasets.
This project helps materials scientists and researchers extract specific material properties from large volumes of scientific text, such as research paper abstracts. You provide a collection of abstracts and a query describing the data you need, and the project outputs a structured dataset (for example, an Excel file) containing the extracted values, such as hydrogen storage capacity, temperature, and pressure for various alloys. It is aimed at scientists and engineers who need to compile structured data from unstructured text quickly.
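The abstracts-plus-query workflow described above can be illustrated with a minimal sketch. It is not the project's code: keyword overlap stands in for the retrieval step, regular expressions stand in for the language-model extraction step, and CSV stands in for the Excel output. The sample abstracts, field names, and function names are all hypothetical.

```python
import csv
import io
import re

# Hypothetical sample abstracts, written for this sketch (not from the project).
ABSTRACTS = [
    "The TiFe alloy absorbed 1.8 wt% hydrogen at 303 K under 3 MPa.",
    "We review synthesis routes for perovskite solar cells.",
    "LaNi5 reached a storage capacity of 1.4 wt% at 298 K and 1 MPa.",
]
QUERY = "hydrogen storage capacity wt% temperature pressure"

# Assumed output columns and the patterns used to fill them.
PATTERNS = {
    "capacity_wt_pct": r"(\d+(?:\.\d+)?)\s*wt%",
    "temperature_K": r"(\d+(?:\.\d+)?)\s*K\b",
    "pressure_MPa": r"(\d+(?:\.\d+)?)\s*MPa",
}


def tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, abstracts, top_k=2):
    """Rank abstracts by how many tokens they share with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(text)), text) for text in abstracts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop abstracts that share no tokens with the query at all.
    return [text for score, text in scored[:top_k] if score > 0]


def extract(text):
    """Pull the first match of each pattern out of one abstract."""
    row = {"source": text}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        row[field] = float(match.group(1)) if match else None
    return row


def build_rows(query, abstracts):
    """Retrieve relevant abstracts, then extract one row per abstract."""
    return [extract(text) for text in retrieve(query, abstracts)]


def to_csv(rows):
    """Serialize rows to CSV text with a fixed column order."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["source", *PATTERNS])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(build_rows(QUERY, ABSTRACTS)))
```

A real pipeline would likely replace `retrieve` with embedding-based search and `extract` with a model prompt, but the shape stays the same: filter the text, pull out typed fields, and write one row per source.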
No commits in the last 6 months.
Use this if you need to systematically extract specific, quantitative data points about materials or their properties from a large body of scientific text, turning unstructured information into a usable dataset for analysis.
Not ideal if you are looking for a general-purpose scientific text summarizer or a tool to analyze broad themes in literature rather than extracting precise, structured parameters.
Stars: 9
Forks: 5
Language: Jupyter Notebook
License: not listed
Category: not listed
Last pushed: Jun 30, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/catastropiyush/RAG-dataset-gen"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
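The same request can be made from Python using only the standard library. This is a sketch under assumptions: the `/quality/{category}/{owner}/{repo}` path pattern is inferred from the single URL shown above, the response format is not documented here (so the body is returned as raw text), and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="rag"):
    """Build the endpoint URL; the path layout is assumed from one example."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the raw response body as text (format not assumed)."""
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request and counts against the daily limit):
# print(fetch_quality("catastropiyush", "RAG-dataset-gen"))
```

With the unauthenticated limit at 100 requests/day, it is worth caching responses locally rather than fetching on every run.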
Higher-rated alternatives
Renumics/renumics-rag
Visualization for a Retrieval-Augmented Generation (RAG) Assistant 🤖❤️📚
VectorInstitute/retrieval-augmented-generation
Reference Implementations for the RAG bootcamp
naver/bergen
Benchmarking library for RAG
KalyanKS-NLP/rag-zero-to-hero-guide
Comprehensive guide to learn RAG from basics to advanced.
alan-turing-institute/t0-1
Application of Retrieval-Augmented Reasoning on a domain-specific body of knowledge