somewheresystems/dataclysm

Pull high-quality, efficient embeddings for PubMed, arXiv, and Wikipedia from Hugging Face and use them for local LLM inference and Retrieval-Augmented Generation (RAG).

Quality score: 29 / 100 (Experimental)

This tool helps researchers and knowledge workers search large scientific and general-knowledge corpora such as PubMed, arXiv, and Wikipedia. You provide a search query, and it returns relevant articles and summaries. It is designed for anyone who needs to quickly find and understand information in large academic or encyclopedic text collections.
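As a rough illustration of how query-to-article matching over embeddings generally works, here is a minimal sketch of cosine-similarity retrieval. It is not taken from the repository: the titles, the 3-dimensional vectors, and the helper names `cosine_similarity` and `top_k` are all hypothetical, and real embeddings would come from a model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k=2):
    """Return the k (title, score) pairs most similar to query_vec."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings, invented for this example only.
corpus = {
    "Protein folding survey": [0.9, 0.1, 0.0],
    "Graph neural networks": [0.1, 0.9, 0.2],
    "History of Rome": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for an embedded search query

results = top_k(query, corpus, k=2)
for title, score in results:
    print(f"{score:.3f}  {title}")
```

In a RAG setup, the top-ranked texts would then be passed to a local LLM as context for answering or summarizing.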

No commits in the last 6 months.

Use this if you need to efficiently search and summarize information across millions of academic papers or Wikipedia articles.

Not ideal if you are looking to analyze very short texts or data outside of research papers and general encyclopedic content.

scientific-research literature-review knowledge-discovery information-retrieval academic-search
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 5 / 25
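The four sub-scores listed here add up to the overall 29 / 100. The short check below assumes the overall score is a plain sum of the four categories, each out of 25; the page does not state the formula, so treat this as a consistency check and not as the official calculation.

```python
# Sub-scores as listed on this page; each category is out of 25.
SUBSCORES = {"Maintenance": 0, "Adoption": 8, "Maturity": 16, "Community": 5}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(SUBSCORES.values())
maximum = MAX_PER_CATEGORY * len(SUBSCORES)

print(f"{total} / {maximum}")  # prints "29 / 100"
```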

Stars: 47
Forks: 2
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Feb 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/somewheresystems/dataclysm"
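The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern as the URL above suggests. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (assumed pattern: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("rag", "somewheresystems", "dataclysm")
# print(json.dumps(data, indent=2))
```

Since the response fields are not documented here, inspect the returned object before relying on any particular key.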

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.