rag-from-scratch and SRAG

These projects are complementary: rag-from-scratch is an educational framework for learning RAG fundamentals, while SRAG is a production-oriented system that applies the same concepts at scale and adds specialized capabilities such as audio processing and advanced document parsing.

rag-from-scratch
Score: 56 (Established)
Maintenance: 13/25
Adoption: 10/25
Maturity: 13/25
Community: 20/25
Stars: 1,239
Forks: 135
Downloads:
Commits (30d): 3
Language: JavaScript
License: MIT
No Package, No Dependents

SRAG
Score: 40 (Emerging)
Maintenance: 10/25
Adoption: 4/25
Maturity: 13/25
Community: 13/25
Stars: 5
Forks: 2
Downloads:
Commits (30d): 0
Language: Scala
License: GPL-3.0
No Package, No Dependents

About rag-from-scratch

pguso/rag-from-scratch

Demystify RAG by building it from scratch. Local LLMs, no black boxes - real understanding of embeddings, vector search, retrieval, and context-augmented generation.

This project helps software developers understand and implement Retrieval-Augmented Generation (RAG) systems. It breaks the process into steps: turning unstructured text documents into numerical representations (embeddings), storing them efficiently, and using a query to retrieve the most relevant passages. Developers can use it to build applications that answer questions from a custom knowledge base with retrieved context, running on local language models rather than external APIs.
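The embed, store, retrieve, and augment steps described above can be sketched in a few lines. This is a minimal illustration, not code from pguso/rag-from-scratch: the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, and a bag-of-words count vector stands in for a real embedding model so the example runs with the standard library only.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, store: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k stored passages most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, passages: list[str]) -> str:
    """Context-augmented prompt that would be sent to a (local) language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base; the "vector store" here is just a list in memory.
documents = [
    "Embeddings map text to numerical vectors.",
    "A vector store keeps embeddings for similarity search.",
    "Bread is baked in an oven.",
]
store = [(doc, embed(doc)) for doc in documents]

question = "How are embeddings stored for search?"
top = retrieve(question, store, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

In a real setup the `embed` function would call an embedding model and the list would be replaced by a vector index, but the control flow stays the same: embed the query, rank stored vectors by similarity, and place the top passages in the prompt.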

software-development AI-engineering natural-language-processing knowledge-retrieval local-LLMs

About SRAG

CyrilDesch/SRAG

Highly flexible RAG system with advanced document parsing and audio processing.

SRAG implements hybrid retrieval that combines vector search, BM25 lexical search, and cross-encoder reranking with metadata filtering. Built on Scala 3 and ZIO with a hexagonal architecture, it decouples domain logic from pluggable adapters (PostgreSQL, Qdrant, OpenSearch, Whisper, MinIO, Redis). The adapters are configured entirely through environment variables, which keeps deployment infrastructure-agnostic. It is designed for stateless horizontal scaling in production rather than single-machine experimentation, and includes built-in audio transcription and vision-based document extraction pipelines.
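To make the hybrid retrieval idea concrete, here is a small sketch that filters by metadata, ranks the remaining chunks once with BM25 and once by vector similarity, and merges the two rankings. It is not SRAG's code: the merge uses reciprocal rank fusion, which is an assumption chosen for illustration (the description does not say how SRAG combines scores), the chunk data and 2-D vectors are made up, and the cross-encoder reranking stage is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(toks) / avg_len))
        scores.append(score)
    return scores


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in fused.most_common()]


def hybrid_search(query, query_vec, chunks, where, top_k=3):
    # 1. Metadata filter: keep chunks whose metadata matches every key in `where`.
    candidates = [c for c in chunks if all(c["meta"].get(k) == v for k, v in where.items())]
    if not candidates:
        return []
    # 2. Lexical ranking (BM25) and vector ranking (cosine), best first.
    lex = bm25_scores(query, [c["text"] for c in candidates])
    lexical = [c["id"] for _, c in sorted(zip(lex, candidates), key=lambda p: p[0], reverse=True)]
    vector = [c["id"] for c in sorted(candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)]
    # 3. Fuse both rankings; a cross-encoder could re-score this short list afterwards.
    return rrf([lexical, vector])[:top_k]


# Hypothetical chunks with hand-made 2-D vectors standing in for real embeddings.
chunks = [
    {"id": "a", "text": "Audio transcription of meeting recordings", "meta": {"kind": "audio"}, "vec": [0.2, 0.9]},
    {"id": "b", "text": "Speech to text for recorded calls", "meta": {"kind": "audio"}, "vec": [0.1, 0.95]},
    {"id": "c", "text": "Audio transcription pricing table", "meta": {"kind": "pdf"}, "vec": [0.3, 0.8]},
    {"id": "d", "text": "Quarterly budget notes from the finance call", "meta": {"kind": "audio"}, "vec": [0.9, 0.1]},
    {"id": "e", "text": "Audio levels were too low in the demo", "meta": {"kind": "audio"}, "vec": [0.8, 0.3]},
]

result = hybrid_search("audio transcription", [0.1, 1.0], chunks, {"kind": "audio"}, top_k=3)
print(result)  # ['a', 'b', 'e']
```

Chunk "b" shares no words with the query, so BM25 alone would score it zero, yet it ranks second because its vector is the closest to the query vector. Chunk "c" matches both query terms but is removed by the metadata filter before any ranking happens.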

Scores updated daily from GitHub, PyPI, and npm data.