xhluca/bm25s
Fast lexical search implementing BM25 in Python
This tool helps you quickly find relevant documents within a large collection of text, like articles, product descriptions, or internal knowledge bases, based on a search query. You provide a list of texts and a query, and it efficiently returns the most relevant documents. It's ideal for anyone who needs to build fast and accurate search functionality into their applications, such as data scientists, content managers, or e-commerce developers.
1,560 stars. Used by 12 other packages. Actively maintained with 7 commits in the last 30 days. Available on PyPI.
Use this if you need an extremely fast and efficient way to rank and retrieve text documents based on keywords and phrases, without complex machine learning models.
Not ideal if your search needs involve understanding nuanced semantic meaning, complex relationships between concepts, or searching across different languages without explicit translation.
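To make the description above concrete, here is a minimal pure-Python sketch of BM25 ranking: a list of texts and a query go in, and the best-matching documents come out. It is an illustration of the general technique only, not bm25s's own code or API. The function names (`tokenize`, `bm25_scores`, `search`), the whitespace tokenizer, the Lucene-style IDF variant, and the defaults `k1=1.5` and `b=0.75` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Return one BM25 score per document in `corpus` for `query`.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    docs = [tokenize(doc) for doc in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            # Lucene-style IDF, which stays non-negative for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def search(query, corpus, k=2):
    """Return the top-k (document, score) pairs, best first."""
    scores = bm25_scores(query, corpus)
    ranked = sorted(zip(corpus, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


corpus = [
    "a cat is a feline and likes to purr",
    "a dog is the human's best friend and loves to play",
    "a bird is a beautiful animal that can fly",
    "a fish is a creature that lives in water and swims",
]

# Keyword overlap drives the ranking: the cat document shares the most rare terms.
top = search("does the fish purr like a cat", corpus)

# A synonym with no shared token scores zero everywhere, which is the
# semantic-matching limitation noted above.
no_overlap = bm25_scores("kitten", corpus)
```

The query "kitten" matches nothing because lexical search compares tokens, not meanings. Closing that gap would likely require stemming, synonym expansion, or an embedding-based retriever alongside BM25.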
Stars: 1,560
Forks: 93
Language: Python
License: MIT
Category:
Last pushed: Mar 06, 2026
Commits (30d): 7
Dependencies: 1
Reverse dependents: 12
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/xhluca/bm25s"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
ALucek/QuicKB
Optimize Document Retrieval with Fine-Tuned KnowledgeBases
Rohith-2/bm25-fusion
An ultra-fast BM25 retriever with support for multiple variants and meta-data filtering.
analyticsinmotion/symrank
🐍📦 High-performance cosine similarity ranking for Retrieval-Augmented Generation (RAG) pipelines.
MukundaKatta/HybridFind
Hybrid semantic + keyword search — BM25 and vector similarity with Reciprocal Rank Fusion
JhaAyush01/Embedding-Quantization-for-Significantly-Faster-Cheaper-Retrieval
Unofficial Implementation of Binary and Scalar Embedding Quantization for Significantly Faster &...