xhluca/bm25s

Fast lexical search implementing BM25 in Python

/ 100

Verified

This tool ranks the documents in a large text collection, such as articles, product descriptions, or an internal knowledge base, by their relevance to a search query. You provide a list of texts and a query, and it returns the most relevant documents. It suits anyone who needs to build fast keyword search into an application, for example data scientists, content managers, or e-commerce developers.
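To make the "list of texts plus a query" workflow concrete, here is a minimal pure-Python sketch of BM25 ranking. It illustrates the scoring idea only and is not the bm25s implementation or its API: the function names `tokenize` and `bm25_rank`, the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the sample corpus are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase, keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_rank(corpus, query, k1=1.5, b=0.75, top_k=2):
    """Return the top_k (document, score) pairs for the query."""
    docs = [tokenize(d) for d in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = tokenize(query)

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue  # only shared terms contribute to the score
            # Smoothed IDF (one common variant; always non-negative).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)

    ranked = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return [(corpus[i], scores[i]) for i in ranked[:top_k]]

corpus = [
    "a cat is a feline and likes to purr",
    "a dog is the human's best friend and loves to play",
    "a bird is a beautiful animal that can fly",
    "a fish is a creature that lives in water and swims",
]
results = bm25_rank(corpus, "does the fish purr like a cat?")
for doc, score in results:
    print(f"{score:.3f}  {doc}")
```

The cat document ranks first because it shares two rare terms ("purr", "cat") with the query. A production library would typically precompute these scores into an index rather than rescoring every document per query, as this sketch does.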

1,560 stars. Used by 12 other packages. Actively maintained with 7 commits in the last 30 days. Available on PyPI.

Use this if you need an extremely fast and efficient way to rank and retrieve text documents based on keywords and phrases, without complex machine learning models.

Not ideal if your search needs involve understanding nuanced semantic meaning, complex relationships between concepts, or searching across different languages without explicit translation.
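The limitation above follows from how lexical scoring works: BM25-style rankers only credit terms that the query and a document share. The short sketch below, with a hypothetical `terms` helper and made-up documents, shows a paraphrased query that has no term overlap with the document it is clearly about.

```python
import re

def terms(text):
    # Hypothetical helper: the set of lowercase alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

docs = [
    "how to repair a flat bicycle tire",
    "quarterly sales report for the retail team",
]

# Same intent, different vocabulary: no shared terms with either document.
paraphrased_query = "fix punctured bike wheel"
paraphrased_overlap = [len(terms(paraphrased_query) & terms(d)) for d in docs]

# Same intent, same vocabulary: three shared terms with the first document.
literal_query = "bicycle tire repair"
literal_overlap = [len(terms(literal_query) & terms(d)) for d in docs]

print(paraphrased_overlap, literal_overlap)
```

With zero shared terms, a purely lexical ranker has nothing to score, so the paraphrased query cannot surface the bicycle document. Embedding-based or hybrid search is the usual way to cover that case.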

information-retrieval document-search text-ranking knowledge-base content-discovery

Maintenance 17 / 25

Adoption 15 / 25

Maturity 25 / 25

Community 17 / 25


Stars

1,560

Forks

Language

Python

License

MIT

Featured in

Embeddings Are Easier Than Whatever You're Doing Instead

Compare

bm25s and bm25-fusion

Related tools

ALucek/QuicKB

Optimize Document Retrieval with Fine-Tuned KnowledgeBases

Rohith-2/bm25-fusion

An ultra-fast BM25 retriever with support for multiple variants and meta-data filtering.

analyticsinmotion/symrank

🐍📦 High-performance cosine similarity ranking for Retrieval-Augmented Generation (RAG) pipelines.

MukundaKatta/HybridFind

Hybrid semantic + keyword search — BM25 and vector similarity with Reciprocal Rank Fusion

JhaAyush01/Embedding-Quantization-for-Significantly-Faster-Cheaper-Retrieval

Unofficial Implementation of Binary and Scalar Embedding Quantization for Significantly Faster &...
