shibing624/similarities

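Similarities: a toolkit for similarity calculation and semantic search. It supports text-to-text, text-to-image, and image-to-image search over datasets of hundreds of millions of items, is written in Python 3, and works out of the box.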


Established

This tool helps you find similar text or images quickly, even with huge amounts of data. You provide a piece of text or an image, and it returns the most closely matching items from your collection. It's ideal for anyone who needs to identify duplicates, group similar content, or power semantic search in applications.
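The query-and-rank flow described above can be sketched in a few lines. This is a minimal illustration, not the similarities API: the three-dimensional vectors are hand-written stand-ins for embeddings that a real model would produce, and the names `cosine`, `most_similar`, `corpus_vecs`, and `query_vec` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def most_similar(query_vec, corpus_vecs, topn=2):
    """Return the topn (item_id, score) pairs, highest score first."""
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in corpus_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Assumption: toy vectors standing in for model-produced text or image embeddings.
corpus_vecs = {
    "doc_cat": [0.9, 0.1, 0.0],
    "doc_dog": [0.8, 0.3, 0.1],
    "doc_car": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

results = most_similar(query_vec, corpus_vecs, topn=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

This brute-force scan compares the query against every item, which is fine for small collections. Searching very large collections usually relies on an approximate nearest-neighbor index instead.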

899 stars. Available on PyPI.

Use this if you need to efficiently compare and find similarities between texts, images, or a mix of both for tasks like content matching, deduplication, or content recommendation.

Not ideal if your primary need is simple keyword matching rather than understanding the deeper meaning or visual content of your data.

content-management information-retrieval digital-asset-management e-commerce qa-systems

Maintenance 10 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 19 / 25


Stars: 899

Language: Python

License: Apache-2.0

Related tools

explosion/sense2vec

🦆 Contextually-keyed word vectors

chakki-works/chakin

Simple downloader for pre-trained word vectors

sebischair/Lbl2Vec

Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...

pdrm83/sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.

maxoodf/word2vec

word2vec++ is a Distributed Representations of Words (word2vec) library and tools...
