shibing624/similarities
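Similarities: a toolkit for similarity calculation and semantic search. It supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of items, is written in Python 3, and works out of the box.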
This tool helps you find similar text or images quickly, even with huge amounts of data. You provide a piece of text or an image, and it returns the most closely matching items from your collection. It's ideal for anyone who needs to identify duplicates, group similar content, or power semantic search in applications.
899 stars. Available on PyPI.
Use this if you need to efficiently compare and find similarities between texts, images, or a mix of both for tasks like content matching, deduplication, or content recommendation.
Not ideal if your primary need is simple keyword matching rather than understanding the deeper meaning or visual content of your data.
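To make the idea of "most closely matching items" concrete, here is a minimal sketch of the retrieval step that embedding-based search generally relies on: rank items by cosine similarity between vectors. It is not the toolkit's implementation or API. The function names `cosine` and `most_similar`, the three-dimensional vectors, and the item labels are all made up for illustration; a real system would obtain vectors from a trained text or image model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(query_vec, corpus, topn=2):
    """Return the topn (item, score) pairs, highest score first."""
    scored = [(item, cosine(query_vec, vec)) for item, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical, hand-written embeddings (assumption: a model produced them).
corpus = {
    "reset password": [0.9, 0.1, 0.0],
    "change login credentials": [0.6, 0.4, 0.1],
    "order pizza": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # hypothetical embedding of "forgot my password"

results = most_similar(query, corpus, topn=2)
for item, score in results:
    print(f"{item}: {score:.3f}")
```

Note that "change login credentials" ranks second even though it shares no words with "reset password"; closeness in vector space, not keyword overlap, drives the ranking in this sketch.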
Stars: 899
Forks: 90
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 05, 2026
Commits (30d): 0
Dependencies: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/shibing624/similarities"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
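The same request can be sketched in Python with only the standard library. This is an illustration under assumptions the page does not confirm: that the path follows a `quality/<category>/<owner>/<repo>` pattern (inferred from the single URL above), and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path pattern is an assumption from one example)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record; assumes the server answers with a JSON body."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = quality_url("embeddings", "shibing624", "similarities")
print(url)

# Network call left commented out so the sketch runs offline:
# data = fetch_quality("embeddings", "shibing624", "similarities")
```

With the keyless limit of 100 requests/day, caching the result locally is a sensible default if the data is needed repeatedly.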
Related tools
explosion/sense2vec
🦆 Contextually-keyed word vectors
chakki-works/chakin
Simple downloader for pre-trained word vectors
sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
maxoodf/word2vec
word2vec++ is a Distributed Representations of Words (word2vec) library and tools...