huggingface/text-embeddings-inference
A blazing fast inference solution for text embeddings models
This solution helps machine learning engineers and data scientists deploy text embedding and sequence classification models for their applications. You input raw text, and it quickly outputs numerical representations (embeddings) or classification labels for use in search, recommendation, or sentiment analysis systems. It's designed for those who need to serve large volumes of text processing requests efficiently.
4,582 stars. Actively maintained with 10 commits in the last 30 days.
Use this if you need to serve text embedding or classification models at high speed and scale, minimizing latency and resource usage.
Not ideal if you are looking for a pre-built application that directly performs tasks like document search or sentiment analysis without needing to deploy models yourself.
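As a rough illustration of the "raw text in, embeddings out" flow described above, here is a minimal Python sketch of a client for a running server. It assumes a text-embeddings-inference instance is already deployed and reachable at http://127.0.0.1:8080 (the address is an assumption, adjust it to your deployment) and that it exposes the `/embed` route taking a JSON body with an `inputs` field. The helper names `build_embed_request`, `embed`, and `cosine_similarity` are hypothetical, chosen for this example only.

```python
import json
import math
import urllib.request

# Assumption: a text-embeddings-inference server is already running here.
TEI_URL = "http://127.0.0.1:8080"


def build_embed_request(texts, base_url=TEI_URL):
    """Build (but do not send) a POST request for the /embed route."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=TEI_URL):
    """Send the texts to the server and return the parsed JSON response.

    The response is expected to hold one vector (list of floats) per input.
    """
    with urllib.request.urlopen(build_embed_request(texts, base_url)) as resp:
        return json.load(resp)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Example usage (requires a live server, so it is left commented out):
# query_vec, doc_vec = embed(["how do I reset my password?",
#                             "Steps to recover account access"])
# print(cosine_similarity(query_vec, doc_vec))
```

Comparing vectors with cosine similarity is one common way to use the embeddings for search or recommendation; a production setup would more likely store them in a vector index than compare them pairwise.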
Stars: 4,582
Forks: 370
Language: Rust
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/huggingface/text-embeddings-inference"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
Anush008/fastembed-rs
Rust library for vector embeddings and reranking.
MinishLab/model2vec-rs
Official Rust Implementation of Model2Vec
finalfusion/finalfusion-rust
finalfusion embeddings in Rust
finalfusion/finalfusion-python
Finalfusion embeddings in Python
olafurjohannsson/kjarni
Native ML inference engine — embeddings, classification, reranking, search, and text generation....