kozistr/triton-grpc-proxy-rs
Proxy server, written in Rust, for a Triton gRPC server that runs inference on an embedding model
This project helps developers serve machine learning models that generate embeddings from text. It takes raw text inputs and outputs numerical vector representations (embeddings) that can be used for tasks like search, recommendation, or classification. It's designed for developers building applications that need fast and efficient access to text embedding models.
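The vectors a service like this returns are typically consumed by comparing them to one another, for example with cosine similarity in a search or recommendation pipeline. The sketch below is a hypothetical illustration of that downstream step, not this project's API; the toy 3-dimensional vectors stand in for the much larger vectors (e.g. 1024 dimensions for BAAI/bge-m3) a real model would produce.

```rust
// Cosine similarity between two embedding vectors: the standard way to
// rank documents against a query once both have been embedded.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Toy 3-dimensional "embeddings"; a real embedding model would
    // return these vectors for the query and each document.
    let query = [0.1_f32, 0.9, 0.2];
    let doc_a = [0.1_f32, 0.8, 0.3]; // semantically close to the query
    let doc_b = [0.9_f32, 0.1, 0.0]; // semantically distant

    let sim_a = cosine_similarity(&query, &doc_a);
    let sim_b = cosine_similarity(&query, &doc_b);
    assert!(sim_a > sim_b);
    println!("doc_a ({sim_a:.3}) ranks above doc_b ({sim_b:.3})");
}
```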
No commits in the last 6 months.
Use this if you are a developer looking for a fast, simple, and dependency-free way to expose a text embedding model (like BAAI/bge-m3) via a gRPC API, abstracting away the complexities of the Triton Inference Server.
Not ideal if you need to serve non-embedding models, require custom pre-processing logic beyond simple text conversion, or are not comfortable with Rust or Docker deployments.
Stars: 21
Forks: 3
Language: Rust
License: Apache-2.0
Category: 
Last pushed: Aug 10, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/kozistr/triton-grpc-proxy-rs"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Anush008/fastembed-rs: Rust library for vector embeddings and reranking.
huggingface/text-embeddings-inference: A blazing-fast inference solution for text embedding models.
MinishLab/model2vec-rs: Official Rust implementation of Model2Vec.
finalfusion/finalfusion-rust: finalfusion embeddings in Rust.
finalfusion/finalfusion-python: finalfusion embeddings in Python.