EdVince/whisper-trtllm

Whisper in TensorRT-LLM

23 / 100 (Experimental)

This project helps developers and MLOps engineers deploy OpenAI's Whisper speech-to-text models more efficiently. It converts the original Whisper model into a TensorRT-LLM engine for faster inference on NVIDIA GPUs. The result is a speech recognition engine that aims to preserve the original model's accuracy while speeding up transcription, which suits teams building or deploying speech recognition applications at scale.

No commits in the last 6 months.

Use this if you are a developer or MLOps engineer who wants to deploy OpenAI's Whisper model for English speech recognition with faster, more efficient inference using NVIDIA TensorRT-LLM.

Not ideal if you are an end user who simply wants a speech-to-text application and does not need to optimize model deployment, or if you require support for languages other than English.

speech-to-text model-optimization GPU-acceleration MLOps AI-deployment
No License | Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 9 / 25

Stars: 17
Forks: 2
Language: C++
License: None
Last pushed: Sep 21, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/EdVince/whisper-trtllm"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
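The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred only from the single URL shown above. The helper names quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> pattern (an inference, not documented here).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumes the response body is JSON; the schema is not described
    on this page, so the parsed value is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_quality("transformers", "EdVince", "whisper-trtllm") requests the same URL as the curl command. Unauthenticated calls count against the 100 requests/day limit, so caching the result locally is a reasonable choice for repeated lookups.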