sajidkhan2067/LLMOnAWS
Deploy smaller LLMs on AWS Lambda: Phi-2 as a cost-effective language model
This project helps developers and MLOps engineers deploy smaller large language models (LLMs), such as Microsoft Phi-2, onto AWS Lambda for cost-effective inference. It takes an open-source LLM and a Docker configuration as input and produces a deployed, functional LLM endpoint on AWS Lambda. It targets users who need to run custom LLMs in a serverless environment, often because of data sensitivity or specific language requirements.
No commits in the last 6 months.
Use this if you are a developer or MLOps engineer looking for a cost-efficient way to host smaller open-source LLMs on a serverless AWS infrastructure.
Not ideal if you prefer managed LLM services or do not have experience with AWS, Docker, and Python development.
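Once the project has deployed a model, the resulting Lambda function can be called with the standard AWS CLI. A minimal sketch follows; the function name `phi2-inference` and the payload shape are assumptions for illustration, not taken from the repository.

```shell
# Hypothetical invocation of the deployed inference function.
# "phi2-inference" and the {"prompt": ...} payload are assumed names,
# not confirmed by the repo; adjust to match your deployment.
aws lambda invoke \
  --function-name phi2-inference \
  --cli-binary-format raw-in-base64-out \
  --payload '{"prompt": "Explain serverless inference in one sentence."}' \
  response.json 2>/dev/null \
  && cat response.json \
  || echo "invoke failed (no AWS credentials or function not deployed)"
```

The `||` fallback keeps the command safe to try before the function exists; `response.json` receives the model output on success.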
Stars: 8
Forks: 2
Language: Shell
License: —
Category:
Last pushed: Feb 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sajidkhan2067/LLMOnAWS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
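The endpoint above can be parameterized and pretty-printed in a short script. This is a sketch that assumes only the URL pattern shown in the example; no response fields are assumed.

```shell
# Fetch quality data for a given repo from the pt-edge API and pretty-print it.
# Only the URL pattern from the example above is assumed; the response schema is not.
REPO="sajidkhan2067/LLMOnAWS"
URL="https://pt-edge.onrender.com/api/v1/quality/transformers/${REPO}"
curl -s --max-time 10 "$URL" | python3 -m json.tool \
  || echo "request failed or response was not valid JSON"
```

Swapping `REPO` for any other `owner/name` pair queries that repository instead, within the daily rate limit.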
Higher-rated alternatives
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips