sajidkhan2067/LLMOnAWS
Deploy smaller LLMs on AWS Lambda: Phi-2 as a cost-effective language model
This project helps developers and MLOps engineers deploy smaller large language models (LLMs), such as Microsoft Phi-2, onto AWS Lambda for cost-effective inference. It takes an open-source LLM and a Docker configuration as input and produces a deployed, functional LLM endpoint on AWS Lambda. It targets users who need to run custom LLMs in a serverless environment, often because of data sensitivity or specific language requirements.
No commits in the last 6 months.
Use this if you are a developer or MLOps engineer looking for a cost-efficient way to host smaller open-source LLMs on a serverless AWS infrastructure.
Not ideal if you prefer managed LLM services or do not have experience with AWS, Docker, and Python development.
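Once the project has deployed a model, the resulting Lambda function can be called with the standard AWS CLI. A minimal sketch follows; the function name `phi2-inference` and the payload shape are assumptions for illustration, not taken from the repository.

```shell
# Hypothetical invocation of the deployed inference function.
# "phi2-inference" and the {"prompt": ...} payload are assumed names,
# not confirmed by the repo; adjust to match your deployment.
aws lambda invoke \
  --function-name phi2-inference \
  --cli-binary-format raw-in-base64-out \
  --payload '{"prompt": "Explain serverless inference in one sentence."}' \
  response.json 2>/dev/null \
  && cat response.json \
  || echo "invoke failed (no AWS credentials or function not deployed)"
```

The `||` fallback keeps the command safe to try before the function exists; `response.json` receives the model output on success.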
Stars: 8
Forks: 2
Language: Shell
License: —
Category:
Last pushed: Feb 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sajidkhan2067/LLMOnAWS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
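The endpoint above can be parameterized and pretty-printed in a short script. This is a sketch that assumes only the URL pattern shown in the example; no response fields are assumed.

```shell
# Fetch quality data for a given repo from the pt-edge API and pretty-print it.
# Only the URL pattern from the example above is assumed; the response schema is not.
REPO="sajidkhan2067/LLMOnAWS"
URL="https://pt-edge.onrender.com/api/v1/quality/transformers/${REPO}"
curl -s --max-time 10 "$URL" | python3 -m json.tool \
  || echo "request failed or response was not valid JSON"
```

Swapping `REPO` for any other `owner/name` pair queries that repository instead, within the daily rate limit.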
Higher-rated alternatives
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips