aws-samples/aws-inferentia-huggingface-workshop

CMP314 Optimizing NLP models with Amazon EC2 Inf1 instances in Amazon SageMaker

Quality score: 36 / 100 (Emerging)

This project helps machine learning engineers and MLOps practitioners deploy Natural Language Processing (NLP) models, specifically for tasks like paraphrase detection. It guides you through setting up and comparing the performance of a HuggingFace NLP model deployed on standard CPU instances versus specialized AWS Inferentia (Inf1) instances within Amazon SageMaker. You input a HuggingFace model and receive performance metrics (latency and throughput) to understand the benefits of hardware acceleration.
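The CPU-versus-Inf1 comparison the workshop walks through comes down to timing repeated endpoint invocations and summarizing latency and throughput. A minimal sketch of such a benchmark is below; the `fake_predict` stand-in, function names, and percentile choices are illustrative assumptions, not code from the workshop — in practice you would pass a real `sagemaker.Predictor.predict` bound to each endpoint.

```python
import time
import statistics

def benchmark(predict, payload, n_requests=100):
    """Time repeated calls to `predict` (any callable, e.g. a
    SageMaker Predictor's .predict) and report latency percentiles
    in milliseconds plus overall throughput in requests/second."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        predict(payload)
        latencies.append((time.perf_counter() - t0) * 1000.0)  # ms
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies),
        "p90_ms": latencies[int(0.9 * (len(latencies) - 1))],
        "throughput_rps": n_requests / elapsed,
    }

# Stand-in predictor so the sketch is self-contained; swap in the
# CPU-backed and Inf1-backed endpoint predictors to compare them.
def fake_predict(payload):
    time.sleep(0.001)  # simulate ~1 ms of inference work
    return {"label": "paraphrase", "score": 0.98}

stats = benchmark(
    fake_predict,
    {"inputs": ["The cat sat on the mat.", "A cat was sitting on the mat."]},
    n_requests=50,
)
print(stats)
```

Running the same harness against both endpoints with identical payloads is what makes the latency and throughput numbers directly comparable.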

No commits in the last 6 months.

Use this if you are a machine learning engineer looking to optimize the inference performance and cost-efficiency of your NLP models in production using AWS Inferentia instances.

Not ideal if you are a data scientist primarily focused on model training and experimentation rather than deployment optimization, or if you are not using AWS SageMaker.

NLP deployment, MLOps, model optimization, inference acceleration, cloud machine learning
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 15 / 25

How are scores calculated?

Stars

14

Forks

4

Language

Jupyter Notebook

License

MIT-0

Last pushed

Dec 20, 2023

Commits (30d)

0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/aws-samples/aws-inferentia-huggingface-workshop"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.