alibaba/CloudEval-YAML

☁️ Benchmarking LLMs for Cloud Config Generation | LLM Benchmarking for Cloud Scenarios

Score: 26 / 100 (Experimental)

This project helps cloud engineers and application developers evaluate how well different large language models (LLMs) generate accurate YAML configurations for cloud-native systems such as Kubernetes and Envoy. Given an LLM, the benchmark reports scores such as BLEU and exact match, showing how precisely the model produces the required YAML files. It is aimed at anyone responsible for deploying and managing applications in a cloud environment.

No commits in the last 6 months.

Use this if you need to objectively compare and select the best LLM or prompt engineering technique for automating cloud configuration tasks.

Not ideal if you are looking for a tool to deploy or manage cloud applications directly, rather than evaluate the LLMs that generate their configurations.

cloud-engineering DevOps configuration-management LLM-evaluation cloud-native
Badges: Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 3 / 25


Stars: 39
Forks: 1
Language: Python
License: MIT
Last pushed: Oct 25, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mlops/alibaba/CloudEval-YAML"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
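The same endpoint can be queried programmatically. A minimal Python sketch, using only the standard library; the URL path is taken from the curl example above, but the JSON response schema is not documented here, so the script simply pretty-prints whatever the API returns:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the quality-API URL for a category and an owner/name repo slug."""
    return f"{API_BASE}/{category}/{repo}"


def fetch_quality(category: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (response schema unspecified)."""
    with urllib.request.urlopen(quality_url(category, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Pretty-print the raw report for this repo.
    report = fetch_quality("mlops", "alibaba/CloudEval-YAML")
    print(json.dumps(report, indent=2))
```

Keeping the no-key tier in mind (100 requests/day), cache responses locally rather than re-fetching on every run.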