AMDResearch/NPUEval
NPUEval is an LLM evaluation dataset targeting AIE kernel code generation on AMD RyzenAI hardware.
The dataset measures how well large language models (LLMs) generate specialized code for the Neural Processing Unit (NPU) on AMD's RyzenAI platform. It provides a set of prompts as input and tests the generated code's functional correctness on the AIE kernel architecture. AI software developers and researchers working with RyzenAI would use it to benchmark and improve LLM code generation for NPU applications.
Use this if you are developing or evaluating LLMs that generate low-level code for AMD's AI Engine (AIE) kernels on RyzenAI hardware and need a standardized way to measure their performance.
Not ideal if you are working with NPU hardware other than AMD's AIE2/AIE2P or if your focus is on high-level application development rather than kernel-level code generation.
Stars: 30
Forks: 4
Language: C++
License: —
Category:
Last pushed: Nov 08, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/AMDResearch/NPUEval"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
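The endpoint above can also be called programmatically. A minimal Python sketch for building the request URL, assuming the same `owner/repo` path pattern holds for other repositories (the shape of the JSON response is not documented here, so parsing it is left out):

```python
# Build the quality-API URL for a given GitHub owner/repo pair.
# The path pattern is taken from the curl example above; applying it
# to repositories other than AMDResearch/NPUEval is an assumption.
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_endpoint(owner: str, repo: str) -> str:
    """Return the API URL for a repository, URL-escaping each path segment."""
    return f"{BASE_URL}/{quote(owner)}/{quote(repo)}"

url = quality_endpoint("AMDResearch", "NPUEval")
print(url)
```

The URL can then be fetched with any HTTP client (e.g. `urllib.request` or `requests`), keeping the 100 requests/day limit in mind.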
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval: One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas: Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit: Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
EuroEval/EuroEval: The robust European language model benchmark.
Giskard-AI/giskard-oss: 🐢 Open-Source Evaluation & Testing library for LLM Agents