mratsim/laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
This project speeds up computations on large numerical datasets, such as matrix operations or image processing, by using the computer's hardware more efficiently, for example through SIMD vectorization and OpenMP multithreading. It is aimed at software developers, particularly those building high-performance numerical applications or machine learning frameworks.
293 stars. No commits in the last 6 months.
Use this if you are a developer aiming to build or optimize computationally intensive applications that process large numerical data structures on CPUs and accelerators, and you need fine-grained control over performance.
Not ideal if you are an end-user looking for an out-of-the-box application, or if you are a developer working primarily with high-level languages without a focus on low-level performance optimization.
Stars: 293
Forks: 15
Language: Nim
License: Apache-2.0
Category:
Last pushed: Jan 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mratsim/laser"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
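The same request as the curl example above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `quality_url` and `fetch_quality` are hypothetical, and the reading of the path as category/owner/repo is an assumption based on the single URL shown. The response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repo.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Return (HTTP status, raw body text) for one repository.

    The body is left unparsed because the response format is not
    specified in this listing.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Example call (performs a network request, counted against the daily limit):
# status, body = fetch_quality("ml-frameworks", "mratsim", "laser")
```

Because the keyless tier is limited to 100 requests/day, a script that polls several repositories would likely want to cache results locally rather than refetch on every run.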
Higher-rated alternatives
iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
brucefan1983/GPUMD
Graphics Processing Units Molecular Dynamics
uxlfoundation/oneDAL
oneAPI Data Analytics Library (oneDAL)
rapidsai/cuml
cuML - RAPIDS Machine Learning Library
NVIDIA/cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra