NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate the terabyte-scale datasets used to train deep-learning-based recommender systems.

67 / 100 (Established)

This tool helps data scientists and machine learning engineers prepare massive datasets, often terabytes in size, for building deep learning recommendation systems. It takes raw tabular data and transforms it into model-ready features, running the preprocessing on GPUs. The output is ready-to-train data, which shortens model development and iteration.

1,141 stars. Available on PyPI.

Use this if you are a data scientist or ML engineer working with extremely large tabular datasets (terabytes) to build high-performance deep learning recommender systems and want to accelerate data preparation using GPUs.

Not ideal if you are working with small datasets or non-tabular data, or if you do not have access to NVIDIA GPUs for accelerated processing.

recommender-systems feature-engineering big-data-processing machine-learning-engineering deep-learning
Maintenance 10 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 22 / 25
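
The four category scores above each have a maximum of 25, and together they match the overall 67 / 100. A minimal sketch of that arithmetic, assuming the overall score is a plain sum of the four categories (this is consistent with the numbers shown, but the page does not state the formula):

```python
# Category scores as listed on this page.
sub_scores = {
    "Maintenance": 10,
    "Adoption": 10,
    "Maturity": 25,
    "Community": 22,
}

# Assumption: each category is capped at 25 and the overall
# score is the unweighted sum of the four categories.
MAX_PER_CATEGORY = 25

total = sum(sub_scores.values())
max_total = MAX_PER_CATEGORY * len(sub_scores)

print(f"{total} / {max_total}")  # prints "67 / 100"
```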

Stars: 1,141
Forks: 149
Language: Python
License: Apache-2.0
Last pushed: Mar 12, 2026
Commits (30d): 0
Dependencies: 3

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/NVIDIA-Merlin/NVTabular"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
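
The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL above, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON (its fields are not documented on this page, so none are accessed here).

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred
    from the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (hypothetical helper).

    Assumption: the endpoint returns a JSON body. No key is sent,
    so the request counts against the anonymous daily limit.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ml-frameworks", "NVIDIA-Merlin", "NVTabular")
print(url)
```

Calling `fetch_quality("ml-frameworks", "NVIDIA-Merlin", "NVTabular")` performs the network request. Since anonymous access is limited to 100 requests/day, caching the result locally is advisable when polling many repositories.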