NVIDIA-Merlin/NVTabular
NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep learning-based recommender systems.
This tool helps data scientists and machine learning engineers prepare massive datasets, often terabytes in size, for building deep learning recommendation systems. It takes raw tabular data and transforms it into optimized features, accelerating the entire process on GPUs. The output is ready-to-train data that significantly speeds up model development and iteration.
1,141 stars. Available on PyPI.
Use this if you are a data scientist or ML engineer working with extremely large tabular datasets (terabytes) to build high-performance deep learning recommender systems and want to accelerate data preparation using GPUs.
Not ideal if you are working with small datasets or non-tabular data, or if you do not have access to NVIDIA GPUs for accelerated processing.
Stars: 1,141
Forks: 149
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/NVIDIA-Merlin/NVTabular"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
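The same request can be made from Python. The sketch below is a minimal, unofficial example using only the standard library. It assumes the path follows the pattern category/owner/repo, as in the curl example above, and that the endpoint returns JSON; neither is confirmed by this page. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and no API key handling is shown because the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example; other repositories may use a different layout.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response for one repository.

    Assumption: the response body is JSON; its fields are not documented here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the network request.
    print(build_quality_url("ml-frameworks", "NVIDIA-Merlin", "NVTabular"))
```

Each call to `fetch_quality` counts against the 100 requests/day keyless limit, so caching results locally is worth considering when polling several repositories.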
Related frameworks
PriorLabs/TabPFN
⚡ TabPFN: Foundation Model for Tabular Data ⚡
pyg-team/pytorch-frame
Tabular Deep Learning Library for PyTorch
PriorLabs/tabpfn-extensions
Community extensions for TabPFN - the foundation model for tabular data. Built with TabPFN! 🤗
pytorch-tabular/pytorch_tabular
A unified framework for Deep Learning Models on tabular data
soda-inria/tabicl
TabICLv2: A state-of-the-art tabular foundation model