NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate the terabyte-scale datasets used to train deep-learning-based recommender systems.

/ 100

Established

This tool helps data scientists and machine learning engineers prepare massive datasets, often terabytes in size, for building deep learning recommendation systems. It takes raw tabular data and transforms it into model-ready features, running the preprocessing on GPUs to accelerate it. The output is training-ready data, which shortens model development and iteration.
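The kind of transformation described above can be illustrated with a small conceptual sketch. It does not use NVTabular's API and runs on the CPU with the Python standard library only. The helper names (`fit_categorify`, `fit_normalize`, `transform`) and the sample columns are hypothetical, chosen to show two common preprocessing steps for recommender data: encoding categorical values as integer ids and standardizing a continuous column.

```python
from statistics import mean, pstdev


def fit_categorify(values):
    """Assign each distinct category a contiguous integer id (0 is reserved for unseen values)."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def fit_normalize(values):
    """Collect the mean and population standard deviation of a continuous column."""
    return mean(values), pstdev(values)


def transform(rows, cat_maps, stats):
    """Apply the fitted category maps and normalization statistics to each row."""
    out = []
    for row in rows:
        new_row = {}
        for col, mapping in cat_maps.items():
            new_row[col] = mapping.get(row[col], 0)  # unseen category -> 0
        for col, (mu, sigma) in stats.items():
            new_row[col] = (row[col] - mu) / sigma if sigma else 0.0
        out.append(new_row)
    return out


# Hypothetical raw interaction data.
rows = [
    {"user_id": "u1", "item_id": "a", "price": 10.0},
    {"user_id": "u2", "item_id": "b", "price": 20.0},
    {"user_id": "u1", "item_id": "b", "price": 30.0},
]

# "Fit" phase: gather statistics over the dataset.
cat_maps = {col: fit_categorify(r[col] for r in rows) for col in ("user_id", "item_id")}
stats = {"price": fit_normalize([r["price"] for r in rows])}

# "Transform" phase: produce model-ready features.
features = transform(rows, cat_maps, stats)
```

Separating the fit phase from the transform phase means statistics gathered on training data can be reused unchanged on validation or serving data. A GPU library would perform the same two phases over partitioned, out-of-core data rather than Python lists.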

1,141 stars. Available on PyPI.

Use this if you are a data scientist or ML engineer working with extremely large tabular datasets (terabytes) to build high-performance deep learning recommender systems and want to accelerate data preparation using GPUs.

Not ideal if you are working with small datasets or non-tabular data, or if you do not have access to NVIDIA GPUs for accelerated processing.

recommender-systems feature-engineering big-data-processing machine-learning-engineering deep-learning

Maintenance 10 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 22 / 25

Stars

1,141

Forks

149

Language

Python

License

Apache-2.0

Related frameworks

PriorLabs/TabPFN

⚡ TabPFN: Foundation Model for Tabular Data ⚡

pyg-team/pytorch-frame

Tabular Deep Learning Library for PyTorch

PriorLabs/tabpfn-extensions

Community extensions for TabPFN - the foundation model for tabular data. Built with TabPFN! 🤗

pytorch-tabular/pytorch_tabular

A unified framework for Deep Learning Models on tabular data

soda-inria/tabicl

TabICLv2: A state-of-the-art tabular foundation model
