treeverse/dvc
🦉 Data Versioning and ML Experiments
DVC helps machine learning practitioners version large datasets and models, much as Git versions code. It works alongside your project's code, data, and models to track changes, reproduce experiments, and manage data in cloud storage. It suits data scientists, ML engineers, and researchers working on reproducible AI projects.
15,443 stars. Used by 5 other packages. Actively maintained with 6 commits in the last 30 days. Available on PyPI.
Use this if you need to version large datasets and machine learning models alongside your code to ensure experiment reproducibility and collaborative development.
Not ideal if you are working with small, static datasets that don't change often or if you primarily need version control for code-only projects.
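As a rough illustration of what versioning data alongside code can mean, the sketch below copies a file into a content-addressed cache and leaves a small pointer file that could be committed to Git in place of the data. It is not DVC's implementation or API: the function names (file_sha256, track, restore), the .pointer.json format, and the cache layout are all assumptions made for this example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data into a content-addressed cache and write a pointer file.

    The pointer file is tiny, so it can live in Git while the data does not.
    (Hypothetical format, chosen only for this sketch.)
    """
    checksum = file_sha256(data_file)
    cached = cache_dir / checksum[:2] / checksum[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"sha256": checksum, "path": data_file.name}))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file recorded in a pointer file from the cache."""
    meta = json.loads(pointer.read_text())
    cached = cache_dir / meta["sha256"][:2] / meta["sha256"][2:]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cached, target)
    return target


def demo() -> bool:
    """Track a file, overwrite it, then restore the tracked version."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "train.csv"
        data.write_bytes(b"x,y\n1,2\n")
        pointer = track(data, root / "cache")
        data.write_bytes(b"x,y\n9,9\n")  # simulate an accidental overwrite
        restore(pointer, root / "cache")
        return data.read_bytes() == b"x,y\n1,2\n"
```

Calling demo() tracks a small CSV, overwrites it, and restores the tracked version from the cache. A real tool would also handle directories, remote storage, and deduplication, none of which this sketch attempts.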
Stars: 15,443
Forks: 1,282
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 09, 2026
Commits (30d): 6
Dependencies: 42
Reverse dependents: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/treeverse/dvc"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related frameworks
runpod/runpod-python
🐍 | Python library for RunPod API and serverless worker SDK.
microsoft/vscode-jupyter
VS Code Jupyter extension
4paradigm/OpenMLDB
OpenMLDB is an open-source machine learning database that provides a feature platform computing...
uber/petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning...
deepchecks/deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic...