treeverse/dvc

🦉 Data Versioning and ML Experiments

76 / 100 (Verified)

DVC helps machine learning practitioners version large datasets and models, much as Git versions code. It tracks changes to a project's data and models alongside its code, helps reproduce experiments, and manages data in cloud storage. It is aimed at data scientists, ML engineers, and researchers working on reproducible AI projects.
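To make the "like Git, but for data" idea concrete, here is a minimal sketch of hash-based data versioning: a small pointer record (path, MD5, size) stands in for a large file, so a version control system only has to track the record. This is an illustration of the general technique, not DVC's actual code; the helper names `file_md5` and `make_pointer` and the sample file `train.csv` are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_pointer(path: Path) -> dict:
    """Build a small record describing the file (hypothetical format)."""
    return {"path": path.name, "md5": file_md5(path), "size": path.stat().st_size}


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "train.csv"  # hypothetical dataset file

    data.write_bytes(b"x,y\n1,2\n")
    v1 = make_pointer(data)  # pointer for the first version

    data.write_bytes(b"x,y\n1,2\n3,4\n")
    v2 = make_pointer(data)  # pointer after the data changed

# A differing hash signals that the data itself changed between versions.
changed = v1["md5"] != v2["md5"]
print(json.dumps(v1, indent=2))
print("data changed:", changed)
```

Committing only the pointer record keeps the repository small while still pinning an exact version of the data.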

15,443 stars. Used by 5 other packages. Actively maintained with 6 commits in the last 30 days. Available on PyPI.

Use this if you need to version large datasets and machine learning models alongside your code to ensure experiment reproducibility and collaborative development.

Not ideal if you are working with small, static datasets that don't change often or if you primarily need version control for code-only projects.

Tags: machine-learning-ops, data-versioning, ml-experiment-tracking, reproducible-ai, data-science-workflow
Maintenance 17 / 25
Adoption 15 / 25
Maturity 25 / 25
Community 19 / 25
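
The four category scores are consistent with the headline score being their plain sum. The page does not state the formula here, so treat the sum as an assumption; a quick check:

```python
# Category scores as listed on the page, each out of 25.
subscores = {"Maintenance": 17, "Adoption": 15, "Maturity": 25, "Community": 19}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")
```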


Stars: 15,443
Forks: 1,282
Language: Python
License: Apache-2.0
Last pushed: Mar 09, 2026
Commits (30d): 6
Dependencies: 42
Reverse dependents: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/treeverse/dvc"
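
The same request can be made from Python with the standard library. This is a sketch under two assumptions not confirmed by the page: that the URL follows a `category/owner/repo` pattern (inferred from the single example above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical, and no response fields are assumed.

```python
import json
import urllib.request

# Base path taken from the curl example; the path layout beyond it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (assumed pattern: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Print the URL that would be requested; no network call happens here.
print(quality_url("ml-frameworks", "treeverse", "dvc"))
```

Calling `fetch_quality("ml-frameworks", "treeverse", "dvc")` performs the actual request and counts against the daily request limit.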

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
