COM6012/ScalableML
COM6012 Scalable Machine Learning - University of Sheffield. Enjoy our resources? ⭐ Star this repository to show your support and help others discover it!
This resource helps postgraduate students at the University of Sheffield learn how to apply machine learning techniques to very large datasets using Apache Spark. It provides practical guidance and materials for designing and implementing scalable machine learning models. Students enrolled in the COM6012 module should come away with a deeper understanding of large-scale ML workflows.
Use this if you are a University of Sheffield COM6012 student looking for structured learning materials and guidance on scalable machine learning with Apache Spark.
Not ideal if you are an industry practitioner or general learner seeking a self-paced, project-based introduction to Spark without the specific university context.
Stars: 96
Forks: 86
Language: HTML
License: —
Category:
Last pushed: Mar 18, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/COM6012/ScalableML"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
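The same keyless request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical, and treating the response body as JSON is an assumption, since the page does not document the response format.

```python
import json
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Join the base URL, a category, and an owner/name repository slug."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """GET the endpoint without an API key.

    Assumption: the body is JSON. If it is not, json.loads raises ValueError.
    """
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, counted against the daily limit):
# data = fetch_quality("ml-frameworks", "COM6012/ScalableML")
# print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes it easy to check the request target without spending any of the daily quota.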
Related frameworks
skrub-data/skrub
Machine learning with dataframes
biolab/orange3
🍊 📊 💡 Orange: Interactive data analysis
root-project/root
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
cleanlab/cleanlab
Cleanlab's open-source library is the standard data-centric AI package for data quality and...
drivendataorg/deon
A command line tool to easily add an ethics checklist to your data science projects.