punit-naik/MLHadoop

This repository contains machine learning MapReduce code for Hadoop, written from scratch (without using any package or library). Examples include prediction (linear and logistic regression), clustering (k-means), and classification (KNN).

55 / 100 · Established

This project offers fundamental machine learning algorithms re-implemented for the Hadoop MapReduce framework. It takes raw datasets and applies classic methods like linear regression, k-means clustering, or k-nearest neighbors to produce predictions, groupings, or classifications. A developer or data engineer working with large-scale, distributed data processing on Hadoop would use these implementations.

Use this if you are a Hadoop developer looking for basic machine learning algorithms to integrate directly into your MapReduce jobs without external libraries.

Not ideal if you need high-performance, optimized machine learning on big data or if you prefer using established, robust ML libraries within the Hadoop ecosystem (like Spark MLlib or Mahout).
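To make concrete what re-implementing an algorithm for MapReduce involves, here is a minimal sketch of one k-means iteration split into a map step (assign each point to its nearest centroid), a shuffle (group points by centroid index), and a reduce step (average each group). It is plain Java with no Hadoop dependency, and it is not taken from the repository: the class name `KMeansMapReduceSketch`, the method names, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: simulates the map/shuffle/reduce phases in memory.
public class KMeansMapReduceSketch {

    // Map step: return the index (the "key") of the centroid nearest to one point,
    // using squared Euclidean distance.
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Reduce step: average all points that share one centroid key.
    static double[] mean(List<double[]> points) {
        double[] sum = new double[points.get(0).length];
        for (double[] p : points) {
            for (int d = 0; d < sum.length; d++) {
                sum[d] += p[d];
            }
        }
        for (int d = 0; d < sum.length; d++) {
            sum[d] /= points.size();
        }
        return sum;
    }

    // One iteration: map every point, group by key (the shuffle), then reduce.
    static double[][] iterate(double[][] points, double[][] centroids) {
        Map<Integer, List<double[]>> groups = new TreeMap<>();
        for (double[] p : points) {
            int key = nearestCentroid(p, centroids);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
        }
        double[][] updated = new double[centroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            List<double[]> assigned = groups.get(c);
            // A centroid with no assigned points keeps its previous position.
            updated[c] = (assigned == null) ? centroids[c].clone() : mean(assigned);
        }
        return updated;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 3}, {9, 9}, {11, 9}};
        double[][] centroids = {{0, 0}, {10, 10}};
        // Prints [[1.0, 2.0], [10.0, 9.0]]
        System.out.println(Arrays.deepToString(iterate(points, centroids)));
    }
}
```

In an actual Hadoop job the map and reduce steps would live in separate `Mapper` and `Reducer` classes and the framework would perform the grouping; the sketch keeps everything in one process so the data flow is easy to follow.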

Hadoop development · MapReduce programming · Distributed computing · Big data analytics · Machine learning implementation
No Package · No Dependents
Maintenance 10 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 21 / 25

Stars: 58
Forks: 37
Language: Java
License: Apache-2.0
Last pushed: Jan 29, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/punit-naik/MLHadoop"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
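The same request can be made from Java, the repository's own language. The sketch below uses the standard `java.net.http.HttpClient` (Java 11 or later) and prints the raw response, since the response format is not documented here. The class name `QualityApiSketch` and the helper `qualityUri` are hypothetical, and reading the URL as category, owner, and repository segments is an assumption drawn from the single example above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative only: fetches the quality record shown in the curl example.
public class QualityApiSketch {

    static final String BASE = "https://pt-edge.onrender.com/api/v1/quality";

    // Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    static URI qualityUri(String category, String owner, String repo) {
        return URI.create(BASE + "/" + category + "/" + owner + "/" + repo);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(qualityUri("ml-frameworks", "punit-naik", "MLHadoop"))
                .GET()
                .build();
        // No API key is sent; the keyless tier is limited to 100 requests/day.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```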