punit-naik/MLHadoop
This repository contains machine learning algorithms implemented as Hadoop MapReduce jobs, written from scratch without using any package or library. Examples include prediction (linear and logistic regression), clustering (K-Means), and classification (KNN).
This project offers fundamental machine learning algorithms re-implemented for the Hadoop MapReduce framework. It takes raw datasets and applies classic methods like linear regression, k-means clustering, or k-nearest neighbors to produce predictions, groupings, or classifications. A developer or data engineer working with large-scale, distributed data processing on Hadoop would use these implementations.
Use this if you are a Hadoop developer looking for basic machine learning algorithms to integrate directly into your MapReduce jobs without external libraries.
Not ideal if you need high-performance, optimized machine learning on big data or if you prefer using established, robust ML libraries within the Hadoop ecosystem (like Spark MLlib or Mahout).
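To make the map/reduce framing of k-means concrete, here is a minimal sketch of a single iteration in plain Java. It is not taken from the repository and does not use the Hadoop API. The class name KMeansSketch, its method names, and the sample data are all hypothetical, and the shuffle phase is simulated with an in-memory map. The "map" step assigns each point to its nearest centroid, and the "reduce" step averages the points grouped under each centroid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: simulates one k-means iteration
// in map/shuffle/reduce style without any Hadoop dependency.
public class KMeansSketch {

    // "Map" step: return the index of the centroid closest to the point
    // (squared Euclidean distance). In a real job this index would be the emitted key.
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // "Reduce" step: average all points that were grouped under one centroid index.
    static double[] mean(List<double[]> points) {
        double[] sum = new double[points.get(0).length];
        for (double[] p : points) {
            for (int d = 0; d < sum.length; d++) {
                sum[d] += p[d];
            }
        }
        for (int d = 0; d < sum.length; d++) {
            sum[d] /= points.size();
        }
        return sum;
    }

    // One full iteration: map every point, group by key (the shuffle), then reduce.
    static double[][] iterate(double[][] data, double[][] centroids) {
        Map<Integer, List<double[]>> groups = new HashMap<>();
        for (double[] p : data) {
            int key = nearestCentroid(p, centroids);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
        }
        double[][] updated = new double[centroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            List<double[]> assigned = groups.get(c);
            // A centroid with no assigned points is left where it was.
            updated[c] = (assigned == null) ? centroids[c].clone() : mean(assigned);
        }
        return updated;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 1}, {1, 3}, {9, 9}, {11, 9}};
        double[][] centroids = {{0, 0}, {10, 10}};
        // prints [[1.0, 2.0], [10.0, 9.0]]
        System.out.println(Arrays.deepToString(iterate(data, centroids)));
    }
}
```

In an actual MapReduce job the grouping would be done by the framework's shuffle, and a driver would rerun the job until the centroids stop moving. How the repository itself structures this is not shown here.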
Stars: 58
Forks: 37
Language: Java
License: Apache-2.0
Category:
Last pushed: Jan 29, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/punit-naik/MLHadoop"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
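The curl call above can also be issued from Java. The following is a rough sketch using the JDK's built-in java.net.http client (Java 11 or later). The class name QualityApiSketch and its method are hypothetical, and the response format is not documented on this page, so the body is simply printed as text.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical illustration: fetches the same URL as the curl command.
public class QualityApiSketch {

    // URL copied verbatim from the curl example.
    static final String ENDPOINT =
        "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/punit-naik/MLHadoop";

    // Build a plain GET request; no API key header is set, matching the keyless tier.
    static HttpRequest buildRequest() {
        return HttpRequest.newBuilder(URI.create(ENDPOINT)).GET().build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
            client.send(buildRequest(), HttpResponse.BodyHandlers.ofString());
        // The response schema is an unknown here, so status and raw body are printed as-is.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

How a key would be passed for the higher limit is not stated on this page, so the sketch leaves that out.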
Related frameworks
o19s/elasticsearch-learning-to-rank
Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch
oracle/tribuo
Tribuo - A Java machine learning library
Waikato/meka
Multi-label classifiers and evaluation procedures using the Weka machine learning framework.
Waikato/moa
MOA is an open source framework for Big Data stream mining. It includes a collection of machine...
allegro/allRank
allRank is a framework for training learning-to-rank neural models based on PyTorch.