punit-naik/MLHadoop

This repository contains machine learning MapReduce code for Hadoop, written from scratch without using any package or library. Examples include prediction (linear and logistic regression), clustering (k-means), and classification (KNN).

Established

This project offers fundamental machine learning algorithms re-implemented for the Hadoop MapReduce framework. It takes raw datasets and applies classic methods like linear regression, k-means clustering, or k-nearest neighbors to produce predictions, groupings, or classifications. A developer or data engineer working with large-scale, distributed data processing on Hadoop would use these implementations.
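To make the idea of expressing a classic algorithm as map and reduce steps concrete, here is a minimal sketch of one k-means iteration. It is not taken from the repository: the class name `KMeansMapReduceSketch` and its methods are hypothetical, and the map, shuffle, and reduce phases are simulated in plain Java rather than written against Hadoop's `Mapper` and `Reducer` classes, so that it runs without a cluster. It assumes dense numeric points and squared Euclidean distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not code from the MLHadoop repository.
public class KMeansMapReduceSketch {

    // "Map" step: for one point, compute the key it would be emitted under,
    // i.e. the index of the nearest centroid (squared Euclidean distance).
    public static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // "Reduce" step: average all points that share a key to get the new centroid.
    public static double[] mean(List<double[]> points) {
        double[] sum = new double[points.get(0).length];
        for (double[] p : points) {
            for (int d = 0; d < sum.length; d++) {
                sum[d] += p[d];
            }
        }
        for (int d = 0; d < sum.length; d++) {
            sum[d] /= points.size();
        }
        return sum;
    }

    // One full iteration: map every point, group by key (the "shuffle"), then reduce.
    public static double[][] iterate(double[][] points, double[][] centroids) {
        Map<Integer, List<double[]>> groups = new HashMap<>();
        for (double[] p : points) {
            int key = nearestCentroid(p, centroids);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
        }
        double[][] updated = new double[centroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            List<double[]> assigned = groups.get(c);
            // A centroid with no assigned points is left where it was.
            updated[c] = (assigned == null) ? centroids[c].clone() : mean(assigned);
        }
        return updated;
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 1.0}, {1.0, 3.0}, {9.0, 9.0}, {11.0, 9.0}};
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        for (double[] c : iterate(points, centroids)) {
            System.out.println(Arrays.toString(c));
        }
        // prints [1.0, 2.0] then [10.0, 9.0]
    }
}
```

In an actual Hadoop job, the grouping done here with a `HashMap` would be handled by the framework's shuffle, and a driver would typically rerun the job until the centroids stop moving.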

Use this if you are a Hadoop developer looking for basic machine learning algorithms to integrate directly into your MapReduce jobs without external libraries.

Not ideal if you need high-performance, optimized machine learning on big data or if you prefer using established, robust ML libraries within the Hadoop ecosystem (like Spark MLlib or Mahout).

Hadoop development, MapReduce programming, Distributed computing, Big data analytics, Machine learning implementation

No Package, No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 21 / 25

Language: Java

License: Apache-2.0

Related frameworks

o19s/elasticsearch-learning-to-rank

Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch

oracle/tribuo

Tribuo - A Java machine learning library

Waikato/meka

Multi-label classifiers and evaluation procedures using the Weka machine learning framework.

Waikato/moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine...

allegro/allRank

allRank is a framework for training learning-to-rank neural models based on PyTorch.
