manuparra/taller_SparkR
SparkR workshop for the Jornadas de Usuarios de R (Spanish R Users Conference)
This workshop material helps data analysts and scientists process extremely large datasets using R and Apache Spark through the SparkR package. It takes raw, massive datasets (in formats such as CSV, JSON, and Parquet) and shows how to filter, aggregate, transform, and analyze them to produce insights, machine learning models, and visualizations. The material is designed for someone who works with data in R and needs to scale up to "big data" problems.
No commits in the last 6 months.
Use this if you are an R user struggling to analyze very large datasets that exceed the memory capacity of a single machine and need to leverage distributed computing.
Not ideal if you primarily work with smaller datasets that fit within your computer's memory or if you prefer programming languages other than R.
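The core workflow the workshop teaches — load a dataset too large for one machine's memory, then filter and aggregate it in a distributed fashion — can be sketched with SparkR. This is a minimal illustration, not taken from the workshop itself: it assumes a local Spark installation with the SparkR package available, and the file name (`flights.csv`) and column names (`dep_delay`, `carrier`) are hypothetical.

```r
# Minimal SparkR sketch (assumes Spark + the SparkR package are installed;
# file path and column names are illustrative, not from the workshop).
library(SparkR)

# Start a Spark session on all local cores
sparkR.session(master = "local[*]", appName = "taller_SparkR_demo")

# Read a large CSV as a distributed SparkDataFrame (data stays out of R's memory)
df <- read.df("flights.csv", source = "csv",
              header = "true", inferSchema = "true")

# Filter and aggregate on the cluster, then collect only the small result
delayed <- filter(df, df$dep_delay > 15)
by_carrier <- agg(groupBy(delayed, "carrier"),
                  avg_delay = avg(delayed$dep_delay))
head(arrange(by_carrier, desc(by_carrier$avg_delay)))

sparkR.session.stop()
```

The key design point is that `filter` and `agg` run inside Spark across partitions; only `head`/`collect` pull a small summary back into the local R session, which is what lets the workflow scale past single-machine memory.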
Stars: 13
Forks: 17
Language: HTML
License: —
Category: —
Last pushed: Nov 21, 2016
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/manuparra/taller_SparkR"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
lensacom/sparkit-learn
PySpark + Scikit-learn = Sparkit-learn
Angel-ML/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
flink-extended/dl-on-flink
Deep Learning on Flink aims to integrate Flink and deep learning frameworks (e.g. TensorFlow,...
tirthajyoti/Spark-with-Python
Fundamentals of Spark with Python (using PySpark), code examples
jadianes/spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython...