zaratsian/Spark

Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References

37 / 100 (Emerging)

This collection provides practical Apache Spark code snippets and scripts that help data engineers and data scientists process and analyze large datasets. Most examples are in PySpark, with some in Scala and SparkR. The snippets address common Spark problems and can be adapted to your own data processing tasks.

No commits in the last 6 months.

Use this if you are a data engineer or data scientist looking for ready-to-use Spark code examples to jumpstart your big data projects or troubleshoot specific issues.

Not ideal if you are new to Spark and seeking a comprehensive introductory tutorial or a conceptual guide, as this repository focuses on practical code rather than foundational learning.

big-data-processing data-engineering data-analysis machine-learning-engineering distributed-computing
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 21 / 25

Stars: 69
Forks: 37
Language: Jupyter Notebook
License: none
Last pushed: Jan 21, 2019
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/zaratsian/Spark"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
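
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The base URL is taken from the curl command above. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. The page does not document
    the response schema, so the parsed value is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("zaratsian", "Spark")` issues one request, so a script that polls should stay within the 100 requests/day keyless limit.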