mn-cs/fineweb-spark
FineWeb-Edu dataset analysis using Apache Spark - DSC 232R group project
Stars: n/a
Forks: n/a
Language: Jupyter Notebook
License: n/a
Category:
Last pushed: Mar 24, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mn-cs/fineweb-spark"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
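The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns a JSON body, which this page does not confirm, and it assumes the three path segments stand for a category, a repository owner, and a repository name. The helper names `build_quality_url` and `fetch_quality` are hypothetical and chosen for illustration only.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build a URL shaped like the curl example.

    Reading the path as category/owner/repo is an assumption, not documented here.
    """
    # Percent-encode each segment so unusual characters cannot alter the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON.

    Assumes a JSON response. Raises urllib.error.HTTPError on non-2xx
    statuses, for example if the daily request limit is exceeded.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "mn-cs", "fineweb-spark")
# print(json.dumps(data, indent=2))
```

No API key is passed here, so calls made this way would count against the keyless 100 requests/day limit.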
Higher-rated alternatives
lensacom/sparkit-learn
PySpark + Scikit-learn = Sparkit-learn
Angel-ML/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
flink-extended/dl-on-flink
Deep Learning on Flink aims to integrate Flink and deep learning frameworks (e.g. TensorFlow,...
tirthajyoti/Spark-with-Python
Fundamentals of Spark with Python (using PySpark), code examples
jadianes/spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython...