eto-ai/rikai

Parquet-based ML data format optimized for working with unstructured data

52 / 100 Established

Rikai helps AI practitioners manage large collections of unstructured data like images or videos for machine learning projects. It takes raw media files and annotations, organizes them into a structured format, and outputs readily usable datasets for model training or analysis using tools like PyTorch or Spark. This is for machine learning engineers, data scientists, and AI researchers who work with computer vision or other unstructured data types.
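To make the "raw files plus annotations become structured rows" idea concrete, here is a minimal sketch using only the Python standard library. It does not use Rikai's API. The `Record` class, the `build_records` helper, and the sample annotations are all hypothetical names invented for illustration, and a real pipeline would write such rows to Parquet rather than keep them in memory.

```python
from dataclasses import dataclass, asdict


@dataclass
class Record:
    """One flat row: a reference to a media file plus its annotation (hypothetical schema)."""
    image_uri: str
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def build_records(image_dir, annotations):
    """Join media file names with their annotations into a list of rows."""
    rows = []
    for name, ann in sorted(annotations.items()):
        rows.append(
            Record(
                image_uri=f"{image_dir}/{name}",
                label=ann["label"],
                box=tuple(ann["box"]),
            )
        )
    return rows


# Made-up sample annotations, keyed by file name.
annotations = {
    "dog.jpg": {"label": "dog", "box": [10, 20, 110, 220]},
    "cat.jpg": {"label": "cat", "box": [5, 5, 50, 60]},
}

# Plain dicts like these are what a columnar writer or a training loop would consume.
rows = [asdict(r) for r in build_records("images", annotations)]
```

Each row carries both the pointer to the media file and its labels, which is the property that lets a single dataset be queried like a table and also fed to a training loop.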

141 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need a streamlined way to handle, query, and prepare vast amounts of unstructured data for your AI models, especially when working with Spark.

Not ideal if your primary data consists only of structured tables, or if you prefer not to use Apache Spark for your data processing.

computer-vision machine-learning-engineering unstructured-data-management model-training data-preparation
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 17 / 25

Stars: 141
Forks: 22
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Jan 05, 2023
Commits (30d): 0
Dependencies: 12

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/eto-ai/rikai"

Open to everyone: 100 requests/day with no key, or 1,000/day with a free key.
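The same request can be made from Python. The sketch below uses only the standard library and the keyless endpoint shown in the curl command. The helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns JSON, which this page does not state. How an API key is passed is not documented here, so the sketch leaves it out.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a category and an owner/name repository."""
    # quote() leaves "/" unescaped by default, so "eto-ai/rikai" stays two path segments.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the quality record. Assumes (unverified) that the body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL needs no network access.
print(quality_url("ml-frameworks", "eto-ai/rikai"))
```

Calling `fetch_quality("ml-frameworks", "eto-ai/rikai")` performs the actual request. With the keyless limit of 100 requests/day, caching the result locally is a sensible default for scripts that run often.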