kili-technology/awesome-datasets

A comprehensive list of annotated training datasets classified by use case.

29 / 100 (Experimental)

This resource is a curated list of high-quality, pre-annotated datasets for real-world AI applications. It helps data scientists and AI practitioners find suitable data for tasks such as speech recognition, document processing (e.g., classifying invoices, extracting information from contracts), and image analysis (e.g., medical image segmentation). Datasets are grouped by use case, so you can look up your problem area and find relevant datasets, often with previews and links to the data.

No commits in the last 6 months.

Use this if you are an AI practitioner, data scientist, or researcher looking for readily available, annotated datasets to train or evaluate machine learning models for document processing, speech recognition, or image analysis.

Not ideal if you need to create custom annotations for your own unique data, or if you are looking for general-purpose, unannotated raw data.

speech-recognition document-classification information-extraction image-segmentation natural-language-processing
No License | Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 14 / 25
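
The four category scores above add up to the overall score shown at the top. A minimal sketch of that arithmetic, assuming the overall score is a plain sum of the categories (the page does not state the formula here, and the variable names are illustrative):

```python
# Category scores as listed above, each out of 25.
sub_scores = {"Maintenance": 0, "Adoption": 7, "Maturity": 8, "Community": 14}

# Assumption: the overall score is the unweighted sum of the four categories.
total = sum(sub_scores.values())
maximum = 25 * len(sub_scores)

print(f"{total} / {maximum}")  # prints "29 / 100"
```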


Stars: 38
Forks: 6
Language: (not listed)
License: (not listed)
Last pushed: Jul 08, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kili-technology/awesome-datasets"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
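
The same request can be made from a script. The sketch below uses only the Python standard library and rests on two assumptions not confirmed by this page: that the endpoint returns a JSON body, and that the path follows the pattern `category/owner/repo` seen in the curl example. The function names are hypothetical helpers, not part of any official client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is category/owner/repo, inferred from the single
    URL shown on this page.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumption: the response body is JSON; the schema is not documented here,
    so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "kili-technology", "awesome-datasets")` issues the same request as the curl command. Keyless access is limited to 100 requests/day, so cache the result when polling repeatedly.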