ucbepic/docetl

A system for agentic LLM-powered data processing and ETL

59 / 100 (Established)

DocETL helps you build automated workflows to extract, transform, and load information from complex documents. You input raw documents, and it helps you configure AI-powered steps to produce structured data or transformed content. This is for anyone who needs to process large volumes of text documents efficiently, like researchers, data analysts, or operations managers.

3,686 stars. Actively maintained with 1 commit in the last 30 days.

Use this if you regularly work with unstructured text documents and need a reliable, customizable way to pull out specific information or transform them for further use.

Not ideal if your primary task involves processing structured data like spreadsheets or databases, or if you only occasionally process simple documents.

Tags: document-processing, data-extraction, text-analysis, workflow-automation, information-retrieval
No package. No dependents.
Maintenance 13 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25
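
The four sub-scores above add up to the headline score. The page does not state the scoring formula, so the sketch below only assumes the total is a plain sum of the four categories, each capped at 25, which is consistent with the figures shown:

```python
# Sub-scores copied from the listing above; each category is out of 25.
subscores = {
    "Maintenance": 13,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 20,
}
MAX_PER_CATEGORY = 25

# Assumption: the headline score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # 59 / 100
```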


Stars: 3,686
Forks: 385
Language: Python
License: MIT
Last pushed: Mar 12, 2026
Commits (30d): 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/ucbepic/docetl"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
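
The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that other repositories follow the same category/owner/repo path pattern as the curl example; neither is confirmed by this page, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern assumed from the single example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("data-engineering", "ucbepic", "docetl")
# print(data)
```

The keyless tier allows 100 requests/day, so caching the result locally is sensible when polling more than one repository.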