ucbepic/docetl
A system for agentic LLM-powered data processing and ETL
DocETL helps you build automated workflows that extract, transform, and load information from complex documents. You supply raw documents and configure AI-powered steps that produce structured data or transformed content. It is aimed at anyone who needs to process large volumes of text documents efficiently, such as researchers, data analysts, and operations managers.
3,686 stars. Actively maintained with 1 commit in the last 30 days.
Use this if you regularly work with unstructured text documents and need a reliable, customizable way to pull out specific information or transform them for further use.
Not ideal if your primary task involves processing structured data like spreadsheets or databases, or if you only occasionally process simple documents.
Stars: 3,686
Forks: 385
Language: Python
License: MIT
Category:
Last pushed: Mar 12, 2026
Commits (30d): 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/ucbepic/docetl"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
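The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred from the single curl URL above and not confirmed by the listing. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and how an API key would be passed is not shown because the listing does not say.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL in the listing.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository without a key.

    Assumption: the response body is JSON. The listing does not describe
    the response fields, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("data-engineering", "ucbepic", "docetl")` issues the same GET request as the curl command. Keeping URL construction in its own function makes it possible to check the path without touching the network, which helps when staying under the 100 requests/day limit.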