sorcero/ingestum

Read-only mirror of https://gitlab.com/sorcero/community/ingestum

20 / 100
Experimental

Ingestum converts content from varied sources, such as PDFs, HTML pages, images, and social media feeds, into a clean, uniform text format. It outputs standardized text documents, ready for tasks like document comparison, search, or automated tagging. It suits anyone who needs to process information from many different sources consistently.

No commits in the last 6 months.

Use this if you regularly work with content from various formats (like PDFs, HTML, or even audio files) and need to convert them into plain, searchable text for analysis or further processing.

Not ideal if you only work with already-clean text documents and don't need to extract or transform content from diverse file types.

content-extraction document-processing information-retrieval data-preparation text-normalization
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25

Stars: 7
Forks:
Language: Python
License: LGPL-3.0
Last pushed: Jan 23, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sorcero/ingestum"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
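
For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the single curl example above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the three path segments after /quality/ are
    category, owner, and repository name, as the curl example suggests.
    """
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the quality data and decode the body.

    Assumption: the body is JSON. If it is not, json.loads raises
    json.JSONDecodeError and the caller should inspect the raw body instead.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "sorcero", "ingestum")` would issue the same request as the curl command. Unauthenticated use counts against the 100 requests/day limit, so caching the result locally is worth considering.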