ofirbsh/ai_embedder_engine

AI Embedder Engine: An open-source Python engine for generating embeddings from PDFs, storing them in Parquet, and indexing with FAISS for semantic search.

Score: 29 / 100 (Experimental)

This project helps you turn large PDF documents into organized, searchable data. You provide your PDFs, and it outputs structured data files containing numerical representations (embeddings) of your document content, ready for advanced searching. This is ideal for researchers, legal professionals, technical writers, or anyone who needs to quickly find specific information within a large collection of domain-specific documents.
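To make the embed, index, and search flow described above concrete, here is a minimal standard-library sketch. It is not this project's code or API: `toy_embed`, `cosine`, and `search` are hypothetical names, word-count vectors stand in for a real embedding model, and a brute-force scan stands in for a FAISS index. PDF extraction and Parquet storage are left out entirely.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Assumption: a stand-in for a real embedding model.

    Returns a sparse word-count vector. A real model returns dense
    float vectors that capture meaning, not just shared words.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, index, k=1):
    """Brute-force nearest-neighbour scan (a FAISS index would replace this)."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Illustrative text chunks, as if already extracted from PDFs.
chunks = [
    "The tenant must give thirty days written notice before leaving.",
    "Dosage should be reduced for patients with kidney disease.",
    "Install the package and run the indexing command.",
]

# "Indexing": store each chunk next to its vector.
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

top = search("notice before leaving", index)
print(top[0])
```

Because the toy vectors only count words, this sketch behaves like keyword matching. The value of a real embedding model is that a query can match a chunk with related meaning even when the two share no words.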

No commits in the last 6 months.

Use this if you need to transform many PDF documents into a format that enables powerful semantic search or integration with AI systems like RAG (Retrieval Augmented Generation).

Not ideal if you only need to perform simple keyword searches or if your documents are not primarily text-based PDFs.

information-retrieval document-management legal-research medical-information technical-documentation
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 4 / 25
Maturity 15 / 25
Community 8 / 25

Stars: 8
Forks: 1
Language: Python
License: (none listed)
Last pushed: Aug 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ofirbsh/ai_embedder_engine"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
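The same request can be made from Python with the standard library. This is a sketch under two assumptions not confirmed by this page: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern, which is inferred from the single URL above. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL.

    Assumption: the category/owner/repo layout is inferred from one example.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON; the page does not document it.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "ofirbsh", "ai_embedder_engine")` sends the same GET request as the curl command. Since the response fields are not documented here, inspect the returned object before relying on any particular key.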