a-agmon/dfembeder

DF Embedder is a high-performance Python library (with a Rust backend) for indexing and embedding Apache Arrow-compatible DataFrames (such as Polars or Pandas) into low-latency vector databases based on Lance files.

22 / 100 (Experimental)

This tool helps data professionals quickly prepare large datasets of text for semantic search. You input a spreadsheet or table containing text, and it generates an optimized database ready for finding similar records based on meaning, not just keywords. It's ideal for data scientists, analysts, or anyone managing extensive text-based information.
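To make "finding similar records based on meaning" concrete, here is a minimal, self-contained sketch of the general idea: flatten each row to text, embed it with a static (lookup-table) model, and rank rows by cosine similarity to a query. It does not use DF Embedder's API. The toy word vectors and the names `embed_text`, `row_to_text`, `build_index`, and `find_similar` are hypothetical, invented for illustration; a real static embedding model would supply learned vectors for a large vocabulary, and the library stores vectors in Lance files rather than a Python list.

```python
import math

# Toy "static embedding" table: each known word maps to a fixed vector.
# These numbers are made up for illustration, not taken from any real model.
TOY_VECTORS = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.0],
    "computer": [0.85, 0.15, 0.0],
    "banana": [0.0, 0.1, 0.9],
    "fruit": [0.0, 0.2, 0.8],
}


def embed_text(text):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def row_to_text(row):
    """Flatten one table row (a dict) into a single string (an assumption)."""
    return " ".join(str(value) for value in row.values())


def build_index(rows):
    """Pair every row with its embedding; an in-memory stand-in for a vector DB."""
    return [(row, embed_text(row_to_text(row))) for row in rows]


def find_similar(index, query, k=1):
    """Return the k rows whose embeddings are closest to the query embedding."""
    q = embed_text(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [row for row, _ in ranked[:k]]


rows = [
    {"id": 1, "description": "notebook computer"},
    {"id": 2, "description": "banana fruit"},
]
index = build_index(rows)
top = find_similar(index, "laptop", k=1)
```

Note that the query "laptop" shares no keyword with row 1, yet row 1 ranks first because its words have nearby vectors. That is the difference between semantic and keyword matching that the description refers to.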

No commits in the last 6 months.

Use this if you need to rapidly turn massive tables of text data into a searchable format for semantic similarity, especially when dealing with millions of records and requiring high performance.

Not ideal if your data is not primarily textual or if you require highly custom, state-of-the-art embedding models beyond the efficient static one provided.

data-preparation semantic-search text-analysis large-scale-data information-retrieval
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25

Stars: 8
Forks:
Language: Rust
License: MIT
Last pushed: Aug 13, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/a-agmon/dfembeder"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
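The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL path follows a category/owner/repo pattern, as the curl example suggests. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the category/owner/repo layout is inferred."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access; call fetch_quality() to send it.
url = quality_url("embeddings", "a-agmon", "dfembeder")
```

Calling `fetch_quality("embeddings", "a-agmon", "dfembeder")` sends the request. With no key, the limit stated above is 100 requests per day, so caching the result locally is a sensible default.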