fireindark707/Python-Schema-Matching
A Python tool that uses XGBoost and sentence-transformers to perform schema matching on tables.
This tool helps data professionals quickly find connections between columns in different datasets, even when column names are confusing or use different languages. You provide two tables (CSV, JSON, or JSONL files), and it outputs a matrix showing which columns in one table likely correspond to columns in the other. This is for anyone who frequently needs to integrate or compare information from various tabular data sources, like data analysts, data scientists, or business intelligence specialists.
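To make the idea of a column-matching matrix concrete, here is a minimal sketch of its shape. It is not the tool's method: it uses plain string similarity from Python's standard-library `difflib` as a stand-in for the XGBoost and sentence-transformers pipeline, and it compares column names only. The column names, function names, and the 0.5 threshold are all hypothetical, chosen for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity of two column names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_matrix(cols_a, cols_b):
    """Build a score matrix: one row per column of table A, one entry per column of table B."""
    return [[name_similarity(a, b) for b in cols_b] for a in cols_a]


def best_matches(cols_a, cols_b, threshold=0.5):
    """For each column of table A, keep its highest-scoring column of table B
    if the score reaches the (hypothetical) threshold."""
    matrix = match_matrix(cols_a, cols_b)
    pairs = []
    for a, row in zip(cols_a, matrix):
        j = max(range(len(row)), key=row.__getitem__)  # index of the best score
        if row[j] >= threshold:
            pairs.append((a, cols_b[j], row[j]))
    return pairs


# Hypothetical column names standing in for two input tables.
cols_a = ["customer_id", "full_name", "email"]
cols_b = ["CustomerID", "name", "email_address", "phone"]

matrix = match_matrix(cols_a, cols_b)
pairs = best_matches(cols_a, cols_b)

for a, b, score in pairs:
    print(f"{a} <-> {b} (score {score:.2f})")
```

A name-only comparison like this fails in exactly the cases the tool targets, such as columns named in different languages, which is why a learned model over column contents and embeddings is used instead.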
Available on PyPI.
Use this if you need to automatically identify matching columns across multiple large datasets with varying naming conventions or structures.
Not ideal if you only deal with very small, manually manageable datasets or if you require very precise control over every single column mapping through a graphical interface.
Stars: 40
Forks: 13
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 08, 2026
Commits (30d): 0
Dependencies: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/fireindark707/Python-Schema-Matching"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
Cloud-CV/EvalAI
☁️ 🚀 📊 📈 Evaluating state of the art in AI
graphbookai/graphbook
Visual AI development framework for training and inference of ML models, scaling pipelines, and...
visual-layer/fastdup
fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and...
github/CodeSearchNet
Datasets, tools, and benchmarks for representation learning of code.
tthtlc/awesome-source-analysis
Source code understanding via Machine Learning techniques