SAP-samples/cross-language-detection-artifacts
This repository complements our paper by providing the training dataset, the best-performing models used in our real-world experiment, the list of identified malicious packages, and the scripts needed to replicate and verify our results.
This project helps security analysts and software supply chain experts identify malicious packages in the npm and PyPI registries. It provides pre-trained models and a dataset for classifying packages as benign or malicious based on code features and metadata. The output is a per-package classification, plus insight into which characteristics make a package suspicious.
No commits in the last 6 months.
Use this if you need to detect potentially harmful JavaScript (npm) or Python (PyPI) packages by analyzing their source code and metadata, especially to understand the types of features that indicate malicious intent.
Not ideal if you need a real-time, production-ready malicious package detection system or if your focus is on languages other than JavaScript and Python.
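To make "code features" more concrete, here is a minimal sketch of the kind of language-agnostic signals such a classifier might consume. It is an illustration only: the token list, feature names, and functions below are hypothetical and are not taken from the repository or its paper, whose actual feature set may differ.

```python
import math
import re
from collections import Counter

# Hypothetical token list for illustration; NOT the repository's feature set.
SUSPICIOUS_TOKENS = ("eval", "exec", "base64", "child_process", "subprocess")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(source: str) -> dict:
    """Compute a few toy features from JavaScript or Python source text."""
    # Single-line string literals in either quote style (a rough approximation).
    literals = [m[1] for m in re.findall(r"""(["'])(.*?)\1""", source)]
    return {
        # Raw substring counts, so "base64" in an import and in a call both count.
        "suspicious_token_count": sum(source.count(t) for t in SUSPICIOUS_TOKENS),
        # High entropy in a literal can hint at encoded or obfuscated payloads.
        "max_string_entropy": max((shannon_entropy(s) for s in literals), default=0.0),
        "longest_string_length": max((len(s) for s in literals), default=0),
    }
```

Features like these would be computed per package and passed to a trained model; on their own they are weak signals and not a verdict.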
Stars: 21
Forks: —
Language: Python
License: Apache-2.0
Category: (not listed)
Last pushed: Mar 07, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SAP-samples/cross-language-detection-artifacts"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
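The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern is inferred from the single curl example above, the response is assumed to be JSON (its fields are not documented here), and `build_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path taken from the curl example; the {category}/{owner}/{repo} layout
# is inferred from that one URL and is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "SAP-samples/cross-language-detection-artifacts")
# print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes it easy to check the address before spending any of the daily request quota.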
Higher-rated alternatives
apache/texera
Collaborative Machine-Learning-Centric Data Analytics Using Workflows
UBC-NLP/afrolid
AfroLID, a powerful neural toolkit for African languages identification which covers 517 African...
asyml/texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and...
stevezheng23/xlnet_extension_tf
XLNet Extension in TensorFlow
jayavardhanr/End-to-end-Sequence-Labeling-via-Bi-directional-LSTM-CNNs-CRF-Tutorial
Tutorial for End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF