SAP-samples/cross-language-detection-artifacts

This repository accompanies our paper. It provides the training dataset, the best-performing models used in our real-world experiment, the list of malicious packages we identified, and the scripts needed to reproduce and verify our results.


Experimental

This project helps security analysts and software supply chain experts identify malicious packages in the npm and PyPI registries. It provides pre-trained models and a dataset for classifying packages as benign or malicious based on code features and metadata. The output is a per-package classification, along with insight into the characteristics that make a package suspicious.

No commits in the last 6 months.

Use this if you need to detect potentially harmful JavaScript (npm) or Python (PyPI) packages by analyzing their source code and metadata, especially to understand the types of features that indicate malicious intent.
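To make "code features" concrete, here is a minimal sketch of the kind of language-agnostic features that could be computed from a package's source text. It is an illustration only, not the repository's actual feature set or API: the `SUSPICIOUS_TOKENS` list, the function names, and the four features are all assumptions chosen for this example.

```python
import math
import re
from collections import Counter

# Hypothetical token list chosen for illustration; not taken from the repository.
SUSPICIOUS_TOKENS = ("eval", "exec", "base64", "child_process", "subprocess")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character (0.0 if empty)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(source: str) -> dict:
    """Compute a few simple features that apply to both JavaScript and Python source."""
    # Rough approximation of string literals: quoted runs without escapes.
    double_quoted = re.findall(r'"([^"\\]*)"', source)
    single_quoted = re.findall(r"'([^'\\]*)'", source)
    literals = double_quoted + single_quoted

    # Identifier-like words, used to count occurrences of the suspicious tokens.
    word_counts = Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source))

    return {
        "n_lines": source.count("\n") + 1,
        "n_suspicious_tokens": sum(word_counts[t] for t in SUSPICIOUS_TOKENS),
        # High-entropy strings can hint at encoded or obfuscated payloads.
        "max_string_entropy": max((shannon_entropy(s) for s in literals), default=0.0),
        "n_urls": len(re.findall(r"https?://", source)),
    }


if __name__ == "__main__":
    sample = 'import base64\nexec(base64.b64decode("aGk="))\n'
    print(extract_features(sample))
```

A feature dictionary like this could be turned into a numeric vector and passed to any standard classifier. Keeping the features independent of language-specific syntax is what would let one model cover both npm and PyPI packages.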

Not ideal if you need a real-time, production-ready malicious package detection system or if your focus is on languages other than JavaScript and Python.

software-supply-chain-security malware-detection package-security code-auditing cybersecurity-research

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 0 / 25



Language: Python

License: Apache-2.0

Higher-rated alternatives

apache/texera

Collaborative Machine-Learning-Centric Data Analytics Using Workflows

UBC-NLP/afrolid

AfroLID, a powerful neural toolkit for African languages identification which covers 517 African...

asyml/texar-pytorch

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and...

stevezheng23/xlnet_extension_tf

XLNet Extension in TensorFlow

jayavardhanr/End-to-end-Sequence-Labeling-via-Bi-directional-LSTM-CNNs-CRF-Tutorial

Tutorial for End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
