google/magika

Fast and accurate AI-powered file content type detection

Score: 73 / 100 (Verified)

Magika quickly and accurately identifies the true content type of various files, from documents and code to binary data, even when file extensions are missing or incorrect. You feed it files, and it tells you exactly what kind of content is inside, like 'Microsoft Word document' or 'Python source code'. This is ideal for security analysts, data managers, or anyone needing to categorize large collections of files for safety or organization.

10,151 stars. Used by 4 other packages. Actively maintained with 11 commits in the last 30 days. Available on PyPI.

Use this if you need to precisely identify the content type of many files to ensure proper handling, routing, or security scanning.
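
As a rough illustration of the routing use case, the sketch below maps a detected content-type label to a handling queue. It assumes the `magika` package from PyPI and its `Magika().identify_path()` call. The result attribute appears to have been named differently across releases (`output.label` in newer versions, `output.ct_label` in older ones), so the helper checks both. The `ROUTES` table and the queue names are hypothetical.

```python
from pathlib import Path

# Hypothetical routing table: content-type label -> handling queue.
ROUTES = {
    "python": "code-review",
    "javascript": "code-review",
    "pdf": "document-scan",
    "docx": "document-scan",
    "pebin": "sandbox",
    "elf": "sandbox",
}


def route_label(label: str) -> str:
    """Pick a queue for a label; unknown types fall back to manual triage."""
    return ROUTES.get(label.lower(), "manual-triage")


def detect_label(path: Path) -> str:
    """Ask Magika for the content-type label of a file.

    Assumes `magika` is installed. The import is local so the routing
    logic above stays usable without the package.
    """
    from magika import Magika

    # For many files, create one Magika() instance and reuse it instead.
    output = Magika().identify_path(path).output
    # Assumption: the installed version exposes either `label` or `ct_label`.
    label = getattr(output, "label", None) or getattr(output, "ct_label", None)
    return str(label)
```

With the package installed, `route_label(detect_label(Path("sample.bin")))` would return the queue for a given file; `sample.bin` is a placeholder name.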

Not ideal if you only need to identify basic file types by extension and don't require deep content inspection or high accuracy for diverse and potentially malicious files.
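
The gap between extension-based and content-based identification can be seen with the Python standard library alone. The sketch below is not how Magika works internally; it only contrasts `mimetypes.guess_type`, which looks at the file name, with a hypothetical `sniff_magic` helper that checks a few well-known leading byte signatures.

```python
import mimetypes
from typing import Optional

# A few well-known file signatures (magic bytes). Illustrative, not exhaustive.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
]


def guess_by_extension(filename: str) -> Optional[str]:
    """Guess a MIME type from the file name only; the content is never read."""
    return mimetypes.guess_type(filename)[0]


def sniff_magic(data: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes; None if no signature matches."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


# A PDF payload saved under a misleading name.
payload = b"%PDF-1.7\n(rest of file omitted)"
by_name = guess_by_extension("holiday_photo.png")
by_content = sniff_magic(payload)
```

Here `by_name` reflects the `.png` extension while `by_content` reflects the PDF header, which is the kind of mismatch that extension-only checks cannot catch. Fixed signatures like these cover only formats with a distinctive header, so they do not help with plain-text formats such as source code.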

file-type-identification malware-analysis data-classification document-processing digital-forensics
Maintenance 17 / 25
Adoption 14 / 25
Maturity 25 / 25
Community 17 / 25


Stars: 10,151
Forks: 495
Language: Python
License: Apache-2.0
Last pushed: Mar 03, 2026
Commits (30d): 11
Dependencies: 2
Reverse dependents: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/google/magika"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
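
The same request can be made from Python with the standard library. This is a minimal sketch: it assumes the endpoint returns JSON (the page does not state the response format) and that other projects follow the same `category/owner/repo` path pattern as the URL above. The helper names `quality_url` and `fetch_quality` are made up for this example, and no API-key handling is shown because the key mechanism is not documented here.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path pattern is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without a key (assumes a JSON response body)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("ml-frameworks", "google", "magika")
# print(json.dumps(data, indent=2))
```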