Metadata-Version: 2.4
Name: modelaudit
Version: 0.2.26
Summary: Static scanning library for detecting malicious code, backdoors, and other security risks in ML model files
Project-URL: Repository, https://github.com/promptfoo/modelaudit
Project-URL: Homepage, https://github.com/promptfoo/modelaudit
Project-URL: Bug Tracker, https://github.com/promptfoo/modelaudit/issues
Project-URL: Changelog, https://github.com/promptfoo/modelaudit/blob/main/CHANGELOG.md
Author-email: Ian Webster <ian@promptfoo.dev>, Michael D'Angelo <michael@promptfoo.dev>, Yash Chhabria <yash@promptfoo.dev>
License: MIT
License-File: LICENSE
Keywords: ai,ml,model-scanning,pickle,pytorch,security,tensorflow
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security
Requires-Python: >=3.10
Requires-Dist: click>=8.1.7
Requires-Dist: cyclonedx-python-lib>=11.0.0
Requires-Dist: defusedxml>=0.7.1
Requires-Dist: fsspec>=2025.5.1
Requires-Dist: gcsfs>=2025.5.1
Requires-Dist: huggingface-hub>=0.23.0
Requires-Dist: numpy<2.0,>=1.19.0; python_version == '3.10'
Requires-Dist: numpy<2.5,>=2.4; python_version >= '3.11'
Requires-Dist: platformdirs>=3.0.0
Requires-Dist: posthog>=7.0.0
Requires-Dist: protobuf>=5.29.0
Requires-Dist: pydantic<3.0,>=2.11.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pyyaml<7.0,>=6.0
Requires-Dist: requests>=2.28.0
Requires-Dist: s3fs>=2025.5.1
Requires-Dist: scipy>=1.7.0
Requires-Dist: yaspin>=2.5.0
Provides-Extra: all
Requires-Dist: dill<1.0,>=0.3.0; extra == 'all'
Requires-Dist: h5py<4.0,>=3.1; extra == 'all'
Requires-Dist: huggingface-hub>=0.23.0; extra == 'all'
Requires-Dist: joblib<2.0,>=1.0.0; extra == 'all'
Requires-Dist: mlflow>=2.12.0; extra == 'all'
Requires-Dist: msgpack<2.0,>=1.0.0; extra == 'all'
Requires-Dist: onnx<2.0,>=1.12.0; (python_version < '3.13') and extra == 'all'
Requires-Dist: py-ubjson>=0.16.0; extra == 'all'
Requires-Dist: pyyaml<7.0,>=6.0; extra == 'all'
Requires-Dist: safetensors>=0.4.0; extra == 'all'
Requires-Dist: scikit-learn<2.0,>=1.0.0; extra == 'all'
Requires-Dist: tflite>=2.18.0; extra == 'all'
Requires-Dist: torch<3.0,>=2.6.0; extra == 'all'
Requires-Dist: xgboost<3.3,>=3.2; extra == 'all'
Provides-Extra: all-ci
Requires-Dist: dill<1.0,>=0.3.0; extra == 'all-ci'
Requires-Dist: h5py<4.0,>=3.1; extra == 'all-ci'
Requires-Dist: huggingface-hub>=0.23.0; extra == 'all-ci'
Requires-Dist: joblib<2.0,>=1.0.0; extra == 'all-ci'
Requires-Dist: mlflow>=2.12.0; extra == 'all-ci'
Requires-Dist: msgpack<2.0,>=1.0.0; extra == 'all-ci'
Requires-Dist: onnx<2.0,>=1.12.0; (python_version < '3.13') and extra == 'all-ci'
Requires-Dist: py-ubjson>=0.16.0; extra == 'all-ci'
Requires-Dist: pyyaml<7.0,>=6.0; extra == 'all-ci'
Requires-Dist: safetensors>=0.4.0; extra == 'all-ci'
Requires-Dist: scikit-learn<2.0,>=1.0.0; extra == 'all-ci'
Requires-Dist: tflite>=2.18.0; extra == 'all-ci'
Requires-Dist: torch<3.0,>=2.6.0; extra == 'all-ci'
Requires-Dist: xgboost<3.3,>=3.2; extra == 'all-ci'
Provides-Extra: all-ci-windows
Requires-Dist: dill<1.0,>=0.3.0; extra == 'all-ci-windows'
Requires-Dist: joblib<2.0,>=1.0.0; extra == 'all-ci-windows'
Requires-Dist: msgpack<2.0,>=1.0.0; extra == 'all-ci-windows'
Requires-Dist: safetensors>=0.4.0; extra == 'all-ci-windows'
Provides-Extra: dill
Requires-Dist: dill<1.0,>=0.3.0; extra == 'dill'
Provides-Extra: flax
Requires-Dist: msgpack<2.0,>=1.0.0; extra == 'flax'
Provides-Extra: h5
Requires-Dist: h5py<4.0,>=3.1; extra == 'h5'
Provides-Extra: joblib
Requires-Dist: joblib<2.0,>=1.0.0; extra == 'joblib'
Requires-Dist: scikit-learn<2.0,>=1.0.0; extra == 'joblib'
Provides-Extra: mlflow
Requires-Dist: mlflow>=2.12.0; extra == 'mlflow'
Provides-Extra: numpy1
Requires-Dist: dill<1.0,>=0.3.0; extra == 'numpy1'
Requires-Dist: h5py<4.0,>=3.1; extra == 'numpy1'
Requires-Dist: huggingface-hub>=0.23.0; extra == 'numpy1'
Requires-Dist: joblib<2.0,>=1.0.0; extra == 'numpy1'
Requires-Dist: msgpack<2.0,>=1.0.0; extra == 'numpy1'
Requires-Dist: numpy<2.0,>=1.19.0; (python_version == '3.10') and extra == 'numpy1'
Requires-Dist: numpy<2.5,>=2.4; (python_version >= '3.11') and extra == 'numpy1'
Requires-Dist: onnx<2.0,>=1.12.0; (python_version < '3.13') and extra == 'numpy1'
Requires-Dist: pyyaml<7.0,>=6.0; extra == 'numpy1'
Requires-Dist: safetensors>=0.4.0; extra == 'numpy1'
Requires-Dist: scikit-learn<2.0,>=1.0.0; extra == 'numpy1'
Requires-Dist: tflite>=2.18.0; extra == 'numpy1'
Requires-Dist: torch<3.0,>=2.6.0; extra == 'numpy1'
Provides-Extra: onnx
Requires-Dist: onnx<2.0,>=1.12.0; (python_version < '3.13') and extra == 'onnx'
Provides-Extra: pytorch
Requires-Dist: torch<3.0,>=2.6.0; extra == 'pytorch'
Provides-Extra: safetensors
Requires-Dist: safetensors>=0.4.0; extra == 'safetensors'
Provides-Extra: sevenzip
Requires-Dist: py7zr>=0.20.0; extra == 'sevenzip'
Provides-Extra: tensorflow
Requires-Dist: tensorflow<3.0,>=2.17.0; (python_version >= '3.11' and python_version < '3.13') and extra == 'tensorflow'
Provides-Extra: tensorrt
Requires-Dist: tensorrt>=8.6.0; (sys_platform == 'linux' or sys_platform == 'win32') and extra == 'tensorrt'
Provides-Extra: tflite
Requires-Dist: tflite>=2.18.0; extra == 'tflite'
Provides-Extra: xgboost
Requires-Dist: py-ubjson>=0.16.0; extra == 'xgboost'
Requires-Dist: xgboost<3.3,>=3.2; extra == 'xgboost'
Description-Content-Type: text/markdown

# ModelAudit

**Secure your AI models before deployment.** Static scanner that detects malicious code, backdoors, and security vulnerabilities in ML model files — without ever loading or executing them.

[![PyPI version](https://badge.fury.io/py/modelaudit.svg)](https://pypi.org/project/modelaudit/)
[![Python versions](https://img.shields.io/pypi/pyversions/modelaudit.svg)](https://pypi.org/project/modelaudit/)
[![Code Style: ruff](https://img.shields.io/badge/code%20style-ruff-005cd7.svg)](https://github.com/astral-sh/ruff)
[![License](https://img.shields.io/github/license/promptfoo/modelaudit)](https://github.com/promptfoo/modelaudit/blob/main/LICENSE)

<img width="989" alt="ModelAudit scan results" src="https://www.promptfoo.dev/img/docs/modelaudit/modelaudit-result.png" />

**[Full Documentation](https://www.promptfoo.dev/docs/model-audit/)** | **[Usage Examples](https://www.promptfoo.dev/docs/model-audit/usage/)** | **[Supported Formats](https://www.promptfoo.dev/docs/model-audit/scanners/)**

## Quick Start

**Requires Python 3.10+**

```bash
pip install "modelaudit[all]"

# Scan a file or directory
modelaudit model.pkl
modelaudit ./models/

# Export results for CI/CD
modelaudit model.pkl --format json --output results.json
```

```
$ modelaudit suspicious_model.pkl

Files scanned: 1 | Issues found: 1 critical, 1 warning

1. suspicious_model.pkl (pos 28): [CRITICAL] Malicious code execution attempt
   Why: Contains os.system() call that could run arbitrary commands

2. suspicious_model.pkl (pos 52): [WARNING] Dangerous pickle deserialization
   Why: Could execute code when the model loads
```
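
To illustrate how a finding like the `os.system()` one above can be produced without loading the model, here is a minimal Python sketch that walks a pickle's opcode stream with the standard-library `pickletools` module. It is not ModelAudit's implementation: the `SUSPICIOUS` set, the `SAMPLE` bytes, and the `find_suspicious_globals` helper are hypothetical names chosen for this example, and a real scanner has to cover far more cases.

```python
import pickletools

# Assumption for this sketch: a tiny deny-list of (module, name) pairs.
SUSPICIOUS = {
    ("os", "system"),
    ("posix", "system"),
    ("nt", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"),
    ("builtins", "exec"),
}

# Hand-written protocol-0 pickle that would call os.system('echo hi') if it
# were ever unpickled. It is only decoded here, never loaded.
SAMPLE = b"cos\nsystem\n(S'echo hi'\ntR."


def find_suspicious_globals(data: bytes) -> list[tuple[int, str]]:
    """Return (byte offset, dotted name) for each deny-listed import."""
    findings = []
    strings = []  # string pushes seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Simplification: assume the two most recent string pushes are
            # the module and attribute (memoized strings are not resolved).
            module, name = strings[-2], strings[-1]
        else:
            if isinstance(arg, str):
                strings.append(arg)
            continue
        if (module, name) in SUSPICIOUS:
            findings.append((pos, f"{module}.{name}"))
    return findings
```

Because `pickletools.genops` only decodes opcodes, the payload is never unpickled, so nothing inside it runs during the check.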

## What It Detects

- **Code execution attacks** in Pickle, PyTorch, NumPy, and Joblib files
- **Model backdoors** with hidden functionality or suspicious weight patterns
- **Embedded secrets** — API keys, tokens, and credentials in model weights or metadata
- **Network indicators** — URLs, IPs, and socket usage that could enable data exfiltration
- **Archive exploits** — path traversal, symlink attacks in ZIP/TAR/7z files
- **Unsafe ML operations** — Lambda layers, custom ops, TorchScript/JIT, template injection
- **Supply chain risks** — tampering, license violations, suspicious configurations
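
As a rough illustration of the archive checks listed above, the following Python sketch flags ZIP members whose names would escape the extraction directory. It reads only the archive's directory and never extracts anything. The `unsafe_zip_members` helper is a hypothetical name for this example, not part of ModelAudit's API, and it deliberately ignores symlinks and other archive formats.

```python
import io
import posixpath
import zipfile


def unsafe_zip_members(data: bytes) -> list[str]:
    """Return member names that are absolute or climb out via '..'."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Treat backslashes as separators so Windows-style names are caught.
            name = info.filename.replace("\\", "/")
            normalized = posixpath.normpath(name)
            escapes = normalized == ".." or normalized.startswith("../")
            if name.startswith("/") or escapes:
                flagged.append(info.filename)
    return flagged
```

Normalizing with `posixpath.normpath` means a name such as `a/../../b` is also caught, since it resolves to `../b`.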

## Supported Formats

ModelAudit includes **30 specialized scanners** covering model, archive, and configuration formats:

| Format           | Extensions                            | Risk   |
| ---------------- | ------------------------------------- | ------ |
| **Pickle**       | `.pkl`, `.pickle`, `.dill`            | HIGH   |
| **PyTorch**      | `.pt`, `.pth`, `.ckpt`, `.bin`        | HIGH   |
| **Joblib**       | `.joblib`                             | HIGH   |
| **NumPy**        | `.npy`, `.npz`                        | HIGH   |
| **Skops**        | `.skops`                              | HIGH   |
| **TensorFlow**   | `.pb`, SavedModel dirs                | MEDIUM |
| **Keras**        | `.h5`, `.hdf5`, `.keras`              | MEDIUM |
| **ONNX**         | `.onnx`                               | MEDIUM |
| **XGBoost**      | `.bst`, `.model`, `.ubj`              | MEDIUM |
| **SafeTensors**  | `.safetensors`                        | LOW    |
| **GGUF/GGML**    | `.gguf`, `.ggml`                      | LOW    |
| **JAX/Flax**     | `.msgpack`, `.flax`, `.orbax`, `.jax` | LOW    |
| **TFLite**       | `.tflite`                             | LOW    |
| **ExecuTorch**   | `.ptl`, `.pte`                        | LOW    |
| **TensorRT**     | `.engine`, `.plan`                    | LOW    |
| **PaddlePaddle** | `.pdmodel`, `.pdiparams`              | LOW    |
| **OpenVINO**     | `.xml`                                | LOW    |
| **PMML**         | `.pmml`                               | LOW    |

Plus scanners for ZIP, TAR, 7-Zip, OCI layers, Jinja2 templates, JSON/YAML metadata, manifests, and text files.

[View complete format documentation](https://www.promptfoo.dev/docs/model-audit/scanners/)

## Remote Sources

Scan models directly from remote registries and cloud storage:

```bash
# Hugging Face
modelaudit https://huggingface.co/gpt2
modelaudit hf://microsoft/DialoGPT-medium

# Cloud storage
modelaudit s3://bucket/model.pt
modelaudit gs://bucket/models/

# MLflow registry
modelaudit models:/MyModel/Production

# JFrog Artifactory (files and folders)
# Auth: export JFROG_API_TOKEN=...
modelaudit https://company.jfrog.io/artifactory/repo/model.pt
modelaudit https://company.jfrog.io/artifactory/repo/models/

# DVC-tracked models
modelaudit model.dvc
```

### Authentication Environment Variables

- `HF_TOKEN` for private Hugging Face repositories
- `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` (and optional `AWS_SESSION_TOKEN`) for S3
- `GOOGLE_APPLICATION_CREDENTIALS` for GCS
- `MLFLOW_TRACKING_URI` for MLflow registry access
- `JFROG_API_TOKEN` or `JFROG_ACCESS_TOKEN` for JFrog Artifactory

Store credentials in environment variables or a secrets manager, and never commit tokens or keys.

## Installation

```bash
# Everything (recommended)
pip install "modelaudit[all]"

# Core only (pickle, numpy, archives)
pip install modelaudit

# Specific frameworks
pip install "modelaudit[tensorflow,pytorch,h5,onnx,safetensors]"

# CI/CD environments
pip install "modelaudit[all-ci]"

# Docker
docker run --rm -v "$(pwd)":/app ghcr.io/promptfoo/modelaudit:latest model.pkl
```

## CLI Options

```
--format {text,json,sarif}   Output format (default: text)
--output FILE                Write results to file
--strict                     Fail on warnings, scan all file types
--sbom FILE                  Generate CycloneDX SBOM
--stream                     Download, scan, and delete files one-by-one (saves disk)
--max-size SIZE              Size limit (e.g., 10GB)
--timeout SECONDS            Override scan timeout
--dry-run                    Preview what would be scanned
--verbose / --quiet          Control output detail
--blacklist PATTERN          Additional patterns to flag
--no-cache                   Disable result caching
--cache-dir DIR              Set cache directory for downloads and scan results
--progress                   Force progress display
```

## Exit Codes

- `0`: No security issues detected
- `1`: Security issues detected
- `2`: Scan errors
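
In a CI job, these exit codes can drive the pipeline result. The Python sketch below wraps the CLI with `subprocess` and maps the return code to a label. It assumes `modelaudit` is installed and on `PATH`; `EXIT_MEANINGS`, `classify_exit_code`, and `scan` are hypothetical names for this example, and only flags documented in this README are used.

```python
import subprocess

# Labels are this sketch's own wording for the three documented exit codes.
EXIT_MEANINGS = {
    0: "clean",
    1: "issues-found",
    2: "scan-error",
}


def classify_exit_code(code: int) -> str:
    """Map a modelaudit exit code to a label; anything else is 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def scan(path: str, report: str = "results.json") -> str:
    """Run the CLI on one path and return the label for its exit code."""
    result = subprocess.run(
        ["modelaudit", path, "--format", "json", "--output", report],
        check=False,  # non-zero codes are expected results, not exceptions
    )
    return classify_exit_code(result.returncode)
```

A pipeline might fail the build on `"issues-found"` and retry or alert on `"scan-error"`, since the two conditions usually call for different responses.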

## Telemetry and Privacy

ModelAudit includes telemetry for product reliability and usage analytics.

- Collected metadata can include command usage, scan timing, scanner/file-type usage, issue severity/type aggregates, and model path or URL identifiers.
- Model files are scanned locally; ModelAudit does not upload model binary contents in telemetry events.
- Telemetry is disabled automatically in CI/test environments and in editable development installs by default.

Opt out explicitly with either environment variable:

```bash
export PROMPTFOO_DISABLE_TELEMETRY=1
# or
export NO_ANALYTICS=1
```

To opt in during editable/development installs:

```bash
export MODELAUDIT_TELEMETRY_DEV=1
```

## Output Examples

```bash
# JSON for CI/CD pipelines
modelaudit model.pkl --format json --output results.json

# SARIF for code scanning platforms
modelaudit model.pkl --format sarif --output results.sarif
```

## Troubleshooting

- Run `modelaudit doctor --show-failed` to list unavailable scanners and missing optional dependencies.
- If `pip` installs an older release, verify Python is `3.10+` (`python --version`).
- For additional troubleshooting and cloud auth guidance, see:
  - https://www.promptfoo.dev/docs/model-audit/
  - https://www.promptfoo.dev/docs/model-audit/usage/

## Documentation

- **[Full docs](https://www.promptfoo.dev/docs/model-audit/)** — setup, configuration, and advanced usage
- **[Usage examples](https://www.promptfoo.dev/docs/model-audit/usage/)** — CI/CD integration, remote scanning, SBOM generation
- **[Supported formats](https://www.promptfoo.dev/docs/model-audit/scanners/)** — detailed scanner documentation
- **[Support policy](SUPPORT.md)** — supported Python/OS versions and maintenance policy
- **[Security model and limitations](docs/user/security-model.md)** — what ModelAudit does and does not guarantee
- **[Compatibility matrix](docs/user/compatibility-matrix.md)** — file formats vs optional dependencies
- **[Offline/air-gapped guide](docs/user/offline-air-gapped.md)** — secure operation without internet access
- **[Scanner contributor quickstart](docs/agents/new-scanner-quickstart.md)** — safe workflow for new scanner development
- **Troubleshooting** — run `modelaudit doctor --show-failed` to check scanner availability

## License

MIT License — see [LICENSE](LICENSE) for details.
