Metadata-Version: 2.4
Name: pyprocessors-pseudonimizer
Version: 1.6.27
Summary: Processor based on Presidio anonymizer
Project-URL: Homepage, https://kairntech.com/
Author-email: Olivier Terrier <olivier.terrier@kairntech.com>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Requires-Dist: faker
Requires-Dist: log-with-context
Requires-Dist: presidio-anonymizer>=2.2.29
Requires-Dist: pymultirole-plugins<0.7.0,>=0.6.0
Provides-Extra: dev
Requires-Dist: pre-commit; extra == 'dev'
Provides-Extra: docs
Requires-Dist: lxml-html-clean; extra == 'docs'
Requires-Dist: m2r2; extra == 'docs'
Requires-Dist: sphinx; extra == 'docs'
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Requires-Dist: sphinxcontrib-apidoc; extra == 'docs'
Provides-Extra: sbom
Requires-Dist: cyclonedx-bom; extra == 'sbom'
Requires-Dist: pip-audit; extra == 'sbom'
Provides-Extra: test
Requires-Dist: dirty-equals; extra == 'test'
Requires-Dist: pytest; extra == 'test'
Requires-Dist: pytest-cov; extra == 'test'
Requires-Dist: ruff; extra == 'test'
Description-Content-Type: text/markdown

# pyprocessors-pseudonimizer

Pseudonymization processor based on [Microsoft Presidio](https://microsoft.github.io/presidio/anonymizer/).

## Requirements

- Python 3.12+
- [uv](https://github.com/astral-sh/uv) for the development, test, and publishing commands below (not needed for a plain `pip install`)

## Installation

```bash
pip install pyprocessors-pseudonimizer
```

Or with uv:

```bash
uv add pyprocessors-pseudonimizer
```

## Development setup

```bash
uv sync --extra test
```

## Running tests

```bash
uv run pytest
```

## Linting

```bash
uv run ruff check .
uv run ruff format --check .
```

## Publishing

```bash
uv build
uv publish
```

## Operators

The processor supports the following anonymization operators:

| Operator   | Description                                              |
|------------|----------------------------------------------------------|
| `mask`     | Replaces the entity's characters with a masking character |
| `replace`  | Replaces the entity with a fixed value                   |
| `redact`   | Removes the entity completely from the text              |
| `label`    | Replaces the entity with its label name (e.g. `<person>`) |
| `identity` | Leaves the entity unchanged                              |
| `faker`    | Replaces the entity with a fake value from [Faker](https://faker.readthedocs.io/) |
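
To make the table concrete, here is a minimal standard-library sketch of what each operator does to a single detected span. It is an illustration only, not the processor's implementation: `Span`, `apply_operator`, and the `new_value` and `masking_char` parameters are hypothetical names, the `mask` branch assumes the whole entity is masked, and `faker` is left out because it needs the third-party Faker library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected entity (hypothetical type): character offsets plus a label."""
    start: int
    end: int
    label: str


def apply_operator(text: str, span: Span, operator: str, *,
                   new_value: str = "***", masking_char: str = "*") -> str:
    """Return `text` with the span rewritten according to `operator`."""
    original = text[span.start:span.end]
    if operator == "mask":
        # Assumption: every character of the entity is masked.
        substitute = masking_char * len(original)
    elif operator == "replace":
        substitute = new_value
    elif operator == "redact":
        substitute = ""
    elif operator == "label":
        substitute = f"<{span.label}>"
    elif operator == "identity":
        substitute = original
    else:
        raise ValueError(f"unsupported operator: {operator!r}")
    return text[:span.start] + substitute + text[span.end:]


text = "Contact John Smith today."
span = Span(start=8, end=18, label="person")  # covers "John Smith"

for name in ("mask", "replace", "redact", "label", "identity"):
    print(f"{name:>8}: {apply_operator(text, span, name, new_value='someone')}")
```

Note that `redact` leaves the surrounding whitespace in place, so the sentence above ends up with two consecutive spaces where the name used to be.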

## SBOM & vulnerability check

Install the SBOM dependencies:

```bash
uv sync --extra sbom
```

Generate a CycloneDX SBOM from the current environment:

```bash
uv run cyclonedx-py environment -o sbom.cdx.json --output-format json
```

Audit dependencies for known vulnerabilities:

```bash
uv run pip-audit --format json --output audit-report.json
```
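
The JSON report can then be summarized in a few lines of Python, for example to print only the affected packages in a CI log. This is a sketch that assumes the report layout used by recent pip-audit releases (a top-level `dependencies` list whose entries carry `name`, `version`, and a `vulns` list with an `id` per finding); `load_report` and `vulnerable_packages` are hypothetical helper names, and the sample data is made up for illustration.

```python
import json
from pathlib import Path


def load_report(path: str) -> dict:
    """Read a pip-audit JSON report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def vulnerable_packages(report: dict) -> dict[str, list[str]]:
    """Map 'name==version' to the advisory IDs reported for that package."""
    findings: dict[str, list[str]] = {}
    for dep in report.get("dependencies", []):
        ids = [vuln["id"] for vuln in dep.get("vulns", [])]
        if ids:
            findings[f"{dep['name']}=={dep.get('version', '?')}"] = ids
    return findings


# Made-up sample in the assumed layout; in practice use
# load_report("audit-report.json") instead.
sample = {
    "dependencies": [
        {"name": "examplepkg", "version": "1.0.0",
         "vulns": [{"id": "EXAMPLE-0001", "fix_versions": ["1.0.1"]}]},
        {"name": "cleanpkg", "version": "2.3.4", "vulns": []},
    ]
}

for package, advisory_ids in vulnerable_packages(sample).items():
    print(f"{package}: {', '.join(advisory_ids)}")
```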

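`pip-audit` already exits with a non-zero status when it finds a known vulnerability. To also fail when a dependency cannot be collected or audited (useful in CI), add `--strict`: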

```bash
uv run pip-audit --strict
```
