Metadata-Version: 2.4
Name: code-provenance
Version: 0.1.1
Summary: Resolve Docker images to their source code commits on GitHub
Author: SCRT Labs
License: MIT
Project-URL: Homepage, https://github.com/scrtlabs/code-provenance
Project-URL: Repository, https://github.com/scrtlabs/code-provenance
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: System :: Software Distribution
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.31
Requires-Dist: rich>=13.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"

# code-provenance

Resolve Docker images in a docker-compose file to their exact source code commits on GitHub.

## Installation

```bash
pip install code-provenance
```

Requires Python 3.10+.

## CLI Usage

```bash
code-provenance [compose-file] [--json] [--verbose]
```

- `compose-file` -- path to a docker-compose file (default: `docker-compose.yml`)
- `--json` -- output results as JSON
- `--verbose`, `-v` -- show resolution steps for each image
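
The `--json` flag makes the CLI easy to drive from a script. The following is a minimal sketch, assuming that `code-provenance` is on `PATH` and that `--json` writes a single JSON document to standard output. The helper name `run_code_provenance` and its `runner` parameter are hypothetical and exist only to keep the example testable; the shape of the JSON itself is not assumed here.

```python
import json
import subprocess


def run_code_provenance(compose_file="docker-compose.yml", runner=subprocess.run):
    """Run the CLI with --json and return the decoded output.

    Hypothetical helper: `runner` defaults to subprocess.run and can be
    swapped for a stub in tests.
    """
    cmd = ["code-provenance", compose_file, "--json"]
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    proc = runner(cmd, capture_output=True, text=True, check=True)
    # Assumption: stdout holds exactly one JSON document.
    return json.loads(proc.stdout)
```

Calling `run_code_provenance("docker-compose.yml")` would then return whatever structure the CLI emits, ready for further processing.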

### Example

```bash
code-provenance docker-compose.yml
```

```
┌─────────┬────────────────┬────────────────────────────┬──────────────┬──────────┬────────────┐
│ SERVICE │ IMAGE          │ REPO                       │ COMMIT       │ STATUS   │ CONFIDENCE │
├─────────┼────────────────┼────────────────────────────┼──────────────┼──────────┼────────────┤
│ web     │ traefik:v3.6.0 │ github.com/traefik/traefik │ 06db5168c0d9 │ resolved │ exact      │
└─────────┴────────────────┴────────────────────────────┴──────────────┴──────────┴────────────┘
```

## Library Usage

```python
from code_provenance.compose_parser import parse_compose, parse_image_ref
from code_provenance.resolver import resolve_image

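# Read the compose file; the context manager closes the handle afterwards.
with open("docker-compose.yml", encoding="utf-8") as f:
    yaml_content = f.read()

# Resolve each service's image to a source commit.
for service, image in parse_compose(yaml_content):
    ref = parse_image_ref(image)
    result = resolve_image(service, ref)
    print(f"{result.service}: {result.commit} ({result.confidence})")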
```

## API Reference

### Functions

- `parse_compose(yaml_content: str) -> list[tuple[str, str]]` -- parse a docker-compose YAML string and return `(service_name, image_string)` pairs
- `parse_image_ref(image: str) -> ImageRef` -- parse a Docker image string into its components
- `resolve_image(service: str, ref: ImageRef) -> ImageResult` -- resolve an image reference to its source code commit

### ImageRef

| Field | Type | Description |
|-------|------|-------------|
| `registry` | `str` | e.g. `"ghcr.io"`, `"docker.io"` |
| `namespace` | `str` | e.g. `"myorg"`, `"library"` |
| `name` | `str` | e.g. `"traefik"`, `"postgres"` |
| `tag` | `str` | e.g. `"v3.6.0"`, `"latest"` |
| `raw` | `str` | original image string from docker-compose |
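
To illustrate how an image string maps onto these fields, here is a rough, self-contained sketch. It is not the library's `parse_image_ref` implementation: `ImageRefSketch` and `split_image` are hypothetical names, and the defaults (`docker.io`, `library`, `latest`) follow the usual Docker conventions, which the library may handle differently.

```python
from dataclasses import dataclass


@dataclass
class ImageRefSketch:
    """Hypothetical stand-in that mirrors the ImageRef fields above."""
    registry: str
    namespace: str
    name: str
    tag: str
    raw: str


def split_image(image: str) -> ImageRefSketch:
    # Drop any digest suffix ("@sha256:..."); this sketch only handles tags.
    rest = image.partition("@")[0]

    # A colon after the last slash separates the tag; an earlier colon is a registry port.
    last_slash = rest.rfind("/")
    colon = rest.rfind(":")
    if colon > last_slash:
        rest, tag = rest[:colon], rest[colon + 1:]
    else:
        tag = "latest"  # assumption: Docker's default tag

    parts = rest.split("/")
    # The first component is a registry if it looks like a host name.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, parts = parts[0], parts[1:]
    else:
        registry = "docker.io"  # assumption: Docker Hub is the default registry

    name = parts[-1]
    namespace = "/".join(parts[:-1])
    if not namespace and registry == "docker.io":
        namespace = "library"  # Docker Hub's namespace for official images

    return ImageRefSketch(registry, namespace, name, tag, raw=image)
```

Under these assumptions, `split_image("traefik:v3.6.0")` yields registry `docker.io`, namespace `library`, name `traefik`, and tag `v3.6.0`.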

### ImageResult

| Field | Type | Description |
|-------|------|-------------|
| `service` | `str` | service name from docker-compose |
| `image` | `str` | original image string |
| `registry` | `str` | image registry |
| `repo` | `str \| None` | GitHub repository URL |
| `tag` | `str` | image tag |
| `commit` | `str \| None` | resolved commit SHA |
| `commit_url` | `str \| None` | URL to the commit on GitHub |
| `status` | `str` | `"resolved"`, `"repo_not_found"`, `"repo_found_tag_not_matched"`, or `"no_tag"` |
| `resolution_method` | `str \| None` | how the commit was resolved (e.g. `"oci_labels"`, `"tag_match"`) |
| `confidence` | `str \| None` | `"exact"` or `"approximate"` |
| `steps` | `list[str]` | resolution steps taken (shown in the CLI with `--verbose`) |
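
Because `status` and `confidence` are plain strings, results are easy to triage after resolution. The sketch below groups services by status and picks out anything that is not an exact match. The helper names `group_by_status` and `needs_review` are hypothetical, and `SimpleNamespace` objects stand in for real `ImageResult` instances so the example runs without network access.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_status(results):
    """Map each status string to the list of service names that have it."""
    groups = defaultdict(list)
    for r in results:
        groups[r.status].append(r.service)
    return dict(groups)


def needs_review(results):
    """Return results that are unresolved or only approximately resolved."""
    return [r for r in results if r.status != "resolved" or r.confidence != "exact"]


# Stand-in objects for demonstration; real code would pass ImageResult instances.
sample = [
    SimpleNamespace(service="web", status="resolved", confidence="exact"),
    SimpleNamespace(service="db", status="repo_not_found", confidence=None),
    SimpleNamespace(service="api", status="resolved", confidence="approximate"),
]

for r in needs_review(sample):
    print(f"{r.service}: status={r.status}, confidence={r.confidence}")
```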

## Authentication

Set `GITHUB_TOKEN` for full functionality (digest resolution, `:latest` on GHCR, higher rate limits):

```bash
export GITHUB_TOKEN=ghp_your_token_here
```

Create a classic token at https://github.com/settings/tokens with the `read:packages` scope. If you use the `gh` CLI, first run `gh auth refresh -h github.com -s read:packages`.

The `run.sh` wrapper auto-detects the token from the `gh` CLI if it is available.
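
When using the library directly, a similar token lookup can be done in Python. This is a sketch under stated assumptions, not the logic of `run.sh`: the function name `find_github_token` is hypothetical, and it prefers `GITHUB_TOKEN` before falling back to the `gh auth token` command when the `gh` CLI is installed.

```python
import os
import shutil
import subprocess


def find_github_token(env=None):
    """Return a GitHub token from the environment or the gh CLI, else None."""
    env = os.environ if env is None else env

    # Prefer an explicitly exported token.
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        return token

    # Fall back to the gh CLI only if it is installed.
    if shutil.which("gh") is None:
        return None
    proc = subprocess.run(
        ["gh", "auth", "token"], capture_output=True, text=True, check=False
    )
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None
```

A caller could export the result as `GITHUB_TOKEN` in `os.environ` before resolving images, so that the rest of the process sees the same token.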

## License

MIT
