Metadata-Version: 2.4
Name: pathmorph
Version: 0.1.0
Summary: Invertible directory structure transformations with schema-driven path remapping
Project-URL: Homepage, https://github.com/yourusername/pathmorph
Project-URL: Issues, https://github.com/yourusername/pathmorph/issues
License: MIT
License-File: LICENSE
Keywords: data-management,filesystem,pipeline,reproducibility,schema
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Filesystems
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Requires-Dist: confuk>=0.4.0
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.12.0
Description-Content-Type: text/markdown

# pathmorph

**Invertible directory structure transformations.**

`pathmorph` lets you remap a directory layout to a different schema and
restore the original at any time — with zero external state beyond a small
JSON manifest embedded in the packed directory.

It is designed to be:

- **Composable** — fits naturally into shell pipelines and Makefiles
- **Reproducible** — the manifest is a ground-truth record of every file move
- **Format-agnostic** — schemas can be written in YAML, TOML, or JSON
- **Dependency-light** — `confuk`, `omegaconf`, `rich`, `typer`; nothing heavier

---

## Motivation

Pipeline tools tend to emit outputs in machine-friendly directory structures:
short IDs, flat hierarchies, terse filenames. Collaborators often expect
something different: descriptive names, nested grouping, human-readable paths.

`pathmorph` lets you have both. Define the mapping as a schema, `pack` when
you need the human layout, `unpack` when you need the original back.

---

## Installation

```bash
pip install pathmorph

# Optional: faster hashing for large directories
pip install pathmorph[xxhash]
```

---

## Quickstart

**1. Write a schema** (`my_schema.yaml`):

```yaml
schema:
  name: human_v1
  description: "Friendly layout for collaborators"
  fallback: passthrough   # or: omit/cram

  rules:
    - pattern: "runs/(?P<exp>[^/]+)/(?P<variant>[^/]+)/scores\\.tsv"
      target:  "experiments/{exp}/candidates/{variant}/developability_scores.tsv"

    - pattern: "runs/(?P<exp>[^/]+)/(?P<variant>[^/]+)/(?P<rest>.+)"
      target:  "experiments/{exp}/candidates/{variant}/{rest}"
```

Each rule is a Python `re` **full-match** pattern with named capture groups.
The `target` is a format string that references those groups via `{name}`.

Rules are evaluated in order — the **first match wins**.

`fallback` controls what happens when no rule matches:
- `passthrough` — file is copied as-is (default)
- `omit` — file is skipped entirely
- `cram` – file is crammed into the subdirectory path specified in the `crampath` variable in the schema

**2. Preview the mapping** (dry-run, no filesystem changes):

```bash
pathmorph diff ./pipeline_output -s my_schema.yaml
```

**3. Pack into the human layout**:

```bash
pathmorph pack ./pipeline_output -d ./for_collaborators -s my_schema.yaml
```

This copies files (non-destructive by default) and writes
`./for_collaborators/.pathmorph_manifest.json`.

**4. Restore the original layout**:

```bash
pathmorph unpack ./for_collaborators ./restored_original
```

**5. Verify integrity**:

```bash
pathmorph verify ./for_collaborators
```

---

## Command reference

### `pack`

```
pathmorph pack SRCS -d/--dst DST -s/--schema SCHEMA [OPTIONS]

Options:
  --move                   Move files instead of copying
  --handle-existing        abort | skip | overwrite
                           (prompts interactively if omitted)
  --hash ALGO              sha256 (default), sha1, md5, blake2b,
                           xxh64, xxh128, xxh3_64, xxh3_128
```

### `unpack`

```
pathmorph unpack PACKED_DIR DST [OPTIONS]

Options:
  --move                   Move files instead of copying
  --handle-existing        abort | skip | overwrite
```

### `diff`

```
pathmorph diff SRCS -s/--schema SCHEMA [OPTIONS]

Options:
  --show-passthrough / --hide-passthrough   (default: show)
```

### `verify`

```
pathmorph verify PACKED_DIR

Exit code 0 on success, 1 if any file fails.
```

---

## Schema formats

`pathmorph` accepts any format supported by
[confuk](https://github.com/kjczarne/confuk): `.yaml`, `.yml`, `.toml`, `.json`.

See [`examples/`](examples/) for the same schema in YAML and TOML.

---

## Manifest

A packed directory is self-describing. The manifest at
`.pathmorph_manifest.json` records:

```json
{
  "version": 1,
  "schema_name": "human_v1",
  "schema_description": "...",
  "packed_at": "2026-04-29T...",
  "algorithm": "sha256",
  "entries": [
    {
      "original": "runs/exp_001/A/scores.tsv",
      "packed":   "experiments/exp_001/candidates/A/developability_scores.tsv",
      "hash":     "abc123...",
      "algorithm": "sha256"
    }
  ]
}
```

`unpack` uses the entry table directly — it does **not** re-evaluate schema
rules. This means the manifest is a durable, schema-version-independent
record: even if you later change the schema, existing packed directories
remain unpackable.

---

## Using as a library

```python
from pathlib import Path
from pathmorph import pack, unpack, verify, diff
from pathmorph.schemas import Schema
from pathmorph.collision import CollisionStrategy

schema = Schema.from_file(Path("my_schema.yaml"))

# Dry-run
records = diff(Path("./pipeline_output"), schema)

# Pack
result = pack(
    Path("./pipeline_output"),
    Path("./for_collaborators"),
    schema=schema,
    collision=CollisionStrategy.SKIP,
    hash_algorithm="sha256",
)

# Verify
verify_result = verify(Path("./for_collaborators"))
assert verify_result.passed

# Restore
unpack(Path("./for_collaborators"), Path("./restored"))
```

---

## Development

```bash
git clone https://github.com/yourusername/pathmorph
cd pathmorph
pip install -e ".[dev]"
pytest
```

---

## License

MIT
