Metadata-Version: 2.4
Name: podstack
Version: 1.3.15
Summary: Official Python SDK for Podstack GPU Notebook Platform
Author-email: Podstack <support@podstack.ai>
License-Expression: MIT
Project-URL: Homepage, https://podstack.ai
Project-URL: Documentation, https://docs.podstack.ai
Project-URL: Repository, https://github.com/podstack/podstack-python
Project-URL: Issues, https://github.com/podstack/podstack-python/issues
Keywords: gpu,notebook,machine-learning,deep-learning,cloud,jupyter
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.24.0
Requires-Dist: requests>=2.28.0
Provides-Extra: torch
Requires-Dist: torch; extra == "torch"
Provides-Extra: tensorflow
Requires-Dist: tensorflow; extra == "tensorflow"
Provides-Extra: sklearn
Requires-Dist: scikit-learn; extra == "sklearn"
Provides-Extra: huggingface
Requires-Dist: transformers; extra == "huggingface"
Requires-Dist: safetensors; extra == "huggingface"
Provides-Extra: all
Requires-Dist: torch; extra == "all"
Requires-Dist: tensorflow; extra == "all"
Requires-Dist: scikit-learn; extra == "all"
Requires-Dist: transformers; extra == "all"
Requires-Dist: safetensors; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: ruff>=0.0.270; extra == "dev"
Dynamic: license-file

# Podstack Python SDK

Official Python SDK for the Podstack GPU Platform. Run ML workloads on remote GPUs with simple decorators, track experiments, and manage models.

## Installation

```bash
pip install podstack
```

With optional dependencies:

```bash
pip install "podstack[torch]"        # PyTorch support
pip install "podstack[huggingface]"  # HuggingFace Transformers
pip install "podstack[all]"          # All ML frameworks
```

## Quick Start

```python
import podstack

# Initialize the SDK
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id"
)

# Run a function on a remote GPU with a single decorator
@podstack.gpu(type="L40S", fraction=100)
def train():
    import torch
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    return {"status": "done"}

result = train()  # Executes on remote GPU!
```

## Decorators & Annotations

Podstack provides decorators that turn any Python function into a remote GPU workload with built-in experiment tracking.

### `@podstack.gpu` - Remote GPU Execution

```python
import podstack

# Basic GPU execution
@podstack.gpu(type="L40S")
def train_model():
    import torch
    model = torch.nn.Linear(768, 10).cuda()
    return {"params": sum(p.numel() for p in model.parameters())}

result = train_model()

# Specify GPU type, count, and fraction
@podstack.gpu(type="A100-80G", count=2, fraction=100)
def train_large_model():
    import torch
    print(f"GPUs available: {torch.cuda.device_count()}")

# Install pip packages on the fly
@podstack.gpu(type="L40S", pip=["transformers", "datasets", "accelerate"])
def finetune_llm():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    ...

# Use uv for faster package installation
@podstack.gpu(type="L40S", uv=["torch", "transformers"])
def fast_setup():
    ...

# Install from requirements.txt
@podstack.gpu(type="L40S", requirements="requirements.txt", use_uv=True)
def train_with_deps():
    ...

# Use conda packages
@podstack.gpu(type="L40S", conda="cudatoolkit=11.8")
def train_with_conda():
    ...

# Use a pre-built environment
@podstack.gpu(type="L40S", env="nlp")
def nlp_task():
    ...

# Set execution timeout (default: 3600s)
@podstack.gpu(type="L40S", timeout=7200)
def long_training():
    ...

# Disable remote execution (run locally for debugging)
@podstack.gpu(type="L40S", remote=False)
def debug_locally():
    print("This runs on your local machine")

# Use as a context manager
with podstack.gpu(type="A100-80G", count=2) as cfg:
    print(f"GPU config set: {cfg.type}")
```

**Available GPU types:** `T4`, `L4`, `A10`, `L40S`, `A100-40G`, `A100-80G`, `H100`

**Available environments:** `ml`, `nlp`, `cv`, `audio`, `tabular`, `rl`, `scientific`

### `@podstack.experiment` - Experiment Tracking

```python
import podstack

# As a decorator
@podstack.experiment(name="transformer-experiments")
def run_experiment():
    ...

# As a context manager
with podstack.experiment(name="transformer-experiments") as exp:
    print(f"Experiment ID: {exp.id}")
```

### `@podstack.run` - Run Tracking

Automatically tracks execution time and GPU configuration.

```python
import podstack

# As a decorator
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1", track_gpu=True)
def train():
    podstack.registry.log_params({"lr": 0.001, "batch_size": 32})
    for epoch in range(10):
        loss = 1.0 / (epoch + 1)
        podstack.registry.log_metrics({"loss": loss}, step=epoch)

# As a context manager
with podstack.run(name="training-v1") as run:
    podstack.registry.log_params({"lr": 0.001})
    podstack.registry.log_metrics({"loss": 0.5}, step=1)
    print(f"Run ID: {run.id}")

# With tags
@podstack.run(name="ablation-study", tags={"variant": "no-dropout"})
def ablation():
    ...
```

### `@podstack.model` - Model Registration

```python
import podstack

# Register model after function completes
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1")
@podstack.model.register(name="my-classifier")
def train_and_save():
    import torch
    model = torch.nn.Linear(768, 10)
    torch.save(model.state_dict(), "model.pt")
    podstack.registry.log_artifact("model.pt", "model")

# Promote model to production after validation
@podstack.model.promote(name="my-classifier", version=1, stage="production")
def validate_and_promote():
    # Run validation checks
    accuracy = 0.95
    assert accuracy > 0.90, "Model doesn't meet threshold"
```

### Combining Decorators

Stack decorators for a complete ML workflow:

```python
import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

@podstack.gpu(type="L40S", pip=["transformers", "datasets"])
@podstack.experiment(name="sentiment-analysis")
@podstack.run(name="bert-finetune-v1", track_gpu=True)
@podstack.model.register(name="sentiment-bert")
def full_pipeline():
    from transformers import AutoModelForSequenceClassification, Trainer

    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    # Log hyperparameters
    podstack.registry.log_params({
        "model": "bert-base-uncased",
        "learning_rate": 2e-5,
        "epochs": 3
    })

    # Train...
    podstack.registry.log_metrics({"accuracy": 0.92, "f1": 0.89})

    return {"accuracy": 0.92}

result = full_pipeline()  # Runs on remote L40S GPU with full tracking
```

## Registry - Experiment Tracking & Model Management

### Initialize

```python
from podstack import registry

registry.init(
    api_key="your-api-key",
    project_id="your-project-id"
)
```

### Track Experiments and Runs

```python
from podstack import registry

# Set experiment
registry.set_experiment("my-experiment")

# Start a tracked run
with registry.start_run(name="training-v1") as run:
    # Log hyperparameters
    registry.log_params({
        "learning_rate": 0.001,
        "batch_size": 32,
        "epochs": 10,
        "optimizer": "adam"
    })

    # Log metrics at each step
    for epoch in range(10):
        loss = train_epoch()
        accuracy = evaluate()
        registry.log_metrics({"loss": loss, "accuracy": accuracy}, step=epoch)

    # Set tags
    registry.set_tag("framework", "pytorch")

    # Upload artifacts to cloud artifact store
    registry.log_artifact("model.pt")
    registry.log_artifact("training_curves.png", artifact_path="plots/curves.png")

    # Log dataset provenance (first-class resource, deduped by content hash)
    registry.log_dataset("imdb-reviews", path="data/imdb.csv", context="training")

    # Or pass a DataFrame — schema and row/feature counts are auto-computed
    import pandas as pd
    df = pd.read_csv("data/imdb.csv")
    registry.log_dataset("imdb-reviews", df=df, context="training")
```

### Log and Load Models

```python
from podstack import registry

# Serialize and upload the model to the artifact store (auto-detects framework)
registry.log_model(model, artifact_path="model", framework="pytorch")

# Register in model registry
registry.register_model(
    name="my-classifier",
    run_id=run.id,
    description="BERT sentiment classifier"
)

# Promote to production
registry.set_model_stage("my-classifier", version=1, stage="production")

# Set aliases
registry.set_model_alias("my-classifier", alias="champion", version=1)

# Load model from any machine — files are downloaded automatically if missing locally
model = registry.load_model("my-classifier", stage="production")
```

### Compare Runs

```python
from podstack import registry

# Compare multiple runs
comparison = registry.compare_runs(
    run_ids=["run-id-1", "run-id-2", "run-id-3"],
    metric_keys=["loss", "accuracy"]
)

# Get metric history for a run
history = registry.get_metric_history("run-id-1", "loss")
for point in history:
    print(f"Step {point.step}: {point.value}")

# Search runs
runs = registry.search_runs(
    experiment_id="exp-id",
    status="completed",
    max_results=50
)
```

### Dataset Tracking & Lineage

Podstack tracks datasets as first-class resources, linking them to runs and model versions so you can always answer *"what data was this model trained on?"*

The lineage chain is:

```
Dataset(s) ──[logged to]──▶ Run ──[run_id]──▶ ModelVersion
```

#### `log_dataset()` — log a dataset to the active run

```python
dataset = registry.log_dataset(
    name="imdb-reviews",          # required — human-readable name
    path="data/imdb.csv",         # local path or URI (s3://, gcs://, https://)
    context="training",           # "training" | "validation" | "test" (default: "training")
)
```

The dataset is stored as a **project-level resource** and linked to the current run.
Subsequent calls with the same file produce the same dataset record — no duplicates.

**Auto-enrichment from a local file:**

```python
# SHA-256 digest is computed automatically for files ≤ 500 MB.
# This enables deduplication across runs — if two runs use the exact
# same file, they share one Dataset record in the registry.
dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv")
print(dataset.digest)  # "a3f2c1..." — hex SHA-256
```
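
The automatic digest covers files up to 500 MB; for larger files you can compute a digest yourself and pass it via the `digest` parameter. A minimal sketch using the standard library (the `sha256_digest` helper is illustrative, not part of the SDK):

```python
import hashlib

def sha256_digest(path, chunk_size=1 << 20):
    """Compute a hex SHA-256 digest without loading the whole file into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Pass the precomputed digest so large files still dedup across runs:
# registry.log_dataset("imdb-reviews-large", path="data/imdb_full.csv",
#                      digest=sha256_digest("data/imdb_full.csv"))
```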

**Auto-enrichment from a pandas DataFrame:**

```python
import pandas as pd

df = pd.read_csv("data/imdb.csv")

dataset = registry.log_dataset(
    name="imdb-reviews",
    df=df,
    context="training",
)
# schema and profile are computed automatically:
print(dataset.schema)   # {"text": "object", "label": "int64"}
print(dataset.profile)  # {"num_rows": 50000, "num_features": 2}
```

**Pass both `path` and `df`** to get digest dedup *and* schema inference:

```python
dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv", df=df)
```

**All parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | `str` | required | Human-readable dataset name |
| `path` | `str` | `None` | Local file path or URI (`s3://`, `gcs://`, `https://`) |
| `df` | `DataFrame` | `None` | pandas DataFrame — schema and profile auto-computed |
| `context` | `str` | `"training"` | Role of the dataset: `"training"`, `"validation"`, or `"test"` |
| `digest` | `str` | `None` | SHA-256 hex digest. Computed from `path` if not provided |
| `source_type` | `str` | `"local"` | Storage backend: `"local"`, `"s3"`, `"gcs"`, `"url"` |
| `tags` | `dict` | `None` | Arbitrary string key-value tags |

**Returns:** `Dataset` object with fields:

| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | UUID of the dataset record |
| `name` | `str` | Dataset name |
| `digest` | `str` | SHA-256 hex digest (empty if not computed) |
| `source_type` | `str` | Storage backend |
| `source` | `str` | File path or URI |
| `schema` | `dict` | Column → dtype mapping |
| `profile` | `dict` | `num_rows`, `num_features`, and any other stats |
| `tags` | `dict` | Tags dict |
| `created_at` | `str` | ISO 8601 timestamp |

**Via the `Run` object** (equivalent to calling `registry.log_dataset()`):

```python
with registry.start_run("training-v1") as run:
    dataset = run.log_dataset("imdb-reviews", df=df, context="training")
```

#### Multiple datasets per run

Log validation and test sets alongside the training set:

```python
with registry.start_run("bert-finetune") as run:
    run.log_dataset("imdb-train", df=train_df, context="training")
    run.log_dataset("imdb-val",   df=val_df,   context="validation")
    run.log_dataset("imdb-test",  df=test_df,  context="test")
```

#### `get_run_datasets()` — retrieve datasets logged to a run

Returns every `Dataset` object linked to a run, in the order they were logged.

```python
datasets = registry.get_run_datasets(run_id)
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `run_id` | `str` | ID of the run to query |

**Returns:** `list[Dataset]` — same object as returned by `log_dataset()`.

**Fields on each `Dataset`:**

| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | UUID of the dataset record |
| `name` | `str` | Human-readable name |
| `digest` | `str` | SHA-256 hex digest (empty if not computed at log time) |
| `source_type` | `str` | `"local"`, `"s3"`, `"gcs"`, or `"url"` |
| `source` | `str` | File path or URI that was passed to `log_dataset()` |
| `schema` | `dict` | Column → dtype mapping (e.g. `{"text": "object", "label": "int64"}`) |
| `profile` | `dict` | Stats dict; contains `num_rows` and `num_features` whenever a DataFrame was passed |
| `tags` | `dict` | Key-value tags |
| `created_at` | `str` | ISO 8601 timestamp |

**Examples:**

```python
from podstack import registry

registry.init(api_key="...", project_id="...")

datasets = registry.get_run_datasets("3a9f12c4-...")

# Inspect each dataset
for ds in datasets:
    print(ds.name)
    print(f"  source : {ds.source}")
    print(f"  digest : {ds.digest[:16]}…")
    print(f"  rows   : {ds.profile.get('num_rows', 'unknown')}")
    print(f"  schema : {ds.schema}")
```

Checking datasets on a run you have in hand:

```python
with registry.start_run("training-v1") as run:
    run.log_dataset("train", df=train_df, context="training")
    run.log_dataset("val",   df=val_df,   context="validation")

# After the run completes, retrieve everything that was logged
datasets = registry.get_run_datasets(run.id)
assert len(datasets) == 2
```

Verifying deduplication — the same physical file logged across two runs
returns the same dataset ID:

```python
ds1 = registry.get_run_datasets(run_a.id)[0]
ds2 = registry.get_run_datasets(run_b.id)[0]

# Same file → same digest → same Dataset record
assert ds1.id == ds2.id
assert ds1.digest == ds2.digest
```

#### `get_model_lineage()` — trace a model back to its training data

Returns the full provenance chain for every version of a registered model:
which datasets each version was trained on, via which run.

```python
lineage = registry.get_model_lineage(model_id)
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `model_id` | `str` | ID of the registered model |

**Returns:** `dict` with the following structure:

```
{
  "model_id": str,
  "versions": [
    {
      "version":  int,        # version number (1, 2, 3 …)
      "stage":    str,        # "development" | "staging" | "production" | "archived"
      "run_id":   str,        # ID of the linked training run (empty if none)
      "run_name": str,        # display name of the run
      "datasets": [Dataset]   # list of Dataset dicts logged to that run
    },
    …
  ]
}
```

Each `datasets` entry has the same fields as a `Dataset` object
(`id`, `name`, `digest`, `source_type`, `source`, `schema`, `profile`, `tags`, `created_at`).

**Examples:**

Basic iteration:

```python
from podstack import registry

registry.init(api_key="...", project_id="...")

model   = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)

for version in lineage["versions"]:
    print(f"v{version['version']} · {version['stage']}")
    print(f"  Run: {version['run_name']} ({version['run_id'][:8]}…)")
    for ds in version["datasets"]:
        rows = ds["profile"].get("num_rows", "?")
        print(f"  └─ {ds['name']}  {rows} rows  sha256:{ds['digest'][:12]}…")
```

Example output:

```
v3 · production
  Run: bert-finetune-v3 (3a9f12c4…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
  └─ imdb-val     5000 rows  sha256:7e4b2f1a0c3d…
v2 · staging
  Run: bert-finetune-v2 (8b2e77d1…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
v1 · archived
  Run: bert-finetune-v1 (f1c3a0e2…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
```

Finding every unique dataset ever used to train any version of a model:

```python
lineage  = registry.get_model_lineage(model.id)
seen     = {}
for version in lineage["versions"]:
    for ds in version["datasets"]:
        seen[ds["id"]] = ds  # dedup by ID

unique_datasets = list(seen.values())
print(f"{len(unique_datasets)} unique dataset(s) across all versions")
```

Checking whether the production version was trained on an approved dataset:

```python
APPROVED_DIGEST = "a3f2c1d8e9b0..."

lineage = registry.get_model_lineage(model.id)
prod = next(v for v in lineage["versions"] if v["stage"] == "production")

approved = any(ds["digest"] == APPROVED_DIGEST for ds in prod["datasets"])
print("Production model trained on approved data:", approved)
```

#### End-to-end example

```python
import pandas as pd
from podstack import registry

registry.init(api_key="...", project_id="...")
registry.set_experiment("sentiment-analysis")

# Load data
train_df = pd.read_csv("data/train.csv")
val_df   = pd.read_csv("data/val.csv")

with registry.start_run("bert-finetune-v3") as run:
    # Log datasets — digest is auto-computed, schema inferred
    run.log_dataset("imdb-train", path="data/train.csv", df=train_df, context="training")
    run.log_dataset("imdb-val",   path="data/val.csv",   df=val_df,   context="validation")

    # Train
    run.log_params({"lr": 2e-5, "epochs": 3})
    run.log_metrics({"accuracy": 0.93, "f1": 0.92})

# Register and promote the model
registry.register_model("sentiment-bert", run_id=run.id)
registry.set_model_stage("sentiment-bert", version=3, stage="production")

# Later — answer "what data trained v3?"
model = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)
```

### Artifact Storage

Podstack stores every artifact you log — model files, plots, CSV exports, anything — in the project's cloud artifact store. Artifacts are keyed by run ID, so the same file can be retrieved from any machine, by any project member, at any time.

#### `log_artifact()` — upload a file for the active run

```python
# Upload a single file (uses the filename as the artifact path)
registry.log_artifact("model.pt")

# Upload with an explicit path inside the artifact store
registry.log_artifact("training_curves.png", artifact_path="plots/curves.png")
registry.log_artifact("feature_importance.csv", artifact_path="analysis/features.csv")
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `local_path` | `str` | required | Path to the local file to upload |
| `artifact_path` | `str` | filename | Relative path inside the artifact store. Defaults to `os.path.basename(local_path)` |

If the artifact store is temporarily unreachable, the SDK saves the file to a local fallback cache (`~/.podstack/artifacts/<run_id>/`) so your run is never interrupted.
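
You can inspect that cache directly with the standard library. A small sketch (the `list_fallback_artifacts` helper is illustrative and assumes only the cache layout described above):

```python
from pathlib import Path

def list_fallback_artifacts(run_id, cache_root=Path.home() / ".podstack" / "artifacts"):
    """List files saved to the local fallback cache for a run,
    as paths relative to the run's cache directory."""
    run_dir = Path(cache_root) / run_id
    if not run_dir.is_dir():
        return []
    return sorted(str(p.relative_to(run_dir)) for p in run_dir.rglob("*") if p.is_file())
```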

**Via the `Run` object** — equivalent to calling `registry.log_artifact()`:

```python
with registry.start_run("training-v1") as run:
    run.log_artifact("confusion_matrix.png", artifact_path="plots/confusion_matrix.png")
    run.log_artifact("model.pkl")
```

#### `list_artifacts()` — list all artifacts for a run

```python
artifacts = registry.list_artifacts(run_id)
for a in artifacts:
    print(f"{a['path']:40s}  {a['size'] / 1e6:.1f} MB  {a['last_modified']}")
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `run_id` | `str` | ID of the run to query |

**Returns:** `list[dict]` — one entry per artifact:

| Key | Type | Description |
|-----|------|-------------|
| `path` | `str` | Relative artifact path (e.g. `"plots/curves.png"`) |
| `size` | `int` | File size in bytes |
| `etag` | `str` | Content hash for integrity verification |
| `last_modified` | `str` | ISO 8601 upload timestamp |
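
Each entry is a plain dict, so summaries such as the combined-size footer shown in the dashboard are easy to reproduce client-side. A sketch (`human_size` and `total_artifact_size` are illustrative helpers, not SDK functions):

```python
def human_size(num_bytes):
    """Format a byte count as B / KB / MB (decimal units, for display only)."""
    if num_bytes < 1000:
        return f"{num_bytes} B"
    if num_bytes < 1_000_000:
        return f"{num_bytes / 1000:.1f} KB"
    return f"{num_bytes / 1_000_000:.1f} MB"

def total_artifact_size(artifacts):
    """Combined size of all artifacts for a run, like the dashboard footer."""
    return sum(a["size"] for a in artifacts)

# artifacts = registry.list_artifacts(run_id)
# print(f"{len(artifacts)} artifact(s), {human_size(total_artifact_size(artifacts))} total")
```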

#### `download_artifact()` — retrieve an artifact

Downloads a specific artifact from the cloud store into a local directory. Falls back to the local cache when the store is unreachable.

```python
# Download a single file
dest = registry.download_artifact("run-id", "model/model.pkl", "./downloads/")
print(f"Saved to: {dest}")

# Download a whole model directory
dest = registry.download_artifact("run-id", "model", "./local_models/")
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `run_id` | `str` | ID of the run that logged the artifact |
| `artifact_path` | `str` | Relative artifact path as logged (e.g. `"model/model.pkl"`) |
| `local_path` | `str` | Destination directory |

**Returns:** `str` — absolute path to the downloaded file or directory.

**Raises:** `ArtifactNotFoundError` if the artifact cannot be found in the store or the local cache.

#### Models as artifacts: `log_model()` and `load_model()`

`log_model()` serializes your model to disk and uploads every resulting file to the artifact store in one call. `load_model()` resolves the registered model version, downloads any missing files from the store, then deserializes the model — so it works correctly from any machine regardless of where training happened.

```python
# ── Training machine ──────────────────────────────────────────────────────────
with registry.start_run("bert-finetune-v3") as run:
    # train...
    registry.log_model(model, artifact_path="model", framework="pytorch")

registry.register_model("sentiment-bert", run_id=run.id)
registry.set_model_stage("sentiment-bert", version=3, stage="production")

# ── Any machine (CI, inference server, colleague's laptop) ───────────────────
# Model files are downloaded automatically from the artifact store if not cached
model = registry.load_model("sentiment-bert", stage="production")
```

**`log_model()` parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | any | required | Model object (PyTorch, TensorFlow, sklearn, HuggingFace, or any picklable object) |
| `artifact_path` | `str` | `"model"` | Sub-path inside the artifact store |
| `framework` | `str` | auto-detected | `"pytorch"`, `"tensorflow"`, `"sklearn"`, `"huggingface"`, or `"pickle"` |
| `metadata` | `dict` | `None` | Arbitrary key-value metadata stored as run params |

**`load_model()` parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | `str` | required | Registered model name |
| `version` | `int` | `None` | Specific version to load. Mutually exclusive with `stage` |
| `stage` | `str` | `None` | Stage to load from: `"development"`, `"staging"`, `"production"`, `"archived"` |
| `framework` | `str` | from run params | Override framework for deserialization |

#### Viewing artifacts in the dashboard

Every artifact logged with `log_artifact()` or `log_model()` appears automatically in the **Artifacts tab** of the run's detail page in the Podstack dashboard. No extra steps are needed — the tab populates from the same store the SDK writes to.

The Artifacts tab shows:

| Column | Description |
|--------|-------------|
| **Path** | The relative artifact path as logged (e.g. `model/model.pkl`, `plots/curves.png`) |
| **Type badge** | File extension, color-coded by category — model weights, data files, images, configs, etc. |
| **Size** | Formatted file size (B / KB / MB) |
| **Uploaded** | Timestamp of when the file was stored |
| **Download** | One-click download button — opens a short-lived direct download link in the browser |

A footer below the list shows the combined size of all artifacts for the run.

```python
# Everything logged here shows up in the dashboard Artifacts tab
with registry.start_run("bert-finetune-v3") as run:
    registry.log_params({"lr": 2e-5, "epochs": 3})
    registry.log_metrics({"accuracy": 0.93})

    # These all appear as separate rows in the Artifacts tab
    registry.log_artifact("confusion_matrix.png", artifact_path="plots/confusion_matrix.png")
    registry.log_artifact("feature_importance.csv", artifact_path="analysis/features.csv")
    registry.log_model(model, artifact_path="model", framework="pytorch")
    # ↳ each model file (model.pkl, config.json, etc.) appears as its own row
```

#### Access control

Artifact upload and download URLs are issued by the registry API and require a valid API key and project membership. The URLs are short-lived, ensuring that access always reflects the current state of your project — a revoked key can no longer generate new URLs. Any member of a project can upload and download artifacts for runs within that project.

### List and Browse

```python
from podstack import registry

# List experiments
experiments = registry.list_experiments()

# List models
models = registry.list_models()

# List artifacts for a specific run
artifacts = registry.list_artifacts(run_id)

# Download a specific artifact to a local directory
dest = registry.download_artifact("run-id", "model/model.pt", "./downloads/")
print(f"Saved to: {dest}")
```

## GPU Runner - Direct Code Execution

For running code strings directly on GPUs without decorators:

```python
import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

# Run code on a remote GPU
result = podstack.run_on_gpu('''
import torch
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
''', gpu="L40S")

print(result.output)
print(f"Success: {result.success}")
print(f"Duration: {result.duration_seconds}s")
```

## Client API

For direct API access to notebooks and executions:

```python
from podstack import Client

client = Client(api_key="your-api-key")

# Create a notebook
notebook = client.sync_create_notebook(name="experiment", gpu_type="L40S")
print(f"JupyterLab: {notebook.jupyter_url}")

# Run code
result = client.sync_run("print('Hello GPU!')", gpu_type="L40S")
print(result.output)
```

## Error Handling

```python
from podstack import (
    PodstackError,
    AuthenticationError,
    GPUNotAvailableError,
    RateLimitError,
    ExecutionTimeoutError
)

try:
    result = train()
except AuthenticationError:
    print("Invalid API key")
except GPUNotAvailableError as e:
    print(f"GPU not available: {e}")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except ExecutionTimeoutError as e:
    print(f"Execution timed out: {e.execution_id}")
except PodstackError as e:
    print(f"Error: {e.message}")
```
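
`RateLimitError` exposes the server-suggested `retry_after`, which makes a retry wrapper straightforward. A generic sketch (the `call_with_retry` helper is illustrative and works with any exception carrying a `retry_after` attribute):

```python
import time

def call_with_retry(fn, max_attempts=3, should_retry=lambda e: hasattr(e, "retry_after")):
    """Call fn(), sleeping for the server-provided retry_after between attempts.

    Non-retryable errors (no retry_after attribute) propagate immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as e:
            if attempt == max_attempts or not should_retry(e):
                raise
            time.sleep(getattr(e, "retry_after", 1))

# result = call_with_retry(train)  # retries only rate-limited calls
```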

## Configuration

```python
import podstack

# Option 1: Initialize explicitly
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id",
    api_url="https://api.podstack.ai/v1",       # optional
    registry_url="https://registry.podstack.ai"  # optional
)

# Option 2: Environment variables
# PODSTACK_API_KEY=your-api-key
# PODSTACK_PROJECT_ID=your-project-id
# PODSTACK_API_URL=https://api.podstack.ai/v1
# PODSTACK_REGISTRY_URL=https://registry.podstack.ai

# Option 3: Auto-init (set PODSTACK_AUTO_INIT=1)
# SDK auto-initializes from env vars at import time
```
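
When mixing styles, a common convention is that explicit `init()` arguments take precedence over environment variables. A sketch of that resolution order (illustrative; this assumes the conventional precedence rather than documenting the SDK's exact behavior):

```python
import os

def resolve_setting(explicit, env_var, default=None):
    """Explicit argument wins; otherwise fall back to the environment, then a default."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)

# api_url = resolve_setting(None, "PODSTACK_API_URL", "https://api.podstack.ai/v1")
```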

## License

MIT License - see LICENSE for details.
