Metadata-Version: 2.4
Name: micantis
Version: 1.1.0
Summary: Package to simplify Micantis API usage
Author-email: Mykela DeLuca <mykela.deluca@micantis.io>
License-Expression: MIT
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: pandas
Requires-Dist: msal
Requires-Dist: azure-identity
Provides-Extra: parquet
Requires-Dist: pyarrow>=16.0.0; extra == "parquet"
Dynamic: license-file

# Micantis API Wrapper

A lightweight Python wrapper for interacting with the Micantis API, plus some helpful utilities.  
Built for ease of use, fast prototyping, and clean integration into data workflows.

---

## 🚀 Features

- Authenticate and connect to the Micantis API service
- Download and parse CSV, binary, and Parquet data into pandas DataFrames
- Parquet support for efficient data storage with embedded metadata
- Filter, search, and retrieve metadata
- Stitch and clean data sets programmatically
- Track data changes with the changelog endpoint
- Find duplicate files in your data library
- Upload artifacts from Python script executions
- Monitor and manage async background jobs with built-in polling
- All methods support a `timeout` parameter for fine-grained control (see the example below)
- Automatic session recovery — expired tokens are re-authenticated transparently
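
Since every request method accepts a per-call `timeout` (in seconds), a slow or large query can be given extra headroom; a minimal sketch:

```python
# Allow up to 120 seconds for a large query (defaults vary by method)
table = api.get_data_table(search="your search string", timeout=120)
```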

---

## ⚠️ Important

This package is designed for authenticated Micantis customers only.  
If you are not a Micantis customer, the API wrapper and utilities in this package will not work for you.

For more information on accessing the Micantis API, please contact us at info@micantis.io.

---

## 📦 Installation

```bash
pip install micantis
```

### Optional: Parquet Support

For parquet file downloads and metadata extraction, install with parquet support:

```bash
pip install "micantis[parquet]"
```

Or install `pyarrow` separately:

```bash
pip install pyarrow
```
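
If you're not sure whether Parquet support is present at runtime, a quick dependency probe (plain Python, not part of the micantis API) avoids surprises before calling the Parquet methods:

```python
# Probe for the optional pyarrow dependency before using Parquet features
try:
    import pyarrow  # noqa: F401
    HAS_PARQUET = True
except ImportError:
    HAS_PARQUET = False

print(f"Parquet support available: {HAS_PARQUET}")
```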

---

## 💻 Examples

### Import functions

```python
import pandas as pd
from micantis import MicantisAPI
```

### Initialize API

```python
# Option 1 - log in with a username and password
service_url = 'your service url'
username = 'your username'
password = 'your password'

api = MicantisAPI(service_url=service_url, username=username, password=password)
```

```python
# Option 2 - log in with Microsoft Entra ID
SERVICE = 'your service url'
CLIENT_ID = 'your client id'
AUTHORITY = 'https://login.microsoftonline.com/organizations'
SCOPES = ['your scopes']

api = MicantisAPI(service_url=SERVICE, client_id=CLIENT_ID, authority=AUTHORITY, scopes=SCOPES)
```

```python
# Option 3 - Use pre-existing token (for containerized apps)
service_url = 'your service url'
token = 'your bearer token'

api = MicantisAPI(service_url=service_url, token=token)
# No need to call authenticate() when using a token
```

```python
# Option 4 - Use environment variables (recommended for containers/CI/CD)
# Set these environment variables:
# - WORKBOOK_API_URL: your service URL
# - WORKBOOK_TOKEN: your bearer token

api = MicantisAPI()  # Will automatically use environment variables
# No need to call authenticate() when using a token
```

### Authenticate API

```python
api.authenticate()
```

**Note:** When using a pre-existing token (Option 3 or 4), you don't need to call `authenticate()` as the token is already configured.
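
Putting the pieces together, a minimal end-to-end flow with username/password credentials (Option 1) looks like this:

```python
from micantis import MicantisAPI

api = MicantisAPI(service_url=service_url, username=username, password=password)
api.authenticate()

# Any documented call works once the session is established
table = api.get_data_table(limit=10)
```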

### Download Data Table Summary

#### Optional parameters
- `search`: Search string (same syntax as the Micantis WebApp)
- `barcode`: Filter by barcode
- `limit`: Number of results to return (default: 500)
- `show_ignored`: Include soft-deleted files (default: `True`)
- `min_start_date`: Only return results whose test started on or after this date (ISO format)
- `max_start_date`: Only return results whose test started on or before this date (ISO format)
- `min_end_date`: Only return results whose test ended on or after this date (ISO format)
- `max_end_date`: Only return results whose test ended on or before this date (ISO format)
- `station`: Filter by test station name
- `channel`: Filter by channel
- `cell_test_name`: Filter by cell test name
- `data_kind`: Filter by data type (e.g. `"CycleTesterData"`, `"StitchedData"`, `"PotentiostatData"`, `"FileData"`)

```python
# Basic search
table = api.get_data_table(search="your search string", limit=100)

# Filter by start date range
table = api.get_data_table(min_start_date="2025-01-01", max_start_date="2025-06-01")

# Filter by end date range
table = api.get_data_table(min_end_date="2025-01-01", max_end_date="2025-06-01")

# Filter by station and data type
table = api.get_data_table(station="your station name", data_kind="CycleTesterData")

# Filter by cell test name
table = api.get_data_table(cell_test_name="your cell test name")

# Combine multiple filters
table = api.get_data_table(
    min_start_date="2025-01-01",
    max_start_date="2025-12-31",
    data_kind="CycleTesterData",
    search="your search string",
    limit=100
)
```

### Download Binary Files

```python
# Download a single file

file_id = 'File ID obtained from data table, id column'
df = api.download_binary_file(file_id)
```

```python
# Download many files using the list of IDs from the table

file_id_list = table['id'].to_list()
data = []

for file_id in file_id_list:
    df = api.download_binary_file(file_id)
    data.append(df)

all_data = pd.concat(data)
```

### Download CSV Files

```python
# Download a single file

file_id = 'File ID obtained from data table, id column'
df = api.download_csv_file(file_id)
```

```python
# Download multiple files

file_id_list = table['id'].to_list()
data = []

for file_id in file_id_list:
    df = api.download_csv_file(file_id)
    data.append(df)

all_data = pd.concat(data)
```

### Download Parquet Files

Download cycle tester data as Apache Parquet files for efficient analysis. Parquet files are smaller, faster, and include embedded metadata.

#### Optional parameters
- `cycle_ranges`: Filter by cycle index (see examples below)
- `test_time_start`: Filter by test time start (seconds from test start)
- `test_time_end`: Filter by test time end (seconds from test start)
- `line_number_start`: Filter by line number start
- `line_number_end`: Filter by line number end
- `include_auxiliary_data`: Include auxiliary channels like temperature (default: `True`)
- `output_path`: Custom file path (default: uses cell_data_id as filename)
- `return_type`: What to return - `'dataframe'` (default), `'dict'`, `'path'`, or `'bytes'`

#### Return Type Options
- **`'dataframe'`** (default): Saves file and returns pandas DataFrame - best for immediate analysis
- **`'dict'`**: Saves file and returns dict with data, metadata, and cycle_summaries - best when you need metadata (requires `pyarrow`)
- **`'path'`**: Saves file and returns path string - best for large files or batch processing
- **`'bytes'`**: Returns raw bytes without saving - best for direct cloud uploads (Databricks, Azure Blob, S3)

```python
# Download and get DataFrame (default)
file_id = 'File ID obtained from data table, id column'
df = api.download_parquet_file(file_id)
```

```python
# Get data + metadata in one call
result = api.download_parquet_file(file_id, return_type='dict')

df = result['data']                    # Cycle test data
metadata = result['metadata']          # Cell metadata (name, barcode, timestamps, etc.)
cycle_summaries = result['cycle_summaries']  # Per-cycle summary statistics
```

```python
# Save file and get path (memory efficient for large files)
path = api.download_parquet_file(file_id, return_type='path')

# Later, read when needed
df = pd.read_parquet(path)
```

```python
# Get raw bytes for direct cloud upload (no local file)
parquet_bytes = api.download_parquet_file(file_id, return_type='bytes')

# Upload to Azure Blob Storage
blob_client.upload_blob(name='test_data.parquet', data=parquet_bytes)

# Or read directly into DataFrame
import io
df = pd.read_parquet(io.BytesIO(parquet_bytes))
```

#### Cycle Range Filtering

Filter data by specific cycles or cycle ranges using the `cycle_ranges` parameter.

```python
# Download only cycles 1-10
df = api.download_parquet_file(
    file_id,
    cycle_ranges=[{"RangeStart": 1, "RangeEnd": 10}]
)
```

```python
# Download last 5 cycles
df = api.download_parquet_file(
    file_id,
    cycle_ranges=[{
        "RangeStart": 5,
        "IsStartFromBack": True,
        "RangeEnd": 1,
        "IsEndFromBack": True
    }]
)
```

```python
# Download specific cycles (1, 5, 10, 50)
df = api.download_parquet_file(
    file_id,
    cycle_ranges=[
        {"Single": 1},
        {"Single": 5},
        {"Single": 10},
        {"Single": 50}
    ]
)
```

```python
# Download first hour of data
df = api.download_parquet_file(
    file_id,
    test_time_start=0,
    test_time_end=3600
)
```

#### Extract Metadata from Parquet Files

Parquet files contain embedded metadata including cell info, timestamps, cycle counts, and per-cycle summaries. Extract this metadata using `unpack_parquet()` (requires `pyarrow`).

```python
# From a saved file
result = api.unpack_parquet('file.parquet')

df = result['data']                    # Cycle test data
metadata = result['metadata']          # Cell metadata (name, barcode, timestamps, etc.)
cycle_summaries = result['cycle_summaries']  # Per-cycle summary statistics
```

```python
# From bytes (no file needed)
parquet_bytes = api.download_parquet_file(file_id, return_type='bytes')
result = api.unpack_parquet(parquet_bytes)

df = result['data']
metadata = result['metadata']
cycle_summaries = result['cycle_summaries']
```

```python
# Extract and save metadata as CSV files for easy viewing
result = api.unpack_parquet('file.parquet', save_metadata=True)

# Creates:
# - file_metadata.csv
# - file_cycle_summaries.csv
```

```python
# Batch processing: Download multiple files without loading into memory
file_ids = table['id'].head(10).to_list()
paths = []

for file_id in file_ids:
    path = api.download_parquet_file(file_id, return_type='path')
    paths.append(path)

# Later, process files one at a time (memory efficient)
for path in paths:
    result = api.unpack_parquet(path)
    df = result['data']
    # Process df...
```

## Cells Table
### Download Cell ID Information
Retrieve a list of cell names and GUIDs from the Micantis database with flexible filtering options.

#### Optional parameters
- `search`: Search string (same syntax as the Micantis WebApp)
- `barcode`: Search for a specific barcode
- `limit`: Number of results to return (default: 500)
- `min_date`: Only return results after this date
- `max_date`: Only return results before this date
- `show_ignored`: Include soft-deleted files (default: `True`)

```python
search = "*NPD*"
cells_df = api.get_cells_list(search=search)
cells_df.head()
```
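
The other filters combine in the same way; for example (dates and limit are placeholders):

```python
# Filter by date range and raise the result limit
cells_df = api.get_cells_list(
    min_date="2025-01-01",
    max_date="2025-06-30",
    limit=1000
)
```
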
### Download Cell Metadata

Fetch per-cell metadata and return a clean, wide-format DataFrame.

#### Parameters:
- `cell_ids`: **List[str]**  
  List of cell test GUIDs (**required**)

- `metadata`: **List[str] (optional)**  
  List of metadata **names** (e.g., `"OCV (V)"`) or **IDs**.  
  If omitted, all non-image metadata will be returned by default.

- `return_images`: **bool (optional)**  
  If `True`, includes image metadata fields. Default is `False`.

---

#### 📘 Examples

```python
# Example 1: Get all non-image metadata for a list of cells
cell_ids = cells_df["id"].to_list()
cell_metadata_df = api.get_cell_metadata(cell_ids=cell_ids)
```
```python
# Example 2: Get specific metadata fields by name
cell_metadata_df = api.get_cell_metadata(
    cell_ids=cell_ids,
    metadata=["Cell width", "Cell height"],
    return_images=False
)
```
```python
# Merge the cell metadata table with cell names to get a clean DataFrame
# Map each id to its cell name (added as the last column)
id_to_name = dict(zip(cells_df['id'], cells_df['name']))
cell_metadata_df['cell_name'] = cell_metadata_df['id'].map(id_to_name)
cell_metadata_df.head()
```

## Specifications Table
### Download Specifications List
Retrieve specifications with their associated user properties.

```python
# Get all specifications with their user properties
specs_df = api.get_specifications_table()
specs_df.head()
```

## Test Management
### Download Test Requests List
Retrieve test request data with flexible date filtering.

#### Optional parameters
- `since`: Date string in several formats (defaults to January 1, 2020 if not provided)
  - Full month names: `"May 1, 2025"`, `"January 15, 2024"`
  - ISO format: `"2025-05-01"` (two-digit years such as `"25-05-01"` are also accepted)

```python
# Get all test requests (defaults to since 2020-01-01)
test_requests = api.get_test_request_list()

# Get test requests since a specific date using month name
test_requests = api.get_test_request_list(since="May 1, 2024")

# Get test requests using ISO format
test_requests = api.get_test_request_list(since="2024-05-01")
```

### Download Failed Test Requests
Retrieve only failed test requests with the same date filtering options.

```python
# Get failed test requests since a specific date
failed_requests = api.get_failed_test_requests(since="January 1, 2024")
failed_requests.head()
```

### Get Individual Test Request Details
Retrieve full details for a specific test request by ID.

**New Feature:** Multiple output format options for better data analysis!

#### Format Options
- `return_format='dict'`: Raw dictionary (default, backwards compatible)
- `return_format='dataframes'`: Returns 3 DataFrames - summary, tests, and status_log ⭐ **Recommended**
- `return_format='flat'`: Single-row DataFrame with basic info

```python
# Option 1: Dictionary format (default, backwards compatible)
request_id = "your-test-request-guid"
test_details = api.get_test_request(request_id)

# Option 2: DataFrames format (recommended for analysis) ⭐
test_details = api.get_test_request(request_id, return_format='dataframes')
print(test_details['summary'])      # Basic request information
print(test_details['tests'])        # All requested tests
print(test_details['status_log'])   # Status change history

# Option 3: Flat DataFrame (best for combining multiple requests)
test_details = api.get_test_request(request_id, return_format='flat')
```

#### Batch Processing Multiple Requests
```python
# Get summaries for multiple test requests
request_ids = test_requests['id'].head(10).to_list()

all_summaries = []
for req_id in request_ids:
    summary = api.get_test_request(req_id, return_format='flat')
    all_summaries.append(summary)

# Combine into single DataFrame
combined_df = pd.concat(all_summaries, ignore_index=True)
print(f"Retrieved {len(combined_df)} test requests")
combined_df.head()
```

## Write Cell Metadata
Micantis lets you programmatically assign or update metadata for each cell using either:
- the human-readable field name (e.g., `"Technician"`, `"Weight (g)"`)
- or the internal `propertyDefinitionId` (UUID)

#### 📘 Examples

```python
# Example 1: Update metadata fields by name for a cell
changes = [
    {
        "id": "your-cell-test-guid-here",  # cell test GUID
        "field": "Technician",
        "value": "Mykela"
    },
    {
        "id": "your-cell-test-guid-here",
        "field": "Weight (g)",
        "value": 98.7
    }
]

api.write_cell_metadata(changes=changes)

# Verify the changes
api.get_cell_metadata(cell_ids=["your-cell-test-guid-here"], metadata=['Weight (g)', 'Technician'])
```

```python
# Example 2: Update using propertyDefinitionId (advanced)
changes = [
    {
        "id": "your-cell-test-guid-here",
        "propertyDefinitionId": "your-property-definition-guid",
        "value": 98.7
    }
]

api.write_cell_metadata(changes=changes)

# Verify the changes
api.get_cell_metadata(cell_ids=["your-cell-test-guid-here"], metadata=['Weight (g)', 'Technician'])
```

## Stitch Data
Combine multiple data sets into a single stitched data set. This is useful for creating continuous test data from multiple separate test runs.

#### Parameters
- `name`: **str (required)**
  Name for the stitched data set

- `cell_data_ids`: **List[str] (required)**
  List of cell data GUIDs to stitch together

- `increment_cycle_number`: **bool (optional)**
  Whether to increment cycle numbers when stitching. Default is `False`.

- `advanced_mode`: **bool (optional)**
  Advanced mode for manual ordering of data sets. Default is `False`.

- `archive_source_data`: **bool (optional)**
  Archive (soft delete) the source data sets after stitching. Default is `False`.

- `id`: **str (optional)**
  Optional ID for updating an existing stitched data set. Leave `None` to create new.

- `allow_async`: **bool (optional)**
  If `True`, runs asynchronously and returns job ID. If `False`, waits for completion. Default is `False`.

#### Returns
- If `allow_async=False`: Dictionary with `'stitched_data_id'`
- If `allow_async=True`: Dictionary with `'job_id'`

#### 📘 Examples

```python
# Example 1: Stitch multiple test runs together
cell_data_ids = ["guid1", "guid2", "guid3"]

result = api.stitch_data(
    name="Combined Test Data",
    cell_data_ids=cell_data_ids,
    increment_cycle_number=True
)

print(f"Stitched data ID: {result['stitched_data_id']}")
```

```python
# Example 2: Stitch and archive source data
result = api.stitch_data(
    name="Complete Test Sequence",
    cell_data_ids=cell_data_ids,
    increment_cycle_number=True,
    archive_source_data=True  # Source files will be soft-deleted
)

# Download the stitched result
stitched_df = api.download_parquet_file(result['stitched_data_id'])
```

```python
# Example 3: Async mode for large data sets
result = api.stitch_data(
    name="Large Combined Dataset",
    cell_data_ids=large_cell_data_list,
    allow_async=True
)

# Poll until complete, then get the stitched data ID
stitched_id = api.wait_for_job(result['job_id'])
print(f"Stitched data ID: {stitched_id}")
```

## Archive Data
Archive (soft delete) data sets. Archived data is hidden from the default list view but not permanently deleted and can be unarchived.

**Note:** The `archive_data()` method toggles the archive status - calling it on an archived file will unarchive it.

#### Parameters
- `cell_data_id`: **str (required)**
  Cell data GUID to archive/unarchive

#### Returns
- Dictionary with `'id'` and `'archived'` (bool indicating new state)

#### 📘 Examples

```python
# Example 1: Archive a single data set
result = api.archive_data(cell_data_id="your-guid-here")
print(f"File archived: {result['archived']}")  # True if now archived
```

```python
# Example 2: Archive multiple data sets
cell_data_ids = ["guid1", "guid2", "guid3"]

for cell_id in cell_data_ids:
    result = api.archive_data(cell_id)
    print(f"Archived {cell_id}: {result['archived']}")
```

```python
# Example 3: Unarchive by calling again
result = api.archive_data(cell_data_id="your-guid-here")
print(f"File archived: {result['archived']}")  # False if now unarchived
```

```python
# Example 4: Combined workflow - stitch and archive
# Stitch data with automatic archiving
result = api.stitch_data(
    name="Combined Data",
    cell_data_ids=cell_ids,
    archive_source_data=True  # Automatically archives source files
)

# Or manually archive after stitching
result = api.stitch_data(name="Combined Data", cell_data_ids=cell_ids)
for cell_id in cell_ids:
    api.archive_data(cell_id)
```

## Download CSV (Graph Request)
Download time-series data as a CSV file using a graph request definition. Useful for exporting specific channels or time windows from one or more data sets.

#### Parameters
- `graph_request`: **dict (required)**
  Graph request payload specifying which data sets and axes to export
- `output_path`: **str (optional)**
  Path to save the CSV file. Defaults to `'download.csv'`
- `return_type`: **str (optional)**
  `'dataframe'` (default), `'path'`, or `'bytes'`

#### Return Type Options
- **`'dataframe'`** (default): Saves file and returns pandas DataFrame
- **`'path'`**: Saves file and returns path string
- **`'bytes'`**: Returns raw bytes without saving — best for cloud uploads

#### 📘 Examples

```python
# Download as DataFrame (default)
graph_request = {
    "items": [{"id": "your-file-guid", "yAxis": 0, "xAxis": 0}],
    "xMin": None,
    "xMax": None,
}

df = api.download_csv(graph_request=graph_request)
```

```python
# Save to a specific path
path = api.download_csv(
    graph_request=graph_request,
    output_path="my_data.csv",
    return_type='path'
)
```

```python
# Get raw bytes for cloud upload
csv_bytes = api.download_csv(graph_request=graph_request, return_type='bytes')
blob_client.upload_blob(name='data.csv', data=csv_bytes)
```

## Clean Data
Re-process one or more data sets using one of three cleaning modes. Returns the IDs of the newly created cleaned data sets.

> **Note:** Data Cleaning is available in version 2.17+, expected to release end of March 2026. Please contact Micantis with questions about upgrading your system.

#### Parameters
- `source_ids`: **List[str] (required)**
  List of cell data GUIDs to clean
- `mode`: **str (optional)**
  Cleaning mode. One of `'CycleAutoFixup'` (default), `'FilterCycles'`, or `'Parametric'`
- `filter_cycles_definition`: **dict (optional)**
  Required when `mode='FilterCycles'`
- `parametric_definition`: **dict (optional)**
  Required when `mode='Parametric'`. Must include an `'actions'` list
- `allow_async`: **bool (optional)**
  If `True`, runs asynchronously and returns a job ID. Default is `False`
- `timeout`: **int (optional)**
  Request timeout in seconds. Default is `60`

#### Returns
- If `allow_async=False`: Dictionary with `'cleaned_data_ids'`
- If `allow_async=True`: Dictionary with `'job_id'`

#### 📘 Examples

```python
# Example 1: Recompute cycle boundaries (most common)
result = api.clean_data(
    source_ids=["your-guid-here"],
    mode='CycleAutoFixup'
)
print(f"Cleaned data IDs: {result['cleaned_data_ids']}")
```

```python
# Example 2: Remove non-nominal cycles
result = api.clean_data(
    source_ids=["your-guid-here"],
    mode='FilterCycles',
    filter_cycles_definition={}
)
```

```python
# Example 3: Parametric — remove specific cycles
result = api.clean_data(
    source_ids=["your-guid-here"],
    mode='Parametric',
    parametric_definition={
        "actions": [
            {
                "$type": "removeSteps",
                "name": "Remove first 3 cycles",
                "cycleRanges": [{"RangeStart": 1, "RangeEnd": 3}]
            }
        ]
    }
)
```

```python
# Example 4: Async mode for large data sets
result = api.clean_data(
    source_ids=["your-guid-here"],
    mode='CycleAutoFixup',
    allow_async=True
)

# Poll until complete, then get the cleaned data IDs
cleaned_ids = api.wait_for_job(result['job_id'])
print(f"Cleaned data IDs: {cleaned_ids}")
```

## Job Management
Monitor and control background jobs created by async operations (`stitch_data`, `clean_data`).

### `wait_for_job` — recommended for most use cases
Polls a job until it finishes and returns the result in one call.

#### Parameters
- `job_id`: **str (required)** — GUID returned by an async operation
- `poll_interval`: **int (optional)** — seconds between status checks (default: `2`)
- `timeout`: **int (optional)** — maximum seconds to wait before giving up (default: `300`)

```python
result = api.stitch_data(name="My Stitch", cell_data_ids=[...], allow_async=True)
stitched_id = api.wait_for_job(result['job_id'])
```

### `get_job_status` — check status without blocking
Returns the current status without waiting.

```python
status = api.get_job_status(job_id)
# {
#   'id': '...',
#   'status': 'Running',   # Created | Running | Completed | Failed | Cancelled
#   'progress': 42,        # 0–100
#   'errorMessage': None
# }
print(f"Progress: {status['progress']}%")
```

### `get_job_result` — fetch the result of a completed job
Returns the same value as `wait_for_job` once the job has finished. Returns `None` with a warning if the job is still running.

```python
result = api.get_job_result(job_id)
```

### `cancel_job` — cancel a running job

```python
api.cancel_job(job_id)
```

### Full async polling example

```python
# Submit async
result = api.clean_data(source_ids=["guid1", "guid2"], mode='CycleAutoFixup', allow_async=True)
job_id = result['job_id']

# Option A: wait and get result automatically
cleaned_ids = api.wait_for_job(job_id, poll_interval=5, timeout=600)

# Option B: manual polling loop
import time
while True:
    status = api.get_job_status(job_id)
    print(f"  {status['status']} — {status['progress']}%")
    if status['status'] in ('Completed', 'Failed', 'Cancelled'):
        break
    time.sleep(5)

if status['status'] == 'Completed':
    cleaned_ids = api.get_job_result(job_id)
```

---

## Data Changelog
Retrieve a list of all data sets that have changed since a given timestamp. Useful for syncing or auditing data updates.

#### Parameters
- `since`: **str (required)**
  ISO 8601 timestamp. Only items modified after this time are returned.
  Example: `'2025-01-01T00:00:00Z'`
- `return_type`: **str (optional)**
  `'df'` (default) returns a flat DataFrame. `'dict'` returns the raw response
- `timeout`: **int (optional)**
  Request timeout in seconds. Default is `120`. Increase for wide date ranges

#### 📘 Examples

```python
# Get all changes since a date as a DataFrame
changelog = api.get_changelog(since="2025-01-01T00:00:00Z")
print(f"Changed files: {len(changelog)}")
changelog.head()
```

```python
# Get raw dict response
changelog = api.get_changelog(since="2025-06-01T00:00:00Z", return_type='dict')
print(changelog['items'])
```

```python
# Wide date range — increase timeout
changelog = api.get_changelog(since="2024-01-01T00:00:00Z", timeout=300)
```
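
A common use is an incremental sync loop. The sketch below assumes the changelog DataFrame exposes an `id` column for each changed data set; adjust the column name to match your actual response:

```python
from datetime import datetime, timezone

# Load the last sync timestamp from your own storage
last_sync = "2025-01-01T00:00:00Z"

changelog = api.get_changelog(since=last_sync)
for file_id in changelog['id']:  # assumed column name
    df = api.download_parquet_file(file_id)
    # ... refresh your local copy of this data set ...

# Persist the new watermark for the next run
last_sync = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
```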

## Duplicate Files
Retrieve groups of files with similar names that may be duplicates. Each group is a list of file records.

#### 📘 Examples

```python
# Get all duplicate file groups
dupes = api.get_duplicate_files()
print(f"Duplicate groups found: {len(dupes)}")

# Inspect the first group
for item in dupes[0]:
    print(f"  {item['name']} — {item['id']}")
```

```python
# Archive all but the first file in each duplicate group
for group in dupes:
    for item in group[1:]:  # Keep the first, archive the rest
        api.archive_data(item['id'])
```

## Upload Python Execution Artifact
Attach an output file (PNG, CSV, XLSX, etc.) to a running or completed Python script execution. Called from within a Python script running in the Micantis environment.

#### Parameters
- `execution_id`: **str (required)**
  The execution ID (GUID) of the running or completed execution
- `file_path`: **str (required)**
  Local path to the file to upload. Maximum size: 50 MB
- `title`: **str (required)**
  Human-readable title for the artifact
- `filename`: **str (optional)**
  Override the filename stored on the server. If not provided, derived from the title and file extension

#### Returns
- Dictionary with `'name'`, `'title'`, `'contentType'`, and `'sizeBytes'`

#### 📘 Examples

```python
# Upload a plot generated during a Python execution
result = api.upload_execution_artifact(
    execution_id="your-execution-guid",
    file_path="voltage_plot.png",
    title="Voltage vs Time"
)
print(f"Uploaded: {result['name']} ({result['sizeBytes']:,} bytes)")
```

```python
# Upload a CSV results file with a custom filename
result = api.upload_execution_artifact(
    execution_id="your-execution-guid",
    file_path="results.csv",
    title="Cycle Summary Results",
    filename="cycle_summary_2025.csv"
)
```
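
A typical end-to-end pattern is to render a figure first and then attach it. The sketch below assumes matplotlib is available in the execution environment and uses placeholder column names (`test_time`, `voltage`):

```python
import matplotlib.pyplot as plt

# Render a plot to a local file, then attach it to the execution
fig, ax = plt.subplots()
ax.plot(df['test_time'], df['voltage'])  # placeholder column names
ax.set_xlabel('Test time (s)')
ax.set_ylabel('Voltage (V)')
fig.savefig('voltage_plot.png', dpi=150)

api.upload_execution_artifact(
    execution_id="your-execution-guid",
    file_path="voltage_plot.png",
    title="Voltage vs Time"
)
```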

