Metadata-Version: 2.1
Name: cirro
Version: 1.2.7
Summary: CLI tool and SDK for interacting with the Cirro platform
Home-page: https://github.com/CirroBio/Cirro-client
License: MIT
Author: Cirro Bio
Author-email: support@cirro.bio
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: attrs (>=21.3.0)
Requires-Dist: boto3 (>=1.35,<2.0)
Requires-Dist: cirro_api_client (==0.2.1)
Requires-Dist: click (>=8.1.3,<9.0.0)
Requires-Dist: jsonschema (>=4.21.1,<5.0.0)
Requires-Dist: msal-extensions (>=1.0.0,<2.0.0)
Requires-Dist: pandas (>=2.2.0,<3.0.0)
Requires-Dist: pyjwt (>=2.10.0,<3.0.0)
Requires-Dist: questionary (>=2.0.1,<3.0.0)
Requires-Dist: requests (>=2.31.0,<3.0.0)
Requires-Dist: s3fs (==2024.10.0)
Requires-Dist: tqdm (>=4.62.3,<5.0.0)
Project-URL: Repository, https://github.com/CirroBio/Cirro-client
Description-Content-Type: text/markdown

# Cirro Client

[![Build Python package](https://github.com/FredHutch/Cirro-client/actions/workflows/package.yml/badge.svg)](https://github.com/FredHutch/Cirro-client/actions/workflows/package.yml)
[![Lint and run tests](https://github.com/FredHutch/Cirro-client/actions/workflows/lint.yml/badge.svg)](https://github.com/FredHutch/Cirro-client/actions/workflows/lint.yml)
![](https://img.shields.io/pypi/v/cirro.svg)
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=CirroBio_Cirro-client&metric=alert_status)](https://sonarcloud.io/summary/new_code?id=CirroBio_Cirro-client)

A Python 3.9+ library for the Cirro platform.

## Installation

You can install Cirro using pip:

`pip install cirro`

or you can install the main branch of the repo by running:

`pip install git+https://github.com/CirroBio/Cirro-client.git`

## Authentication

Upon first use, the Cirro client will ask you what Cirro instance to use and if you would like to save your login information.
It will then give you a link to authenticate through the web browser.

You can change your Cirro instance by running `cirro configure` and selecting the desired instance.

If you need to change your credentials after this point, and you've opted to save your login, please see the [clearing saved login](#clearing-saved-login) section.

## Command Line Usage

#### Downloading a dataset:
```bash
Usage: cirro download [OPTIONS]

  Download dataset files

Options:
  --project TEXT         Name or ID of the project
  --dataset TEXT         ID of the dataset
  --data-directory TEXT  Directory to store the files
  -i, --interactive      Gather arguments interactively
  --help                 Show this message and exit.
```

#### Uploading a dataset:
```bash
Usage: cirro upload [OPTIONS]

  Upload and create a dataset

Options:
  --name TEXT             Name of the dataset
  --description TEXT      Description of the dataset (optional)
  --project TEXT          Name or ID of the project
  --process TEXT          Name or ID of the ingest process
  --data-directory TEXT   Directory you wish to upload
  -i, --interactive       Gather arguments interactively
  --help                  Show this message and exit.
```

#### Listing datasets:
```bash
Usage: cirro list-datasets [OPTIONS]

  List available datasets

Options:
  --project TEXT         ID of the project
  -i, --interactive      Gather arguments interactively
  --help                 Show this message and exit.
```

### Interactive Commands

When running a command, you can specify the `--interactive` flag to gather the command arguments interactively.

Example:

```bash
$ cirro upload --interactive
? What project is this dataset associated with?  Test project
? Enter the full path of the data directory  /shared/biodata/test
? Please confirm that you wish to upload 20 files (0.630 GB)  Yes
? What type of files?  Illumina Sequencing Run
? What is the name of this dataset?  test
? Enter a description of the dataset (optional)
```

## Python Usage

See the following set of Jupyter notebooks that contain examples on the following topics:

| Jupyter Notebook                                                   | Topic                                |
|--------------------------------------------------------------------|--------------------------------------|
| [Introduction](samples/Getting_started.ipynb)                      | Installing and authenticating        |
| [Uploading a dataset](samples/Uploading_a_dataset.ipynb)           | Uploading data                       |
| [Downloading a dataset](samples/Downloading_a_dataset.ipynb)       | Downloading data                     |
| [Interacting with a dataset](samples/Interacting_with_files.ipynb) | Calling data and reading into tables |
| [Analyzing a dataset](samples/Analyzing_a_dataset.ipynb)           | Running analysis pipelines           |
| [Using references](samples/Using_references.ipynb)                 | Managing reference data              |
| [Advanced usage](samples/Advanced_usage.ipynb)                     | Advanced operations                  |

## R Usage

| Jupyter Notebook                                    | Topic               |
|-----------------------------------------------------|---------------------|
| [Downloading a dataset in R](samples/Using-R.ipynb) | Reading data with R |

## Advanced Usage

View the API documentation for this library [here](https://cirrobio.github.io/Cirro-client/).

### Supported environment variables

| Name           | Description                   | Default  |
|----------------|-------------------------------|----------|
| CIRRO_HOME     | Local configuration directory | ~/.cirro |
| CIRRO_BASE_URL | Base URL of the data portal   |          |

### Configuration

The `cirro configure` command creates a file in `CIRRO_HOME` called `config.ini`.

You can set the `base_url` property in the config file rather than using the environment variable. 

The `transfer_max_retries` configuration property specifies the maximum number of times to attempt uploading a file to Cirro in the event of a transfer failure. 
When uploading files to Cirro, network issues or temporary outages can occasionally cause a transfer to fail.
It will pause for an increasing amount of time for each retry attempt.

The `enable_additional_checksums` property manages the utilization of SHA-256 hashing for enhanced data integrity. 
This feature computes the SHA-256 hash of a file during the upload process, and subsequently cross-validates it with the server upon completion.
When retrieving files, it ensures that the hash received matches the server's stored hash.
The default hashing algorithm for files is MD5. In many cases, MD5 is sufficient to ensure data integrity upon upload.

```ini
[General]
base_url = cirro.bio
transfer_max_retries = 15
enable_additional_checksums = true
```

### Clearing saved login

You can clear your saved login information by removing the `~/.cirro/token.dat` file from your system or
by running `cirro configure` and selecting **No** when it asks if you'd like to save your login information.

