Metadata-Version: 2.4
Name: bioblueprint
Version: 1.2.4
Summary: A workflow dependency graph compiler and automation enabler
License: GPL-3.0
License-File: LICENSE
Keywords: bioinformatics
Author: Zachary Konkel
Author-email: zachary.konkel@theiagen.com
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: gitpython (>=3.1.45,<4.0.0)
Requires-Dist: matplotlib (>=3.10.6,<4.0.0)
Requires-Dist: miniwdl (>=1.13.0,<2.0.0)
Requires-Dist: networkx (>=3.5,<4.0)
Requires-Dist: pyyaml (>=6.0.2,<7.0.0)
Description-Content-Type: text/markdown

# bioblueprint

bioblueprint is a Python library designed to enable workflow-language-interchangeable dependency graph compilation and development automation. It compiles workflows and their dependencies into a graph, computes the Git diff between two branches, and uses the modified files to trace testing paths within the dependency graph.

**NOTE**: bioblueprint operates on *local* branches.

<br>

## Install

`python3 -m pip install bioblueprint`

<br>

## Usage

Run bioblueprint after a development iteration or to automate testing, *then* edit the "DESCRIPTION" field in the I/O TSVs. You can also run it after PRs have been generated.

Please see the help menu for a comprehensive list of input options.

`bioblueprint -i <REPO_BASE_DIR> -d <DEVELOPMENT_BRANCH>`

**DEVELOPMENT**: `-d` is the dev branch; `-s` is `main` (default)

**VALIDATION**: `-d` is `main`; `-s` is the previous release tag

**PULL REQUESTS**: A blank pull request is generated by default; append `-pr #` to pull and use an existing PR instead.
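The three modes above differ only in which flags are passed. As a minimal sketch, the helper below assembles the command line for each mode using only the `-i`, `-d`, `-s`, and `-pr` flags described in this section. The function name `build_command`, the repository path, the branch name, and the release tag are all hypothetical placeholders, not part of bioblueprint.

```python
import shlex


def build_command(repo_dir, dev_branch, source_ref=None, pr_number=None):
    """Assemble a bioblueprint command line from the flags described above."""
    cmd = ["bioblueprint", "-i", repo_dir, "-d", dev_branch]
    if source_ref is not None:
        # -s defaults to main, so it is only added when given explicitly
        cmd += ["-s", source_ref]
    if pr_number is not None:
        # -pr pulls and uses an existing PR instead of a blank template
        cmd += ["-pr", str(pr_number)]
    return cmd


# Development: compare a feature branch (placeholder name) against main
dev_cmd = build_command("path/to/repo", "my-feature-branch")

# Validation: compare main against the previous release tag (placeholder tag)
val_cmd = build_command("path/to/repo", "main", source_ref="v1.0.0")

print(shlex.join(dev_cmd))
print(shlex.join(val_cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the command.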

<br>

## Outputs

An output directory `bioblueprint_YYYYmmdd/` will be generated containing the following files:

### `<REPO>.pr.md`

A populated pull request template with I/O modifications, workflow modifications, and testing paths. If `-pr` is specified, the PR is downloaded and the relevant fields are populated with I/O and testing information. Existing testing data is retained unmodified if it is formatted as a checklist in which the exact workflow name is the first entry following the markdown checkbox (links are permitted). This function is tailored for the following supported repositories:

- [Public Health Bioinformatics](https://github.com/theiagen/public_health_bioinformatics)
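To illustrate the checklist format that allows existing testing data to be retained, the sketch below parses a single markdown checklist line and extracts the workflow name that follows the checkbox, whether or not it is wrapped in a link. This is an assumption about what a conforming line looks like, not bioblueprint's actual parser. The regular expression, the function name `parse_checklist_line`, and the workflow names are hypothetical.

```python
import re

# Assumed line shape: "- [ ] Workflow_Name ..." or "- [x] [Workflow_Name](url) ..."
CHECKBOX = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+\[?([\w.-]+)")


def parse_checklist_line(line):
    """Return (workflow_name, is_checked), or None if the line is not a checklist item."""
    match = CHECKBOX.match(line)
    if match is None:
        return None
    return match.group(2), match.group(1).lower() == "x"


print(parse_checklist_line("- [x] [Workflow_B](https://example.org) tested"))
print(parse_checklist_line("- [ ] Workflow_A: pending"))
print(parse_checklist_line("Free-form testing notes"))
```

A line whose first entry after the checkbox is anything other than the workflow name would yield a different name here, which is consistent with the requirement that the name come first and match exactly.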

### `<REPO>_inputs.tsv` & `<REPO>_outputs.tsv`

Updated inputs/outputs tables for Public Health Bioinformatics.


### `testing/<WORKFLOW>.testing.json`

A JSON file of testing parameters (designed for [bioforklift](https://github.com/theiagen/bioforklift) integration) for hosted workflows:

```json
{
  "<WF_NAME>": {
    "path": "<PATH_RELATIVE_TO_REPO>",
    "modified": [
        "<TASK/WF_1>",
        ..
    ],
    "workflow_name": "<HOSTED_NAME>",
    "branch": "<DERIVED_BRANCH>",
    "repository": "<OWNER/REPOSITORY>",
    "table": "<WF_NAME>_<TESTING_SUFFIX>",
    "comment": "PR: <PR#>",
    "input_json": "<REPOSITORY_INPUTS_JSON>",
    "output_json": "<AVAILABLE_OUTPUTS>"
  },
  ..
}
```

The `input_json` is derived from an inputs JSON file hosted in the repository; that file corresponds to the testing table hosted in the Terra workspace.
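As a rough sketch of how the testing JSON might be consumed, the snippet below loads a document shaped like the template above and lists, for each workflow, its branch and the number of modified tasks or workflows. All values in `EXAMPLE` are hypothetical placeholders. The `input_json` and `output_json` fields are left out because their exact contents are not specified here. `summarize_testing` is an illustrative helper, not part of bioblueprint or bioforklift.

```python
import json

# Hypothetical contents shaped like the template above; every value is a placeholder
EXAMPLE = """
{
  "Workflow_A": {
    "path": "workflows/workflow_a.wdl",
    "modified": ["task_one", "task_two"],
    "workflow_name": "Workflow_A_Hosted",
    "branch": "my-feature-branch",
    "repository": "owner/repository",
    "table": "Workflow_A_testing",
    "comment": "PR: 1"
  }
}
"""


def summarize_testing(testing):
    """Return one (workflow, branch, modified count) tuple per workflow entry."""
    return [
        (name, entry.get("branch"), len(entry.get("modified", [])))
        for name, entry in testing.items()
    ]


for row in summarize_testing(json.loads(EXAMPLE)):
    print(row)
```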


### `<REPO>.io.json`

A JSON file conveying the inputs and outputs, including defaults and types, of each workflow:

```json
{
  <WF_NAME_1>: {
    "path": <PATH_RELATIVE_TO_REPO>,
    "inputs": {
        <INPUT_1>:
        {
            "type": <WF_LANGUAGE_TYPE>,
            "default": <DEFAULT_VAL>
        },
        ..
    },
    "outputs": {
        <OUTPUT_1>: <WF_LANGUAGE_TYPE>,
        ..
    }
  },
  ..
}
```
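One possible use of this structure is to find inputs that have no default value. The sketch below assumes that a missing default appears either as an absent `default` key or as `null`; the actual representation in the generated file is not specified here, so treat that as an assumption. The workflow name, input names, and types in `example` are hypothetical, as is the helper `inputs_without_default`.

```python
def inputs_without_default(io):
    """Map each workflow name to the sorted input names that have no default."""
    missing = {}
    for wf_name, entry in io.items():
        names = [
            name
            for name, spec in entry.get("inputs", {}).items()
            # Assumption: no default means the key is absent or set to null
            if spec.get("default") is None
        ]
        if names:
            missing[wf_name] = sorted(names)
    return missing


# Hypothetical data following the layout of the template above
example = {
    "Workflow_A": {
        "path": "workflows/workflow_a.wdl",
        "inputs": {
            "read1": {"type": "File", "default": None},
            "min_depth": {"type": "Int", "default": 10},
        },
        "outputs": {"report": "File"},
    }
}

print(inputs_without_default(example))
```

For a real run, the dictionary would come from `json.load` on the generated `<REPO>.io.json` file rather than from an inline literal.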
