Metadata-Version: 2.4
Name: model_config_tests
Version: 0.2.2
Summary: Test for ACCESS model (payu) configurations
Author: ACCESS-NRI
License: Apache-2.0
Project-URL: Homepage, https://github.com/ACCESS-NRI/model-config-tests/
Project-URL: Issues, https://github.com/ACCESS-NRI/model-config-tests/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: f90nml>=0.16
Requires-Dist: requests
Requires-Dist: PyYAML
Requires-Dist: pytest>=8.0.1
Requires-Dist: ruamel.yaml>=0.18.5
Requires-Dist: jsonschema>=4.21.1
Requires-Dist: payu>=1.1.3
Requires-Dist: pytest-sugar
Requires-Dist: netCDF4
Provides-Extra: test
Requires-Dist: pytest-cov; extra == "test"
Dynamic: license-file

# Model Configuration Pytests and CI

This repository houses pytests that are used as part of CI checks for model configurations, as well as reusable components of model configuration CI.

The checksum pytests are used for reproducibility CI checks in this repository. The quick configuration tests are used in any repository that calls `config-pr-1-ci.yml` or is templated by [`ACCESS-NRI/model-configs-template`](https://github.com/ACCESS-NRI/model-configs-template). For example, [ACCESS-NRI/access-om2-configs](https://github.com/ACCESS-NRI/access-om2-configs).

Code from these pytests is adapted from COSIMA's ACCESS-OM2 [bit reproducibility tests](https://github.com/COSIMA/access-om2/blob/master/test/test_bit_reproducibility.py).

## model-config-tests Pytests

### How to run pytests manually on NCI

1. Load payu module - this provides the dependencies needed to run the model

    ```sh
    module use /g/data/vk83/modules
    module load payu/1.1.6
    ```

2. Create and activate a python virtual environment for installing and running tests

    ```sh
    python3 -m venv <path/to/test-venv> --system-site-packages
    source <path/to/test-venv>/bin/activate
    ```

3. Either pip install a released version of `model-config-tests`,

    ```sh
    pip install model-config-tests==0.1.1
    ```

    Or to install `model-config-tests` in "editable" mode, first clone the repository, and then run pip install from the repository. This means any changes to the code are reflected in the installed package.

    ```sh
    git clone https://github.com/ACCESS-NRI/model-config-tests/ <path/to/model-config-tests>
    pip install -e <path/to/model-config-tests>
    ```

4. Check out an experiment (in this case, an ACCESS-OM2 config)

    ```sh
    git clone https://github.com/ACCESS-NRI/access-om2-configs/ <path/to/experiment>
    cd <path/to/experiment>
    git checkout <branch/tag>
    ```

5. Run the pytests

    ```sh
    model-config-tests --help
    ```

6. Once done with testing, deactivate the virtual environment, and if the environment is no longer needed, remove the environment

    ```sh
    deactivate
    rm -rf <path/to/test-venv> # Deletes the test environment
    ```

### Pytest Options

The output directory for pytests defaults to `$TMPDIR/test-model-repro` and contains the following subdirectories:

- `control` - contains copies of the model configuration used for each experiment run in the tests.
- `lab` - contains `payu` model output directories containing `work` and `archive` sub-directories.

This output directory also contains files generated by pytests, including the `checksum` directory which is used as part of reproducibility CI workflows.

To specify a different folder for pytest outputs, use the `--output-path` command flag, for example:

```sh
model-config-tests --output-path /some/other/path/for/output
```

By default, the control directory, i.e. the model configuration to test, is the current working directory. This can be changed in the same way as above, using the
`--control-path` command flag.

The path containing the checksum file to check against can also be set using the
`--checksum-path` command flag. The default is the `testing/checksum/historical-<default-model-runtime>hr-checksums.json` file, which is stored in the control directory.

### Selecting tests using markers

Running all tests in the pytest suite on a configuration will likely fail, as some tests are specific to particular model configurations. Pytest markers are used to selectively run different types of tests. Current markers include:

<a name="pytest_markers"></a>

- `repro`: All available reproducibility tests (all `repro_` test markers but `repro_determinism_restart`).
- `repro_historical`: Historical reproducibility test that confirms results from a model run match a stored previous result.
- `repro_determinism`: Determinism test that confirms repeated model runs give the same result.
- `repro_determinism_restart`: Determinism test that confirms repeated experiments with two consecutive runs give the same result.
- `repro_restart`: Restart reproducibility test that confirms two short consecutive model runs give the same result as a longer single model run.
- `slow`: Tests that are slow to run.
- `dev_config`: General configuration QA tests.
- `config`: Configuration QA tests for released branches. This includes the `dev_config` tests.

There are also model-specific markers for configuration QA tests, e.g., `access_om2`, `access_esm1p5`, `access_om3` and `access_esm1p6`. For a list of all available markers,
run:

```sh
model-config-tests --markers
```

The `-m` command-line option is used to run tests with specific markers. For example, to run only the `repro` marked tests:

```sh
model-config-tests -m repro
```

To select a combination of tests, use logical operators such as `or`, `and` and `not`.
For example, to run both of the general release configuration QA tests and the ACCESS-OM2 specific QA tests:

```sh
model-config-tests -m "config or access_om2"
```
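
As a rough illustration of how such expressions combine markers, the sketch below evaluates an expression against the set of markers attached to a single test. It is not pytest's implementation: the function name `marker_expression_matches` is hypothetical, and the sketch assumes the expression contains only marker names, `and`, `or`, `not` and parentheses.

```python
import re


def marker_expression_matches(expression: str, test_markers: set) -> bool:
    """Return True if a test carrying `test_markers` is selected by `expression`.

    Hypothetical helper for illustration only; pytest has its own parser.
    """

    def substitute(match):
        word = match.group(0)
        if word in {"and", "or", "not"}:
            return word  # keep the logical operators as they are
        # Replace each marker name with True/False depending on membership.
        return str(word in test_markers)

    python_expr = re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, expression)
    # After substitution the string holds only True/False, operators and
    # parentheses, so it can be evaluated as a boolean expression.
    return bool(eval(python_expr, {"__builtins__": {}}, {}))


# A test marked only `access_om2` is selected by "config or access_om2".
selected = marker_expression_matches("config or access_om2", {"access_om2"})

# A test marked both `repro` and `slow` is excluded by "repro and (not slow)".
excluded = not marker_expression_matches("repro and (not slow)", {"repro", "slow"})
```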

## Definitions for reproducibility
<a name="definitions_reproducibility"></a>

It is helpful to work from the same definitions of what 'reproducibility' means. We consider four kinds:
 1. _Determinism_ `repro_determinism`: an identical calculation run under the same conditions should produce the same result. Some sub-categories of specific interest to earth-system models:
      1. _Determinism rest_: Repeated runs give the same result from rest;
      1. _Determinism restart_: Repeated runs give the same result from a restart file;
 1. _Restart reproducibility_ `repro_restart`: Two short runs match a single longer run, in other words, reproducibility across a restart boundary;
 1. _Historical reproducibility_ `repro_historical`: Match a stored previous result;
 1. _Build reproducibility_: Match a stored previous result when using the same build description files on the same machine.

The marker names shown in code formatting above indicate tests that are currently available using the CI/CD system; see [pytest markers](#pytest_markers).
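
The first three definitions can be sketched with a toy example. Everything here is an assumption made for illustration: `run_model` is a hypothetical stand-in for a model run (a simple deterministic integer update), and `checksum` hashes the final state. The real tests compare checksums produced by actual model runs.

```python
import hashlib


def run_model(state: int, steps: int) -> int:
    """Toy stand-in for a model run: advance an integer state deterministically."""
    for _ in range(steps):
        state = (state * 1103515245 + 12345) % 2**31
    return state


def checksum(state: int) -> str:
    """Hash the final state so that results can be compared or stored."""
    return hashlib.sha256(str(state).encode()).hexdigest()


initial_state = 42

# Determinism: two identical runs from the same start give the same result.
determinism_ok = checksum(run_model(initial_state, 4)) == checksum(
    run_model(initial_state, 4)
)

# Restart reproducibility: two short runs (2 + 2 steps) match one long run (4 steps).
restart_state = run_model(initial_state, 2)
restart_ok = checksum(run_model(restart_state, 2)) == checksum(
    run_model(initial_state, 4)
)

# Historical reproducibility: the current result matches a previously stored one.
# Here the "stored" value is computed inline; in practice it would be read from a file.
stored_checksum = checksum(run_model(initial_state, 4))
historical_ok = checksum(run_model(initial_state, 4)) == stored_checksum
```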

## Check for pairwise reproducibility between multiple experiments

To compare output/checksums between pairs of experiment directories, use the `compare-exp-tests` command. For example:

```sh
compare-exp-tests --dirs "path/to/exp1 path/to/exp2 path/to/exp3"
```

The `--dirs` option specifies a space separated list of experiment directories to compare. These can be relative or absolute paths and should point to payu control directories. The above example generates three pairwise tests: "exp1 vs exp2", "exp1 vs exp3", and "exp2 vs exp3".
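
The pairing described above can be sketched with `itertools.combinations`. This only illustrates how three directories yield three pairs; it is not necessarily how `compare-exp-tests` builds its tests internally.

```python
from itertools import combinations

# The same space-separated value passed to --dirs in the example above.
dirs_option = "path/to/exp1 path/to/exp2 path/to/exp3"
experiment_dirs = dirs_option.split()

# Every unordered pair of experiment directories is compared once.
pairs = list(combinations(experiment_dirs, 2))

for first, second in pairs:
    print(f"{first} vs {second}")
```

In general, `n` directories produce `n * (n - 1) / 2` pairwise tests.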

To enable more detailed output during the test runs, add the `-vvv` flag to the command.

Currently these tests only compare the first model run output directory.

## CI/CD

The `.github` directory contains many different workflows and actions. This section describes how they are used.

### CI/CD For This Repository

`CI.yml` and `CD.yml` are used to test, package and upload the `model-config-tests` package that is used by `model-configs`-style repositories across ACCESS-NRI. These are the only workflows that run on this repository. The others are reusable workflows called by `model-configs`-style repositories, among others.

### Reusable CI

The `config-*.yml`, `generate-checksums.yml` and `test-repro.yml` workflows are called by `model-configs`-style repositories to test model configurations. They are stored in this repository to allow a central place to update generic CI used by all model configuration repositories.

#### Repos That Use Reusable CI

Currently, these repositories make use of the reusable CI:

- [access-om2-configs](https://github.com/ACCESS-NRI/access-om2-configs)
- [access-esm1.5-configs](https://github.com/ACCESS-NRI/access-esm1.5-configs)
- [access-esm1.6-configs](https://github.com/ACCESS-NRI/access-esm1.6-configs)
- [access-om3-configs](https://github.com/ACCESS-NRI/access-om3-configs)

Below is information on the use of these workflows.

#### `config-pr-*.yml` Pipeline

The `config-pr-*` Pipeline is a series of workflows that govern the testing, ChatOps and merging procedures of pull requests for model configuration repositories, such as [`ACCESS-NRI/access-om2-configs`](https://github.com/ACCESS-NRI/access-om2-configs).

Essentially, these files work on two types of configuration branch pull requests in the model configuration repository. More information on the terminology used in model configuration repositories can be found in the `README.md` of the `ACCESS-NRI/model-configs-template` repository. The types of pull requests are:

- Pull requests into `dev-*`: Allows quick checks of configuration metadata and common mistakes in configurations during PRs into the `dev-*` configuration branches.
- Pull requests from `dev-*` into `release-*`: Allows both quick checks, as well as a longer, more comprehensive check on the reproducibility of the changes being brought into the `release-*` configuration branch, compared to the previous `release-*` commit. It also acts on 'comment commands' run during the pull request, like `!bump` for updating the version of the configuration ([see the 'Config Tags' section](https://github.com/ACCESS-NRI/model-configs-template/blob/main/README.md) in the `ACCESS-NRI/model-configs-template` repository for more). It is also responsible for the creation of the final config tag and release, once merged.

#### `config-schedule-*.yml` Pipeline

The `config-schedule-*` Pipeline is a series of workflows used to check the reproducibility of certain config tags against themselves, every month. This is used as a kind of canary to make sure that we continue to get the same results on the same deployment targets.

#### `generate-checksums` Reusable Workflow

This workflow is used to easily generate the checksums used in the reproducibility checks, for a specific branch of a model configuration repository, if they don't already exist. This is most often used for the initial commit of a checksum to the `release-*` configuration branch.

#### `test-repro` Reusable Workflow

This workflow is used to test the reproducibility of a given model repository against historical checksums, and can be used as a standalone workflow.

Using it has some requirements outside of just filling in the inputs: One must have a valid GitHub Environment (specified by the `environment-name` input) in the calling repository, that has the following `secrets` and `vars` defined:

- `secrets.SSH_HOST` - hostname for the deployment target
- `secrets.SSH_HOST_DATA` - hostname for the data mover on the deployment target (if it exists)
- `secrets.SSH_KEY` - private key for access to the deployment target
- `secrets.SSH_USER` - username for access to the deployment target
- `vars.EXPERIMENTS_LOCATION` - directory on the deployment target that will contain all the experiments used during testing of reproducibility across multiple runs of this workflow (ex. `/scratch/some/directory/experiments`)
- `vars.MODULE_LOCATION` - directory on the deployment target that contains module files for payu used during reproducibility testing (ex. `/g/data/vk83/modules`)
- `vars.PRERELEASE_MODULE_LOCATION` - directory on the deployment target that contains module files for development version of payu (ex. `/g/data/vk83/prerelease/modules`)

#### `config-comment-test` Reusable Workflow

This comment command allows a repro check from any branch, rather than just the `dev-*` -> `release-*` PR pipeline.

It requires all the `secrets`/`vars` defined on the caller (as above), as well as `vars.CONFIG_CI_SCHEMA_VERSION`.

Usage is as follows:

```txt
!test TYPE [commit]
```

Where `TYPE` is a test suite that we support. Currently, this consists of `repro`.

The commands are all case sensitive and require lower case.

Using `commit` as an option will commit the result of the test to the PR, provided the commenter has at least `write` permission, and the checksums differ.
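
The command grammar described above can be sketched as a small parser. This is an illustration under stated assumptions, not the workflow's actual implementation: the function name `parse_test_command` and the returned dictionary layout are hypothetical, and permission or checksum checks are out of scope.

```python
from typing import Optional

# Test suites currently supported, per the description above.
SUPPORTED_TYPES = {"repro"}


def parse_test_command(comment: str) -> Optional[dict]:
    """Parse a `!test TYPE [commit]` comment, or return None if it is not valid."""
    words = comment.strip().split()
    if len(words) not in (2, 3) or words[0] != "!test":
        return None
    # Commands are case sensitive, so "REPRO" is not accepted.
    if words[1] not in SUPPORTED_TYPES:
        return None
    if len(words) == 3 and words[2] != "commit":
        return None
    return {"type": words[1], "commit": len(words) == 3}


plain = parse_test_command("!test repro")
with_commit = parse_test_command("!test repro commit")
wrong_case = parse_test_command("!TEST REPRO")
```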

##### `!test repro`

`!test repro`: will compare the `HEAD` of the current PR source branch against the common ancestor on the target branch. For example, in the below diagram we would be comparing generated checksums in `C` against checksums already stored at `A`:

```txt
D   (PR Target Branch)
|
| C (PR Source Branch)
| |
| B
|/
A   (Common Ancestor)
```
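
The common ancestor in the diagram can be found by walking parent links, as in the toy sketch below. The `parents` mapping is a hypothetical encoding of the diagram that assumes each commit has a single parent; on a real repository, `git merge-base` computes this.

```python
# Parent of each commit in the diagram above (A is the root).
parents = {"A": None, "B": "A", "C": "B", "D": "A"}


def ancestors(commit):
    """Return the commit followed by all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parents[commit]
    return chain


def common_ancestor(source, target):
    """Return the most recent commit reachable from both branches."""
    target_chain = set(ancestors(target))
    for commit in ancestors(source):
        if commit in target_chain:
            return commit
    return None


# Checksums generated at C are compared against those stored at this commit.
base = common_ancestor("C", "D")
```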

### CI Configuration File

Each model configuration repository requires a `config/ci.json`
file for defining CI test configuration. This file specifies what scheduled tests to run, and what tests to run for a given git branch (or tag) and test type.

#### Test types

The different types of test are defined as:

- `scheduled`: Scheduled reproducibility tests run on NCI. These are typically scheduled to run monthly, but this can be reconfigured by modifying the cron in the `.github/workflows/schedule.yml` workflow. The keys under these tests are released tags or branches to run the scheduled tests.
- `reproducibility`: Reproducibility tests run on NCI as part of pull requests (PRs). These are automatically run for PRs from development (`dev-`) branches to released (`release-`) branches. These tests can also be triggered manually in a PR using the `!test repro` command. The keys under these tests represent the target branches into which PRs are being merged.
- `qa`: Quick quality assurance tests run locally on a GitHub Runner as part of PRs. These are consistency checks which run without needing to run the model. These are run automatically for any PR being merged into development or released branches. The keys under these tests represent the target branches into which PRs are being merged.

#### Test Configuration Settings

The configuration properties needed to run the tests are:

| Name | Type | Description |  Example |
| ---- | ---- | ----------- | -------- |
| markers | `string` | Markers used for pytest, in the Python format | `repro` |
| model-config-tests-version | `string` | The version of the model-config-tests | `0.1.1` |
| python-version | `string` | The python version used to create test virtual environment on GitHub hosted tests | `3.11.0` |
| payu-version | `string` | The Payu version used to run the model | `1.1.6` |

Pytest markers select what pytests to run in `model-config-tests`. For more information on tests currently available, see [pytest markers](#pytest_markers).

The Payu version is the module version loaded on NCI to build the base of the test virtual environment and is only relevant for tests run on NCI. To use the development Payu module which has all the latest changes in Payu repository, set `"payu-version": "dev"`.

A git branch, tag, or commit of the `model-config-tests` repository can be used for `model-config-tests-version`, and the CI will install from GitHub if the package can't be installed from PyPI. For development model configuration branches it may be useful to use the most recent code in the `model-config-tests` repository; using the `main` branch will achieve this. This is good for continuous integration, but released versions of `model-config-tests` and `payu` are recommended for released model configurations so that tests are stable.

#### File Structure

The schema for this file is found in [`ACCESS-NRI/schema`](https://github.com/ACCESS-NRI/schema). As most of the tests use the same test and python versions, and similar markers, there are two levels of defaults. There is a default at the test type level, which is useful for defining test markers - this selects certain pytests to run in `model-config-tests`. There is also an outer global default, which is used if a property is defined neither for the given branch/tag nor for the test type default. The [`parse-ci-config`](.github/actions/parse-ci-config/README.md) action applies this fall-back default logic.

There is also support for using regex patterns for the git branches for `qa` and `reproducibility` tests, e.g.
`dev-.*` will match all development branches. If there are multiple regex patterns, priority will be given to the order so more specific patterns should
be higher in the list, e.g. `dev-1deg-.*` before `dev-.*`. Note that regex patterns are currently not supported for `scheduled` tests so every tag that requires a scheduled test needs to be defined.
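
The ordering rule can be sketched as a first-match search over the patterns. This is an assumption-laden illustration rather than the `parse-ci-config` implementation: the function name `first_matching_pattern` and the branch names are hypothetical, and the sketch assumes a pattern must match the whole branch name.

```python
import re
from typing import Optional


def first_matching_pattern(branch: str, patterns: list) -> Optional[str]:
    """Return the first pattern, in list order, that matches the branch name."""
    for pattern in patterns:
        # Assumption: the whole branch name must match the pattern.
        if re.fullmatch(pattern, branch):
            return pattern
    return None


# The more specific pattern is listed first, so it takes priority.
patterns = ["dev-1deg-.*", "dev-.*"]

specific = first_matching_pattern("dev-1deg-example", patterns)
general = first_matching_pattern("dev-025deg-example", patterns)
unmatched = first_matching_pattern("release-1deg-example", patterns)
```

If the order were reversed, `dev-.*` would capture every development branch and `dev-1deg-.*` would never be selected.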

For example, given the following `config/ci.json` file:

```json
{
    "$schema": "https://github.com/ACCESS-NRI/schema/tree/main/au.org.access-nri/model/configuration/ci/2-0-0.json",
    "scheduled": {
        "release-1deg_jra55_ryf-2.0": {},
        "default": {
            "markers": "repro"
        }
    },
    "reproducibility": {
        "dev-1deg_jra55do_ryf": {
            "markers": "repro"
        },
        "release-1deg_jra55do_ryf": {
            "markers": "repro"
        },
        "dev-example-branch": {
            "payu-version": "dev",
            "model-config-tests-version": "main"
        },
        "default": {
            "markers": "repro and (not slow)"
        }
    },
    "qa": {
        "dev-example-branch": {
            "model-config-tests-version": "main"
        },
        "dev-.*": {
            "markers": "dev_config"
        },
        "default": {
            "markers": "config"
        }
    },
    "default": {
        "model-config-tests-version": "0.0.11",
        "python-version": "3.11.0",
        "payu-version": "1.1.6"
    }
}
```

The above configuration triggers monthly scheduled tests for the `release-1deg_jra55_ryf-2.0` tag. This scheduled test will run with the `repro` test marker, and the top-level defaults for `model-config-tests-version`, `python-version` and `payu-version`.

If a PR was being merged into the `release-1deg_jra55do_ryf` branch, the QA tests run on the GitHub runner would use the `config` test markers, and the repro tests on NCI would use the `repro` test marker.

If a PR was being merged into a `dev-example-branch`, the QA tests that run automatically would use the `dev_config` tests and if repro tests were triggered manually, then it would use the `repro and (not slow)` tests.
For both test types, it would use the latest changes in the `model-config-tests` repository. For repro tests, it would use the `payu/dev` module on NCI for running the model.
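
The fall-back logic used in these examples can be sketched as a lookup through three layers: the branch or tag entry, then the test type default, then the global default. This is a simplified illustration, not the `parse-ci-config` action itself: the function name `resolve_settings` is hypothetical, only exact key matches are handled (no regex patterns), and the configuration is a trimmed copy of the example above.

```python
PROPERTIES = (
    "markers",
    "model-config-tests-version",
    "python-version",
    "payu-version",
)

# Trimmed copy of the example config/ci.json above.
config = {
    "scheduled": {
        "release-1deg_jra55_ryf-2.0": {},
        "default": {"markers": "repro"},
    },
    "reproducibility": {
        "dev-example-branch": {
            "payu-version": "dev",
            "model-config-tests-version": "main",
        },
        "default": {"markers": "repro and (not slow)"},
    },
    "default": {
        "model-config-tests-version": "0.0.11",
        "python-version": "3.11.0",
        "payu-version": "1.1.6",
    },
}


def resolve_settings(config: dict, test_type: str, ref: str) -> dict:
    """Resolve each property from the most specific layer that defines it."""
    type_section = config.get(test_type, {})
    layers = [
        type_section.get(ref, {}),        # branch- or tag-specific settings
        type_section.get("default", {}),  # test type default
        config.get("default", {}),        # global default
    ]
    resolved = {}
    for name in PROPERTIES:
        for layer in layers:
            if name in layer:
                resolved[name] = layer[name]
                break
    return resolved


scheduled = resolve_settings(config, "scheduled", "release-1deg_jra55_ryf-2.0")
repro = resolve_settings(config, "reproducibility", "dev-example-branch")
```

Here `scheduled` takes its markers from the `scheduled` default and its versions from the global default, while `repro` takes `payu-version` and `model-config-tests-version` from the branch entry and the rest from the defaults.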
