Metadata-Version: 2.1
Name: marisco
Version: 0.3.0
Summary: MARIS companion package and tutorials
Home-page: https://github.com/franckalbinet/marisco
Author: Franck Albinet
Author-email: franckalbinet@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas (==2.2)
Requires-Dist: openpyxl (==3.1.0)
Requires-Dist: fastcore
Requires-Dist: tqdm
Requires-Dist: netcdf4
Requires-Dist: tomli
Requires-Dist: tomli-w
Requires-Dist: shapely
Requires-Dist: pyzotero
Requires-Dist: jellyfish
Requires-Dist: xarray
Provides-Extra: dev
Requires-Dist: unitest ; extra == 'dev'

# MARISCO


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

The [IAEA Marine Radioactivity Information System
(MARIS)](https://maris.iaea.org) provides open access to radioactivity
measurements in marine environments. Developed by the [IAEA
Environmental
Laboratories](https://www.iaea.org/about/organizational-structure/department-of-nuclear-sciences-and-applications/division-of-iaea-environment-laboratories)
in Monaco, MARIS offers data on seawater, biota, sediment, and suspended
matter.

This Python package includes command-line tools to convert MARIS
datasets into [`NetCDF`](https://www.unidata.ucar.edu/software/netcdf/)
or `.csv` formats, enhancing compatibility with various scientific and
data analysis software.

## Core Concept: Handlers

`marisco` is built around the concept of `handlers` - specialized
modules designed to convert MARIS datasets into NetCDF format. Each
handler is tailored to a specific data provider and implemented as a
dedicated Jupyter notebook.

### Literate Programming Approach

We’ve adopted a Literate Programming approach, which means:

1.  **Documentation**: Each handler serves as comprehensive
    documentation.
2.  **Code Reference**: The notebooks contain the actual implementation
    code.
3.  **Communication Tool**: They facilitate discussions with data
    providers about discrepancies or inconsistencies.

### Powered by nbdev

To achieve this, we leverage [nbdev](https://nbdev.fast.ai), a powerful
tool that allows us to:

1.  Write code within Jupyter notebooks
2.  Automatically export relevant parts as dedicated Python modules

This approach bridges the gap between documentation and implementation,
ensuring they remain in sync.
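
As a rough illustration, a notebook cell exported by nbdev might look like the sketch below. The `#| default_exp` and `#| export` directives are nbdev's; the module name `handlers.example` and the function `normalize_nuclide` are hypothetical and are not part of `marisco`.

``` python
#| default_exp handlers.example
# nbdev directive, normally placed in its own cell near the top of the
# notebook: exported cells are written to `handlers/example.py`
# (hypothetical module name).

#| export
def normalize_nuclide(name: str) -> str:
    """Strip whitespace and lowercase a nuclide name (hypothetical helper)."""
    return name.strip().lower()
```

Cells without the `#| export` directive stay in the notebook as documentation, examples, or tests and are not written to the module.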

### See It in Action

For a concrete example of this approach, check out our [HELCOM dataset
handler
implementation](https://fr.anckalbi.net/marisco/handlers/helcom.html).

Please note that this project is **still under development**.

We have implemented the [MARIS Legacy
handler](https://fr.anckalbi.net/marisco/handlers/maris_legacy.html) to
convert all existing datasets from the MARIS master database into NetCDF
format. For datasets that are frequently updated, such as
[HELCOM](https://fr.anckalbi.net/marisco/handlers/helcom.html),
[OSPAR](https://www.ospar.org/), and TEPCO/Fukushima-related datasets,
individual handlers are currently being developed and will be available
soon.

## Install

To install `marisco`, run:

``` console
pip install marisco
```

Once successfully installed, run the following command:

``` console
maris_init
```

This command:

1.  creates a `.marisco/` directory in your home directory to hold the
    configuration files listed below
2.  creates a `configs.toml` file containing default but configurable
    settings (default paths, …)
3.  creates a configurable `cdl.toml` file used to generate a MARIS
    [NetCDF4 CDL (Common Data
    Language)](https://www.unidata.ucar.edu/software/netcdf/workshops/most-recent/nc3model/Cdl.html)
    template
4.  downloads several MARIS DB nomenclature/lookup tables into the
    `.marisco/lut/` directory
5.  generates `maris-template.nc`, the MARIS NetCDF4 template built
    from `cdl.toml` and used to encode MARIS datasets
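
To confirm that initialization succeeded, a small check such as the following sketch may help. The entry names come from the list above; the assumption that all of them sit directly inside `~/.marisco/` and the helper `missing_marisco_files` are illustrative, not part of `marisco`.

``` python
from pathlib import Path
from typing import List

# Entries described in the list above. Their exact location inside
# `.marisco/` is an assumption made for this sketch.
EXPECTED = ["configs.toml", "cdl.toml", "maris-template.nc", "lut"]


def missing_marisco_files(base: Path) -> List[str]:
    """Return the expected entries that do not exist under `base`."""
    return [name for name in EXPECTED if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_marisco_files(Path.home() / ".marisco")
    print("All expected files found" if not missing else f"Missing: {missing}")
```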

> [!TIP]
>
> If you are new to Python, please refer to [How to set up
> `Marisco` with
> Anaconda](https://github.com/franckalbinet/marisco/tree/main/install_configure_guide/windows_anaconda)
> or [How to set up `Marisco` with Windows Subsystem for Linux (WSL) and
> Visual Studio Code
> editor](https://github.com/franckalbinet/marisco/tree/main/install_configure_guide//windows_ubuntu_sub_system).

### Zotero API key

During conversion, `marisco` automatically retrieves the bibliographic
metadata of each MARIS dataset from [Zotero](https://www.zotero.org/).
To enable this, define the environment variable `ZOTERO_API_KEY` and set
it to the MARIS Zotero API key. Please contact the MARIS team to get
your API key.
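
Before launching a conversion, it can be useful to verify that the variable is visible to Python. The sketch below uses only the standard library; the helper name `get_zotero_api_key` is hypothetical and is not a `marisco` function.

``` python
import os


def get_zotero_api_key() -> str:
    """Return the value of ZOTERO_API_KEY or fail with a clear message."""
    key = os.environ.get("ZOTERO_API_KEY", "")
    if not key:
        raise RuntimeError(
            "ZOTERO_API_KEY is not set; define it in your shell before "
            "running a conversion."
        )
    return key
```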

## Getting started

### Command line utilities

Every command accepts a `-h` flag that displays its documentation.

#### `maris_init`

Create the configuration files and the MARIS NetCDF CDL (Common Data
Language) file, and download the required lookup tables (nomenclatures).

#### `maris_create_nc_template`

Generate the MARIS NetCDF template used when encoding datasets.

#### `maris_netcdfy`

Encode a MARIS dataset as NetCDF.

Positional arguments:

- `handler_name`: Handler’s name (e.g. helcom, …)
- `str`: Path to dataset to encode
- `dest`: Path to converted NetCDF4

Example:

``` console
maris_netcdfy helcom _data/accdb/mors/csv _data/output/helcom.nc
```
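
To run the same conversion from a Python script, for instance when batching several datasets, one option is to call the command-line tool through `subprocess`. This is a minimal sketch: the helpers `netcdfy_command` and `convert` are hypothetical wrappers, and only the three positional arguments documented above are assumed.

``` python
import subprocess
from typing import List


def netcdfy_command(handler_name: str, src: str, dest: str) -> List[str]:
    """Build the argument list for the `maris_netcdfy` command."""
    return ["maris_netcdfy", handler_name, src, dest]


def convert(handler_name: str, src: str, dest: str) -> None:
    """Run the conversion; raises CalledProcessError if the command fails."""
    subprocess.run(netcdfy_command(handler_name, src, dest), check=True)


if __name__ == "__main__":
    # Same arguments as the console example above.
    print(netcdfy_command("helcom", "_data/accdb/mors/csv", "_data/output/helcom.nc"))
```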

## Development

### FAQ

#### How is `cdl.toml` created and what is it used for?

A Python dictionary named `CONFIGS_CDL`, which specifies the MARIS NetCDF
attributes, variables, dimensions, …, is first defined in
`nbs/api/configs.ipynb`. Running the command `maris_init` generates a
[`toml`](https://www.wikiwand.com/fr/TOML) version of it named
`.marisco/cdl.toml`, which is then used to create the MARIS NetCDF
template `.marisco/maris-template.nc`.

Once `marisco` is installed, the MARIS NetCDF template can be customized
further by editing the `.marisco/cdl.toml` file and then running the
command `maris_create_nc_template`.
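
Before editing `cdl.toml`, it may help to inspect what it currently contains. The sketch below reads the file with the standard-library `tomllib` (Python 3.11+) and falls back to `tomli`, which exposes the same `load` function. The helper `load_cdl` is hypothetical, and the path `~/.marisco/cdl.toml` is an assumption based on the description above.

``` python
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # same API, for older Python versions


def load_cdl(path: Path) -> dict:
    """Parse a TOML file and return its content as a dictionary."""
    with open(path, "rb") as f:  # TOML parsers require binary mode
        return tomllib.load(f)


if __name__ == "__main__":
    cdl_path = Path.home() / ".marisco" / "cdl.toml"  # assumed location
    if cdl_path.exists():
        # Print the top-level sections available for customization.
        print(sorted(load_cdl(cdl_path)))
```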
