Metadata-Version: 2.1
Name: dcd-mapping
Version: 0.1.0
Summary: Map MaveDB scoresets to VRS objects
Author-email: Alex Handler Wagner <Alex.Wagner@nationwidechildrens.org>, Jeremy Arbesfeld <Jeremy.Arbesfeld@nationwidechildrens.org>, Samriddhi Singh <todo@todo.org>, James Stevenson <James.Stevenson@nationwidechildrens.org>
License: MIT License
        
        Copyright (c) 2022-2024 Atlas of Variant Effects
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://www.varianteffect.org/
Project-URL: Documentation, https://github.com/ave-dcd/dcd_mapping/README.md
Project-URL: Source, https://github.com/ave-dcd/dcd_mapping
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: Pydantic
Classifier: Framework :: Pydantic :: 2
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: biopython
Requires-Dist: tqdm
Requires-Dist: click
Requires-Dist: cool-seq-tool>=0.4.0.dev1
Requires-Dist: ga4gh.vrs~=2.0.0-a6
Requires-Dist: gene-normalizer>=0.3.0-dev1
Requires-Dist: pydantic>=2
Requires-Dist: python-dotenv
Requires-Dist: setuptools>=68.0
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: pytest-mock; extra == "tests"
Requires-Dist: pytest-cov; extra == "tests"
Requires-Dist: pytest-asyncio; extra == "tests"
Requires-Dist: requests-mock; extra == "tests"
Provides-Extra: dev
Requires-Dist: ruff==0.2.0; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"

# dcd-map: Map MaveDB data to computable and interoperable variant objects

[![image](https://img.shields.io/pypi/v/dcd_mapping.svg)](https://pypi.python.org/pypi/dcd_mapping)
[![image](https://img.shields.io/pypi/l/dcd_mapping.svg)](https://pypi.python.org/pypi/dcd_mapping)
[![image](https://img.shields.io/pypi/pyversions/dcd_mapping.svg)](https://pypi.python.org/pypi/dcd_mapping)
[![Actions status](https://github.com/ave-dcd/dcd_mapping/actions/workflows/checks.yaml/badge.svg)](https://github.com/ave-dcd/dcd_mapping/actions/workflows/checks.yaml)

<!-- description -->

This library implements a novel method for mapping [MaveDB scoreset data](https://mavedb.org/) to [GA4GH Variation Representation Specification (VRS)](https://vrs.ga4gh.org/en/stable/) objects, enhancing interoperability for genomic medicine applications. See [Arbesfeld et al. (2023)](https://www.biorxiv.org/content/10.1101/2023.06.20.545702v1) for a preprint edition of the mapping manuscript, or [download the resulting mappings directly](https://mavedb-mapping.s3.us-east-2.amazonaws.com/mappings.tar.gz).

<!-- /description -->

## Installation

Install from [PyPI](https://pypi.python.org/pypi/dcd_mapping):

```
python3 -m pip install dcd-mapping
```

Also ensure the following data dependencies are available:

* Universal Transcript Archive (UTA): see [README](https://github.com/biocommons/uta?tab=readme-ov-file#installing-uta-locally) for setup instructions. Users with access to Docker on their local devices can use the available Docker image; otherwise, start a relatively recent (version 14+) PostgreSQL instance and add data from the available database dump.
* SeqRepo: see [README](https://github.com/biocommons/biocommons.seqrepo?tab=readme-ov-file#requirements) for setup instructions. The SeqRepo data directory must be writeable; see the [SeqRepo store documentation](https://github.com/biocommons/biocommons.seqrepo/blob/main/docs/store.rst) for details.
* Gene Normalizer: see [documentation](https://gene-normalizer.readthedocs.io/0.3.0-dev1/install.html) for data setup instructions.
* blat: must be available on the local PATH and executable by the user. Alternatively, set its location manually with the `BLAT_BIN_PATH` environment variable. See the [UCSC Genome Browser FAQ](https://genome.ucsc.edu/FAQ/FAQblat.html#blat3) for download instructions. For our experiments, we placed the binary in the same directory as the analysis notebooks.
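
As a quick sanity check before running a mapping, the blat requirement above can be verified with a short script. This is a minimal sketch, not part of `dcd-mapping`: the helper name `find_blat` is hypothetical, and giving `BLAT_BIN_PATH` precedence over `PATH` is an assumption about how the two options interact.

```python
import os
import shutil
from pathlib import Path


def find_blat(env=None):
    """Return the path to an executable blat binary, or None if none is found.

    Hypothetical helper for illustration; dcd-mapping performs its own lookup.
    """
    env = os.environ if env is None else env

    # Assumption: an explicit BLAT_BIN_PATH takes precedence over PATH.
    override = env.get("BLAT_BIN_PATH")
    if override:
        candidate = Path(override).expanduser()
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate
        return None

    # Otherwise, search the directories listed in PATH for an executable "blat".
    found = shutil.which("blat", path=env.get("PATH"))
    return Path(found) if found else None


if __name__ == "__main__":
    blat = find_blat()
    if blat:
        print(f"blat found at {blat}")
    else:
        print("blat not found; install it or set BLAT_BIN_PATH")
```

If the script reports that blat is missing, either add the binary's directory to `PATH` or point `BLAT_BIN_PATH` at the binary itself.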

## Usage

Use the `dcd-map` command with a scoreset URN, e.g.:

```shell
$ dcd-map urn:mavedb:00000083-c-1
```

Output is saved in the format `<URN>_mapping_results_<ISO datetime>.json` in the directory specified by the environment variable `MAVEDB_STORAGE_DIR`, or `~/.local/share/dcd-mapping` by default.
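
To locate results afterwards, the output location described above can be resolved in a few lines of Python. This is a rough sketch under stated assumptions: `storage_dir` and `results_for` are hypothetical helpers, not part of the `dcd-mapping` API, and the sketch assumes the URN appears unmodified at the start of the file name.

```python
import glob
import os
from pathlib import Path


def storage_dir(env=None):
    """Return the directory where mapping results are expected to be saved."""
    env = os.environ if env is None else env
    configured = env.get("MAVEDB_STORAGE_DIR")
    if configured:
        return Path(configured).expanduser()
    # Default location stated in the README.
    return Path.home() / ".local" / "share" / "dcd-mapping"


def results_for(urn, directory):
    """List result files for a scoreset URN, sorted by file name.

    Assumes the pattern <URN>_mapping_results_<ISO datetime>.json.
    """
    pattern = f"{glob.escape(urn)}_mapping_results_*.json"
    return sorted(directory.glob(pattern))


if __name__ == "__main__":
    for path in results_for("urn:mavedb:00000083-c-1", storage_dir()):
        print(path)
```

ISO datetimes written in a single consistent format sort lexicographically in chronological order, so under that assumption the last entry returned is the most recent run.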

## Notebooks

Notebooks for manuscript data analysis and figure generation are provided within `notebooks/analysis`. See [`notebooks/analysis/README.md`](notebooks/analysis/README.md) for more information.

## Development

Clone the repo:

```
git clone https://github.com/ave-dcd/dcd_mapping
cd dcd_mapping
```

Create and activate a virtual environment:

```
python3 -m virtualenv venv
source venv/bin/activate
```

Install in editable mode with developer and test dependencies:

```
python3 -m pip install -e '.[dev,tests]'
```

Add pre-commit hooks:

```
pre-commit install
```

Run tests with `pytest`:

```
pytest
```
