Metadata-Version: 2.4
Name: quantnado
Version: 0.3.3
Summary: Dataset generation and peak calling for multi-modal Next-Generation Sequencing data
Author: Alastair Smith
Author-email: Catherine Chahrour <catherine.chahrour@imm.ox.ac.uk>
License-Expression: MIT
Requires-Python: <3.14,>=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: bamnado
Requires-Dist: crested
Requires-Dist: dask-ml<2027,>=2025
Requires-Dist: dask>=2026
Requires-Dist: icechunk>=1.1
Requires-Dist: loguru>=0.7
Requires-Dist: numpy<3.0,>=2.0
Requires-Dist: pandas<4.0,>=3.0
Requires-Dist: pyBigWig>=0.3
Requires-Dist: pyranges==0.1.4
Requires-Dist: pysam>=0.23
Requires-Dist: seaborn>=0.13
Requires-Dist: sparse>=0.18
Requires-Dist: tqdm>=4.67
Requires-Dist: typer>=0.24
Requires-Dist: xarray>=2026
Requires-Dist: zarr<4.0,>=3.0
Provides-Extra: dev
Requires-Dist: mkdocs-material; extra == "dev"
Requires-Dist: mkdocs<2.0,>=1.6; extra == "dev"
Requires-Dist: mkdocs-jupyter; extra == "dev"
Requires-Dist: mkdocstrings-python; extra == "dev"
Requires-Dist: mkdocstrings; extra == "dev"
Requires-Dist: pymdown-extensions; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Provides-Extra: example
Requires-Dist: ipykernel; extra == "example"
Requires-Dist: jupyterlab; extra == "example"
Requires-Dist: matplotlib; extra == "example"
Dynamic: license-file

# QuantNado

**QuantNado provides efficient Zarr-backed storage and analysis of genomic signal from BAM and bigWig files, with support for signal reduction, feature counting, dimensionality reduction, and quantile-based peak calling.**

[![CI](https://github.com/Milne-Group/QuantNado/actions/workflows/python-tests.yml/badge.svg)](https://github.com/Milne-Group/QuantNado/actions/workflows/python-tests.yml)
[![PyPI](https://img.shields.io/pypi/v/quantnado)](https://pypi.org/project/quantnado)
[![Docs](https://img.shields.io/badge/docs-milne--group.github.io-blue)](https://milne-group.github.io/QuantNado/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.12%20%7C%203.13-blue)](https://pypi.org/project/quantnado)

---

## Installation

```bash
pip install quantnado
```

Requires Python 3.12 or 3.13.

---

## Quick Start

### Create a dataset from BAM files

```python
from quantnado import QuantNado

qn = QuantNado.from_bam_files(
    bam_files=["sample1.bam", "sample2.bam", "sample3.bam"],
    store_path="dataset.zarr",
    metadata="samples.csv",  # optional
)
```

### Load and analyse an existing dataset

```python
from quantnado import QuantNado

qn = QuantNado.open("dataset.zarr")

# Aggregate signal over genomic ranges
promoter_signal = qn.reduce("promoters.bed", reduction="mean")
print(promoter_signal["mean"].shape)  # (n_promoters, n_samples)

# PCA on reduced signal
pca_obj, transformed = qn.pca(promoter_signal["mean"], n_components=10)
print(transformed.shape)  # (n_samples, 10)

# Generate a count matrix for DESeq2
counts, features = qn.feature_counts("genes.gtf", feature_type="gene")
counts.to_csv("counts.csv")

# Extract signal over a specific region
region = qn.extract_region("chr1:1000-5000")
print(region.shape)  # (n_samples, 4000)
```
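
To make the `(n_promoters, n_samples)` shape above concrete, here is a minimal, library-free sketch of mean reduction over intervals. It is not QuantNado's implementation: `mean_over_ranges` and the toy signal are hypothetical, and intervals are assumed to be 0-based and half-open, as in BED.

```python
from statistics import fmean


def mean_over_ranges(signal, ranges):
    """Average per-base signal over each (start, end) interval.

    signal: dict mapping sample name -> list of per-base values (one chromosome)
    ranges: list of (start, end) tuples, assumed 0-based half-open (BED style)
    Returns one row per range, with one column per sample.
    """
    samples = list(signal)
    return [
        [fmean(signal[s][start:end]) for s in samples]
        for start, end in ranges
    ]


# Toy data (hypothetical): two samples, eight bases, two ranges
signal = {
    "sample1": [0, 0, 4, 4, 2, 2, 0, 0],
    "sample2": [1, 1, 1, 1, 3, 3, 3, 3],
}
ranges = [(2, 4), (4, 8)]
matrix = mean_over_ranges(signal, ranges)
print(matrix)  # [[4.0, 1.0], [1.0, 3.0]]
```

The result has one row per range and one column per sample, which is the orientation shown in the `reduce` comment above.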

---

## Command-line Interface

QuantNado installs a `quantnado` command with two subcommands: `create-dataset` and `call-peaks`.

### `create-dataset` — build a Zarr dataset from BAM files

```bash
quantnado create-dataset sample1.bam sample2.bam sample3.bam \
  --output dataset.zarr \
  --chromsizes hg38.chrom.sizes \
  --metadata samples.csv \
  --max-workers 8
```
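
The file passed to `--chromsizes` is assumed here to follow the common two-column, tab-separated layout (chromosome name, then length) used by UCSC `chrom.sizes` files; whether QuantNado accepts other layouts is not stated. The helper `read_chromsizes` below is hypothetical and only illustrates how such a file maps onto a name-to-length dict like the one exposed by `.chromsizes`.

```python
import io


def read_chromsizes(handle):
    """Parse a two-column, tab-separated chrom.sizes file into {name: length}."""
    sizes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, length = line.split("\t")[:2]
        sizes[name] = int(length)
    return sizes


# In-memory stand-in for a file such as hg38.chrom.sizes
example = io.StringIO("chr1\t248956422\nchr2\t242193529\nchrM\t16569\n")
chromsizes = read_chromsizes(example)
print(chromsizes["chrM"])  # 16569
```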

### `call-peaks` — call quantile-based peaks from bigWig files

```bash
quantnado call-peaks \
  --bigwig-dir path/to/bigwigs/ \
  --output-dir peaks/ \
  --chromsizes hg38.chrom.sizes \
  --quantile 0.98
```

Run `quantnado --help` or `quantnado <subcommand> --help` for full option listings.
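
As a rough illustration of what quantile-based peak calling means, the sketch below thresholds binned signal at a chosen quantile and merges adjacent bins above the threshold into intervals. This is an assumption about the general technique, not a description of QuantNado's algorithm; `call_peaks`, `bin_size`, and the toy values are hypothetical.

```python
def call_peaks(binned, quantile=0.98, bin_size=100):
    """Return (start, end) intervals where binned signal exceeds a quantile threshold.

    binned: list of per-bin signal values for one chromosome
    """
    if not binned:
        return []

    # Nearest-rank style threshold: the value at the given quantile of all bins
    ordered = sorted(binned)
    idx = min(int(quantile * len(ordered)), len(ordered) - 1)
    threshold = ordered[idx]

    peaks, start = [], None
    for i, value in enumerate(binned):
        if value > threshold and start is None:
            start = i  # open a new peak
        elif value <= threshold and start is not None:
            peaks.append((start * bin_size, i * bin_size))  # close the peak
            start = None
    if start is not None:  # a peak that runs to the end of the chromosome
        peaks.append((start * bin_size, len(binned) * bin_size))
    return peaks


binned = [0, 1, 0, 9, 8, 0, 1, 0, 0, 7]
print(call_peaks(binned, quantile=0.5))  # [(300, 500), (900, 1000)]
```

A higher quantile, such as the `0.98` used in the command above, keeps only the strongest bins, so fewer and narrower intervals are reported.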

---

## API Reference

Full documentation is available at [milne-group.github.io/QuantNado](https://milne-group.github.io/QuantNado/).

### `QuantNado`

| Method / Property | Description |
|---|---|
| `QuantNado.from_bam_files(bam_files, store_path, ...)` | Create a new dataset from BAM files |
| `QuantNado.open(store_path, read_only=True)` | Open an existing dataset |
| `.reduce(ranges, reduction="mean")` | Aggregate signal over genomic ranges (BED) |
| `.feature_counts(gtf_file, feature_type="gene")` | Generate a DESeq2-compatible count matrix |
| `.pca(data, n_components=10)` | Run PCA on a signal matrix |
| `.extract_region(region)` | Extract raw signal for a genomic region |
| `.to_xarray(chromosomes)` | Load dataset as lazy xarray DataArrays |
| `.samples` | List of sample names |
| `.metadata` | Sample metadata (DataFrame) |
| `.chromosomes` | Available chromosome names |
| `.chromsizes` | Chromosome sizes (dict) |
| `.store_path` | Path to the underlying Zarr store |

---

## Requirements

| Dependency | Purpose |
|---|---|
| `zarr`, `icechunk` | Zarr v3 storage backend |
| `xarray`, `dask` | Lazy array operations |
| `pandas`, `numpy` | Data structures |
| `pysam`, `bamnado` | BAM file I/O |
| `pyBigWig` | bigWig I/O |
| `pyranges` | Genomic range operations |
| `scikit-learn` (via `dask-ml`) | PCA |
| `typer`, `loguru` | CLI and logging |
| `ipykernel`, `jupyterlab`, `matplotlib` | Example notebook (`pip install "quantnado[example]"`) |

---

## License

MIT — see [LICENSE](LICENSE).
