Metadata-Version: 2.4
Name: dora-singlecell
Version: 0.1.0
Summary: DORA: latent trajectory model for single-cell drug response (PyTorch).
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.20
Requires-Dist: scipy>=1.7
Requires-Dist: scikit-learn>=1.2
Requires-Dist: pandas>=1.3
Requires-Dist: torch>=1.12
Requires-Dist: scanpy>=1.9
Requires-Dist: joblib>=1.2
Requires-Dist: tqdm>=4.60
Requires-Dist: torchmetrics>=0.11
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"

# dora-singlecell

PyTorch implementation of **DORA**, a latent-trajectory model for single-cell drug response: drug and cell embeddings, dose response, and a gene decoder, with utilities for AnnData / perturbation-style datasets.

**PyPI package name:** `dora-singlecell`  
**Import name:** `dora`

## Installation

### From PyPI (once the package is published)

```bash
pip install dora-singlecell
```

### From GitHub (before or instead of PyPI)

```bash
pip install git+https://github.com/LBiophyEvo/dora-singlecell.git@main
```

For a local editable install while developing:

```bash
git clone https://github.com/LBiophyEvo/dora-singlecell.git
cd dora-singlecell
pip install -e .
```

## Quick start

- Load datasets 
```python
from dora import CustomDataset_mask

# Example: load preprocessed data (paths must match your layout; see utils.dataset_selection).
# First load the AnnData object (adata), the prepared dataset (gene expression arranged
# by dose response), the drug features, the cell features, and the dose trajectory.
# For example, for the Sci-Plex dataset:
dosages_standard = [0.0, 0.001, 0.01, 0.1, 1.0]
train_dataset = CustomDataset_mask(
    adata=adata,
    dataset=dataset_train,
    feature_dict_drug=feature_dict_drug,
    feature_dict_cell=feature_dict_cell,
    dosages_standard=dosages_standard,
)
```
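
The `dosages_standard` list defines the dose grid shared by all samples. As a rough illustration only, the sketch below shows one way measurements for a single drug/cell pair could be aligned to that grid, with a mask marking doses that were not measured. It is not taken from the package: `align_to_grid` and the `observed` dictionary are hypothetical, and whether `CustomDataset_mask` works this way internally is an assumption.

```python
import math

dosages_standard = [0.0, 0.001, 0.01, 0.1, 1.0]


def align_to_grid(observed, grid, rel_tol=1e-9, abs_tol=1e-12):
    """Align measurements to a dose grid (hypothetical helper, not part of dora).

    observed maps a dose to a measurement. Returns two lists as long as the
    grid: the aligned values (None where no measurement exists) and a boolean
    mask that is True where a measurement was found.
    """
    values, mask = [], []
    for dose in grid:
        found, value = False, None
        for measured_dose, measured_value in observed.items():
            # Compare floats with a tolerance instead of exact equality.
            if math.isclose(measured_dose, dose, rel_tol=rel_tol, abs_tol=abs_tol):
                found, value = True, measured_value
                break
        values.append(value)
        mask.append(found)
    return values, mask


# Toy measurements: only three of the five grid doses were observed.
observed = {0.0: 1.2, 0.01: 0.9, 1.0: 0.3}
values, mask = align_to_grid(observed, dosages_standard)
print(values)
print(mask)
```

A tolerance-based comparison is used because doses stored as floats may not compare equal exactly after preprocessing.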

- Build the model 
```python
from dora import DORA

# `dim_drug_feature`, `dim_cell_feature`, `device`, and `genes` must be defined beforehand.
dosage_len = len(dosages_standard)
hparam = {
    'lr': 1e-2,
    'wd': 4e-5,
    'dim_hid': 32,
    'dep_hid': 3,
    'nb_layer': 5,
    'n_drugs': 188,
    'n_cells': 3,
    'n_genes': dim_cell_feature,
    'dim_drug_feature': dim_drug_feature,
    'dim_cell_feature': dim_cell_feature,
    'batch_size': 128,
    'max_epoch': 700,
    'device': device,
    'cell_dim_hid': 128,
    'module': 1,
    'drug_dose_f': False,
    'max_patience': 100,
    'last_layer': 'linear',
    'step_size_lr': 35,
    'batch_norm': True,
    'param_pen': 0,
}
model = DORA(
    num_genes=hparam['n_genes'],
    num_drugs=hparam['n_drugs'],
    num_cells=hparam['n_cells'],
    genes=genes,
    dosage_len=dosage_len,
    hparam=hparam,
)
```
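
The model snippet above uses `dim_drug_feature` and `dim_cell_feature` without defining them. A minimal sketch of how they might be derived, assuming `feature_dict_drug` and `feature_dict_cell` map names to equal-length feature vectors (an assumption about the data layout; `feature_dim` is a hypothetical helper and the dictionaries below are toy data):

```python
def feature_dim(feature_dict):
    """Return the common vector length in a name -> vector dictionary.

    Hypothetical helper, not part of dora. Raises ValueError if the
    dictionary is empty or the vectors differ in length.
    """
    lengths = {len(vec) for vec in feature_dict.values()}
    if len(lengths) != 1:
        raise ValueError(f"expected one common feature length, got {sorted(lengths)}")
    return lengths.pop()


# Toy stand-ins for the real feature dictionaries (assumed layout).
feature_dict_drug = {
    "drug_a": [0.1, 0.2, 0.3, 0.4],
    "drug_b": [0.5, 0.6, 0.7, 0.8],
}
feature_dict_cell = {
    "cell_x": [1.0, 0.0, 2.0],
    "cell_y": [0.5, 1.5, 0.0],
}

dim_drug_feature = feature_dim(feature_dict_drug)
dim_cell_feature = feature_dim(feature_dict_cell)
n_drugs = len(feature_dict_drug)
n_cells = len(feature_dict_cell)
print(dim_drug_feature, dim_cell_feature, n_drugs, n_cells)
```

Because the helper only calls `len` on each vector, it should also work when the values are one-dimensional NumPy arrays.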

Training and evaluation helpers live in `dora.train`, `dora.eval`, `dora.get_latent_util`, and `dora.train_clf_test_adam`.

## Project layout

```
.
├── pyproject.toml      # package metadata & dependencies (name: dora-singlecell)
├── README.md
└── dora/               # importable Python package
    ├── __init__.py
    ├── model.py        # DORA, MLP, losses, dose modules
    ├── utils.py        # CustomDataset_mask, data loading
    ├── train.py        # model training
    ├── eval.py         # model evaluation
    ├── get_latent_util.py   # extract latent embeddings
    └── train_clf_test_adam.py # fine-tune the model
```

## Requirements

- Python ≥ 3.9  
- PyTorch, scanpy, scikit-learn, numpy, scipy, pandas, joblib, tqdm, torchmetrics (see `pyproject.toml` for versions).
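
To see which of these dependencies are present in the current environment, a small check with the standard-library `importlib.metadata` module may help. This is a sketch: `installed_versions` is a hypothetical helper, and it only reports installed versions without comparing them against the minimum versions.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the package requirements.
REQUIRED = [
    "numpy", "scipy", "scikit-learn", "pandas", "torch",
    "scanpy", "joblib", "tqdm", "torchmetrics",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


status = installed_versions(REQUIRED)
missing = sorted(name for name, ver in status.items() if ver is None)
for name, ver in status.items():
    print(f"{name}: {ver or 'not installed'}")
print("missing:", ", ".join(missing) if missing else "none")
```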


## Citation

If you use this code in a publication, please cite the associated paper (reference to be added when available).

## License

MIT. 
