Metadata-Version: 2.4
Name: miRW
Version: 0.2.2
Summary: miRW: A Multi-Omics Random Walk Framework for Sample-Independent Construction of Personalized Protein Interaction Networks in Cancer
Author-email: "Zihao Chen; Dmitrij Frischman" <zihaochenn@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/bioUroZC/miRW
Project-URL: Issues, https://github.com/bioUroZC/miRW/issues
Keywords: Random Walk,bioinformatics,Multi-Omics,Protein Interaction Networks
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=1.5
Requires-Dist: networkx>=2.8
Requires-Dist: scipy>=1.9
Dynamic: license-file

# miRW

**miRW: A Multi-Omics Random Walk Framework for Sample-Independent Construction of Personalized Protein Interaction Networks in Cancer**

miRW is a computational framework for constructing personalized protein interaction networks. It integrates multi-omics data with a modulation-adjusted Random Walk with Restart (RWR) algorithm to refine the network for each sample, and it supports downstream analysis in cancer and other complex diseases.

---

## Installation

```
pip install miRW
```

## Example Usage

Example datasets can be downloaded from:

    https://github.com/bioUroZC/miRW/tree/main/miRWsingleLayer/data_example

### Running miRW on Example Data

    import pandas as pd
    from miRW import single_RWSeed_prepare, single_RWSeed_analysis

    # === Step 1: Load example data ===
    links = pd.read_csv("Links.csv", index_col=0)
    expr = pd.read_csv("exprSet.csv", index_col=0)

    # CellMarkers is optional
    try:
        seeds = pd.read_csv("CellMarkers.csv")
    except FileNotFoundError:
        print("CellMarkers.csv not found; using the first gene as fallback seed.")
        seeds = pd.DataFrame({"symbol": [expr.index[0]]})

    # === Step 2: Choose a sample ID ===
    sample_id = expr.columns[0]
    print("Testing sample:", sample_id)

    # === Step 3: Prepare the protein interaction network ===
    G, real_w, expr2, seeds2 = single_RWSeed_prepare(seeds, links, expr)

    # === Step 4: Run Random Walk with Restart ===
    result = single_RWSeed_analysis(sample_id, G, real_w, expr2, seeds2)

    # === Step 5: Inspect the output ===
    print(result.head())
    print("\nmiRW test completed successfully!")

---

## Input Data Format

### 1. Protein Interaction Network (Links.csv)

Required columns:

| Column   | Description                           |
|----------|---------------------------------------|
| protein1 | First protein/gene in interaction     |
| protein2 | Second protein/gene                   |
| score    | Interaction confidence score          |

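Before passing the table to `single_RWSeed_prepare`, it can help to confirm that the three required columns are present and that `score` is numeric. The snippet below is a minimal sketch of such a pre-check: `check_links` is a hypothetical helper that is not part of miRW, and the inline CSV (including its score values) is made up to stand in for a real `Links.csv`.

```python
import io
import pandas as pd

# Made-up stand-in for Links.csv; the last row has a deliberately invalid score.
csv_text = """,protein1,protein2,score
0,TP53,MDM2,999
1,EGFR,GRB2,950
2,TP53,EGFR,bad
"""

REQUIRED_COLUMNS = ["protein1", "protein2", "score"]


def check_links(links: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the interaction table (hypothetical helper, not miRW API)."""
    missing = [c for c in REQUIRED_COLUMNS if c not in links.columns]
    if missing:
        raise ValueError(f"Links table is missing columns: {missing}")
    cleaned = links[REQUIRED_COLUMNS].copy()
    # Non-numeric scores become NaN and are dropped together with incomplete rows
    cleaned["score"] = pd.to_numeric(cleaned["score"], errors="coerce")
    return cleaned.dropna(subset=REQUIRED_COLUMNS).reset_index(drop=True)


# index_col=0 matches how the usage example reads Links.csv
links = check_links(pd.read_csv(io.StringIO(csv_text), index_col=0))
print(links)
```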
---

### 2. Expression Matrix (exprSet.csv)

- Rows → Genes
- Columns → Sample IDs
- Values → Expression levels (e.g., TPM, FPKM, normalized counts)

Example:

| Gene | Sample1 | Sample2 | Sample3 |
|------|---------|---------|---------|
| TP53 | 3.12    | 2.89    | 3.55    |
| EGFR | 1.22    | 0.98    | 1.34    |

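The layout above can be checked after loading with `index_col=0`, as the usage example does. The sketch below reads the two example rows from an in-memory string and runs a few sanity checks; the `network_genes` set is a hypothetical placeholder for the genes present in your interaction table, and none of these checks are claimed to be performed by miRW itself.

```python
import io
import pandas as pd

# The example table from this section, written as CSV
csv_text = """Gene,Sample1,Sample2,Sample3
TP53,3.12,2.89,3.55
EGFR,1.22,0.98,1.34
"""

# Genes become the row index, sample IDs become the columns
expr = pd.read_csv(io.StringIO(csv_text), index_col=0)
sample_ids = list(expr.columns)

# Columns that failed to parse as numbers (should be empty)
non_numeric = [c for c in expr.columns if not pd.api.types.is_numeric_dtype(expr[c])]

# Gene symbols that occur more than once (should be empty)
duplicated_genes = expr.index[expr.index.duplicated()].tolist()

# Genes shared with the interaction network (hypothetical node set)
network_genes = {"TP53", "MDM2"}
shared = expr.loc[expr.index.isin(network_genes)]

print(sample_ids, non_numeric, duplicated_genes)
print(shared)
```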
---

### 3. Seed Gene List (CellMarkers.csv, optional)

Required column:

| Column | Description                               |
|--------|-------------------------------------------|
| symbol | Seed or marker genes used to initiate RWR |

If no seed file is available, a fallback seed is needed: the usage example above falls back to the first gene in the expression matrix.
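
Seed symbols that do not occur in the expression matrix cannot contribute to the walk, so it may be worth filtering them beforehand. The following is a minimal sketch under that assumption: `usable_seeds` is a hypothetical helper rather than a miRW function, the data are made up, and the sketch does not describe what `single_RWSeed_prepare` does internally.

```python
import pandas as pd

# Made-up expression matrix and seed list for illustration
expr = pd.DataFrame(
    {"Sample1": [3.12, 1.22], "Sample2": [2.89, 0.98]},
    index=["TP53", "EGFR"],
)
seeds = pd.DataFrame({"symbol": ["EGFR", "CD3E", "EGFR"]})


def usable_seeds(seeds: pd.DataFrame, expr: pd.DataFrame) -> pd.DataFrame:
    """Keep unique seed symbols found in expr; fall back to the first gene if none remain."""
    symbols = seeds["symbol"].dropna().drop_duplicates()
    kept = symbols[symbols.isin(expr.index)].tolist()
    if not kept:
        # Same fallback as in the usage example: first gene of the expression matrix
        kept = [expr.index[0]]
    return pd.DataFrame({"symbol": kept})


seeds_checked = usable_seeds(seeds, expr)
print(seeds_checked)
```

The fallback branch mirrors the `FileNotFoundError` handling in the usage example, so a seed list with no matching genes is treated the same way as a missing file.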

---

## Output

The function `single_RWSeed_analysis` returns a DataFrame with the following columns:

| Column    | Description                            |
|-----------|----------------------------------------|
| Sample    | Sample ID                              |
| link      | Sorted gene pair (e.g., GeneA_GeneB)   |
| miRW-Imp  | Importance-adjusted interaction weight |
| miRW-Flow | Flow-based score derived from RWR      |

These scores can be used for network-based biomarker discovery, pathway investigation, and integration with downstream multi-omics analyses.
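
As one possible starting point for such analyses, the sketch below ranks edges by `miRW-Flow` and sums the flow per gene. The `result` DataFrame is a made-up stand-in that only reuses the documented column names; its values are not real miRW output. Splitting `link` at the first underscore assumes that gene symbols contain no underscore themselves.

```python
import pandas as pd

# Made-up stand-in for the DataFrame returned by single_RWSeed_analysis
result = pd.DataFrame({
    "Sample": ["Sample1"] * 3,
    "link": ["EGFR_GRB2", "MDM2_TP53", "EGFR_TP53"],
    "miRW-Imp": [0.42, 0.91, 0.10],
    "miRW-Flow": [0.030, 0.075, 0.002],
})

# Column names contain a hyphen, so use bracket access rather than attributes
top = result.sort_values("miRW-Flow", ascending=False).head(2)

# Split each link into its two genes (assumes no underscore inside a symbol)
genes = result["link"].str.split("_", n=1, expand=True)
genes.columns = ["gene_a", "gene_b"]
edges = pd.concat([result, genes], axis=1)

# Sum the flow over all edges touching each gene
long = pd.concat([
    edges[["gene_a", "miRW-Flow"]].rename(columns={"gene_a": "gene"}),
    edges[["gene_b", "miRW-Flow"]].rename(columns={"gene_b": "gene"}),
])
per_gene = long.groupby("gene")["miRW-Flow"].sum().sort_values(ascending=False)

print(top)
print(per_gene)
```

Summing per gene is only one choice of aggregation; a mean or maximum may suit other questions better.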

---

## Features

- Modulation-adjusted edge weights using expression deviation
- Sparse matrix–optimized Random Walk with Restart
- Personalized network construction for each sample
- Outputs interpretable edge-level importance and flow scores

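For readers unfamiliar with Random Walk with Restart, the sketch below shows the textbook iteration `p_next = (1 - r) * W @ p + r * p0` on a small dense matrix. It is a generic illustration only: it does not include miRW's modulation adjustment, it uses dense NumPy arrays rather than sparse matrices, and the function name, restart probability, and toy graph are all assumptions made for this example.

```python
import numpy as np


def random_walk_with_restart(adj, seed_idx, restart=0.5, tol=1e-10, max_iter=1000):
    """Textbook RWR on a dense adjacency matrix (illustrative, not miRW's implementation)."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    W = adj / col_sums  # column-normalized transition matrix

    # Restart vector: uniform probability over the seed nodes
    p0 = np.zeros(adj.shape[0])
    p0[list(seed_idx)] = 1.0 / len(seed_idx)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2 with node 0 as the only seed
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
p = random_walk_with_restart(adj, seed_idx=[0])
print(p)  # visiting probabilities decrease with distance from the seed
```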
---

## License

This project is licensed under the MIT License.

---

## Project Repository

GitHub: https://github.com/bioUroZC/miRW  
PyPI: https://pypi.org/project/miRW/
