Metadata-Version: 2.4
Name: stat-agent
Version: 0.1.8
Summary: STAT: Spatial Transcriptomics Analytical agenT - AI-powered platform for spatial omics analysis
Author-email: Yihang Chen <ychenlp@connect.ust.hk>
License-Expression: BSD-3-Clause
Project-URL: Homepage, https://github.com/chenyhvvvv/STAT-agent
Project-URL: Repository, https://github.com/chenyhvvvv/STAT-agent
Project-URL: Bug Tracker, https://github.com/chenyhvvvv/STAT-agent/issues
Keywords: spatial transcriptomics,single-cell,agent,AI,bioinformatics,anndata,scanpy
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=1.3
Requires-Dist: anndata>=0.9
Requires-Dist: scipy>=1.7
Requires-Dist: scanpy>=1.9
Requires-Dist: geopandas>=0.10
Requires-Dist: shapely>=2.0
Requires-Dist: pillow>=9.0
Requires-Dist: tifffile>=2021.0
Requires-Dist: matplotlib>=3.5
Requires-Dist: seaborn>=0.11
Requires-Dist: flask>=3.0
Requires-Dist: anthropic>=0.18
Requires-Dist: openai>=1.12
Requires-Dist: pyyaml>=6.0
Provides-Extra: skills
Requires-Dist: squidpy>=1.2; extra == "skills"
Requires-Dist: gseapy>=1.0; extra == "skills"
Requires-Dist: liana; extra == "skills"
Requires-Dist: cellphonedb<6.0,>=5.0; extra == "skills"
Requires-Dist: plotnine>=0.12; extra == "skills"
Requires-Dist: harmonypy>=0.0.9; extra == "skills"
Requires-Dist: scvi-tools<2.0,>=1.0; extra == "skills"
Requires-Dist: qpsolvers>=4.0; extra == "skills"
Requires-Dist: ray>=2.0; extra == "skills"
Requires-Dist: NaiveDE>=0.1; extra == "skills"
Requires-Dist: SpatialDE>=1.1; extra == "skills"
Requires-Dist: scikit-misc>=0.5; extra == "skills"
Requires-Dist: igraph>=0.9; extra == "skills"
Requires-Dist: torch>=2.0; extra == "skills"
Requires-Dist: SpaGCN>=1.2.5; extra == "skills"
Requires-Dist: nrrd>=1.0; extra == "skills"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Dynamic: license-file

# STAT

**Spatial Transcriptomics Analytical agenT**

An AI-powered platform for spatial omics analysis with multi-format support, interactive visualization, and intelligent code generation.

## Features

- **AI Agent**: Natural-language interface for spatial transcriptomics analysis; ask a question and get results
- **Multi-format support**: Single-slice, multi-slice, and multi-omics (gene + protein) datasets
- **Interactive viewer**: Canvas-based spatial visualization with zoom/pan, ROI drawing, and cell overlays
- **Skill system**: Extensible analysis skills (cell type annotation, deconvolution, spatial domains, etc.)
- **Code execution**: Agent generates and runs analysis code in a sandboxed environment
- **Multi-provider LLM**: Works with OpenAI, Anthropic, Google, DeepSeek, and Poe

## Installation

```bash
pip install stat-agent
```

To include all analysis skill dependencies (squidpy, scvi-tools, torch, liana, etc.):

```bash
pip install "stat-agent[skills]"
```
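
After installing, it can be useful to confirm which optional packages are importable in the current environment. The snippet below is a minimal sketch that uses only the standard library; the helper name `check_optional_modules` and the short module list are illustrative and are not part of STAT.

```python
from importlib.util import find_spec

# Import names for a few of the optional "skills" dependencies.
# Note that the scvi-tools distribution is imported as `scvi`.
OPTIONAL_MODULES = {
    "squidpy": "squidpy",
    "scvi-tools": "scvi",
    "torch": "torch",
    "liana": "liana",
}


def check_optional_modules(modules=OPTIONAL_MODULES):
    """Return {distribution name: bool} for whether each module can be found."""
    return {dist: find_spec(module) is not None for dist, module in modules.items()}


if __name__ == "__main__":
    for dist, available in check_optional_modules().items():
        print(f"{dist:12s} {'ok' if available else 'missing'}")
```

Packages reported as missing can be added later by reinstalling with the `skills` extra.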

## Quick Start

### Web Interface

```bash
stat-web
# Open http://localhost:8889
```

Or use the startup script (includes JupyterLab):

```bash
./start_web.sh
```

### Using the Web UI

1. Enter the path to your dataset directory
2. Configure the LLM (API key, model)
3. Click "Load Dataset"
4. Ask questions in the chat panel: *"Annotate cell types"*, *"Find spatially variable genes"*, *"Show BRCA1 expression"*

## Data Format

STAT auto-detects your data layout. Place files in a single directory:

**Single-slice:**
```
dataset/
├── tissue.h5ad          # Required: AnnData with x, y coordinates in obs
└── he.tif               # Optional: H&E image (pixel coords = cell coords)
```

**Multi-slice:**
```
dataset/
├── tissue_slice_0.h5ad
├── he_slice_0.tif
├── tissue_slice_1.h5ad
└── he_slice_1.tif
```

**Multi-omics:**
```
dataset/
├── tissue.h5ad          # Gene expression
├── tissue_protein.h5ad  # Protein expression
├── he.tif
└── protein_CD3.tif
```
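
The naming patterns in these listings suggest a simple filename-based check. The sketch below is an illustrative heuristic only, not STAT's actual detection logic: the function name `guess_layout` and its rules (`_slice_` in a filename marks multi-slice, a `_protein.h5ad` suffix marks multi-omics) are assumptions drawn from the example directories.

```python
from pathlib import Path


def guess_layout(dataset_dir):
    """Classify a dataset directory by the file-naming patterns shown above.

    Hypothetical helper for illustration; STAT's own detection may differ.
    """
    names = sorted(p.name for p in Path(dataset_dir).glob("*.h5ad"))
    if not names:
        raise FileNotFoundError(f"no .h5ad files found in {dataset_dir}")
    # Files such as tissue_slice_0.h5ad indicate several slices.
    if any("_slice_" in name for name in names):
        return "multi-slice"
    # A companion tissue_protein.h5ad indicates gene + protein data.
    if any(name.endswith("_protein.h5ad") for name in names):
        return "multi-omics"
    return "single-slice"
```

Calling `guess_layout("dataset/")` on a directory is a quick way to check that the files are named as intended before loading them in the web UI.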

**Key**: Cell coordinates `(x, y)` in `adata.obs` map directly to image pixels `(x, y)`. No coordinate transformation is needed.
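
Because cell coordinates are already pixel coordinates, an image patch around a cell can be located with plain arithmetic. The following is a minimal sketch under that assumption; `crop_box` and its parameters are hypothetical names, not part of STAT's API.

```python
def crop_box(x, y, half_size, image_width, image_height):
    """Return a (left, upper, right, lower) box around a cell, clamped to the image.

    x and y are the cell's coordinates from adata.obs, used directly as pixel
    coordinates. The tuple follows the box convention of Pillow's Image.crop.
    """
    cx, cy = int(round(x)), int(round(y))
    left = max(cx - half_size, 0)
    upper = max(cy - half_size, 0)
    right = min(cx + half_size, image_width)
    lower = min(cy + half_size, image_height)
    if left >= right or upper >= lower:
        raise ValueError(f"cell at ({x}, {y}) lies outside the image")
    return left, upper, right, lower
```

With Pillow, the returned tuple can be passed to `Image.crop`. With a NumPy image array, which is indexed `[row, column]`, the same patch is `image[upper:lower, left:right]`, so `y` selects the row and `x` the column.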

## Built-in Skills

| Skill | Description |
|-------|-------------|
| Cell Type Annotation (GPT) | Unsupervised clustering + LLM-based annotation |
| Cell Type Annotation (scANVI) | Transfer learning from scRNA-seq reference |
| Deconvolution (RCTD) | Spot-level cell type deconvolution |
| Spatial Domains (SpaGCN) | Graph-based spatial domain identification |
| SVG (SpatialDE) | Spatially variable gene detection |
| Neighborhood Enrichment | Cell type co-localization analysis |
| Cell Communication (LIANA+) | Ligand-receptor interaction analysis |
| Cell Communication (CellPhoneDB) | Permutation-based interaction testing |
| GO Enrichment | Gene Ontology term enrichment analysis |
| Niche Detection (Harmonics) | Spatial niche identification |
| Integration (Harmony) | Multi-slice batch correction |
| Alignment (STalign) | Spatial slice alignment |

## Architecture

```
User Query → QueryPlanner → SkillFilter → LLM Matching → SkillVerifier → Code Generation → Execution
```

- **QueryPlanner**: Determines target slices, breaks complex queries into steps
- **SkillFilter**: Programmatic filtering by modality, data level, number of slices
- **SkillVerifier**: Checks prerequisites, requests missing information
- **SpatialAgent**: Generates analysis code using skill instructions + session context
- **CodeExecutor**: Sandboxed execution with state change detection
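
As a rough illustration of the SkillFilter step, the sketch below filters a list of skills by modality, data level, and number of slices before any LLM matching takes place. The `SkillSpec` fields, the `filter_skills` function, and the example metadata values are assumptions made for this example; the actual implementation in `skill_filter.py` may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillSpec:
    """Hypothetical skill metadata; the field names are assumptions."""
    name: str
    modalities: frozenset   # e.g. {"gene"} or {"gene", "protein"}
    data_level: str         # e.g. "cell" or "spot"
    min_slices: int = 1


def filter_skills(skills, modality, data_level, n_slices):
    """Keep only skills compatible with the current session."""
    return [
        s for s in skills
        if modality in s.modalities
        and s.data_level == data_level
        and n_slices >= s.min_slices
    ]


# Illustrative entries; the metadata values are made up for this sketch.
skills = [
    SkillSpec("deconvolution_rctd", frozenset({"gene"}), "spot"),
    SkillSpec("spatial_domains_spagcn", frozenset({"gene"}), "cell"),
    SkillSpec("integration_harmony", frozenset({"gene"}), "cell", min_slices=2),
]

candidates = filter_skills(skills, modality="gene", data_level="cell", n_slices=1)
print([s.name for s in candidates])
```

Filtering programmatically first keeps the candidate list passed to the LLM short, so the matching step only has to choose among skills that can actually run on the loaded data.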

## Project Structure

```
stat_agent/
├── core/                  # Data layer
│   ├── session.py         # Multi-slice/multi-omics session
│   ├── data_slice.py      # Single data slice wrapper
│   └── roi_manager.py     # ROI geometry management
├── agent/                 # Agent pipeline
│   ├── spatial_agent_core.py
│   ├── conversation_orchestrator.py
│   ├── pipeline_executor.py
│   ├── query_planner.py
│   ├── skill_registry.py
│   ├── skill_filter.py
│   ├── skill_verifier.py
│   ├── llm_backend.py
│   └── memory.py
└── functions/
    └── io.py              # Data loading
.claude/skills/            # Skill definitions (SKILL.md + helper libs)
web_interface.py           # Flask backend + API endpoints
static/                    # Frontend (JS + CSS)
templates/                 # HTML templates
```

## License

BSD-3-Clause
