Metadata-Version: 2.4
Name: controlnet_dwpose
Version: 0.1.2
Summary: DWPose component from ControlNeXt for whole-body pose estimation
Home-page: https://github.com/kapong/controlnet_dwpose
Author: P.Phienphanich
Author-email: garpong@gmail.com
Project-URL: Bug Reports, https://github.com/kapong/controlnet_dwpose/issues
Project-URL: Source, https://github.com/kapong/controlnet_dwpose
Project-URL: Original ControlNeXt, https://github.com/dvlab-research/ControlNeXt
Project-URL: Original DWPose, https://github.com/IDEA-Research/DWPose
Keywords: pose-estimation controlnext controllable-generation onnx
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Processing
Classifier: Topic :: Multimedia :: Video
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: opencv-python
Requires-Dist: onnxruntime-gpu
Requires-Dist: torch
Requires-Dist: matplotlib
Requires-Dist: decord
Requires-Dist: tqdm
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# DWPose

[![PyPI version](https://badge.fury.io/py/controlnet-dwpose.svg)](https://badge.fury.io/py/controlnet-dwpose)
[![Build Status](https://github.com/kapong/controlnet_dwpose/workflows/Test%20Package/badge.svg)](https://github.com/kapong/controlnet_dwpose/actions)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

A standalone Python library for whole-body pose estimation extracted from the [ControlNeXt](https://github.com/dvlab-research/ControlNeXt) project.

> **Note**: This is a modified version of the DWPose implementation from the original ControlNeXt repository. The code has been restructured for standalone use and includes API improvements for better usability.

## ⚠️ AI Generation Disclaimer

This repository's packaging, documentation, and API improvements were generated with assistance from Claude AI. While the core DWPose algorithms remain unchanged from the original research implementations, the following components were created using AI assistance:

- Repository structure and packaging (setup.py, requirements.txt)
- Documentation (README.md, CLAUDE.md)
- API improvements and code organization
- Installation and usage examples

The original DWPose research and implementation credit belongs to the respective authors (Yang et al. and Peng et al.). This packaging is provided for educational and research purposes.

## Overview

DWPose provides dense keypoint detection for body, hands, and face using ONNX Runtime. It supports both CPU and GPU inference and is designed for efficient pose estimation in images and videos.
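
Because inference runs through ONNX Runtime, whether the GPU path is usable depends on which execution providers the installed build exposes. The sketch below shows one way to choose a value for the `device` argument. `pick_device` is a hypothetical helper and not part of this package, and it assumes the detector accepts the strings `'cpu'` and `'cuda'` as shown in the Quick Start.

```python
def pick_device(available_providers):
    """Return 'cuda' if the CUDA execution provider is listed, else 'cpu'.

    Hypothetical helper for illustration; not part of this package.
    """
    return "cuda" if "CUDAExecutionProvider" in available_providers else "cpu"


try:
    import onnxruntime as ort  # provided by onnxruntime or onnxruntime-gpu

    providers = ort.get_available_providers()
except ImportError:
    # ONNX Runtime is not installed; fall back to an empty provider list.
    providers = []

device = pick_device(providers)
print(f"Using device: {device}")
```

The resulting `device` string can then be passed to the detector constructor instead of a hard-coded value.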

## Features

- **Whole-body pose estimation**: Detects body (18 keypoints), hands (21 keypoints each), and face (68 keypoints)
- **ONNX Runtime support**: Efficient inference with CPU and CUDA providers
- **Video processing**: Temporal consistency with pose rescaling across frames
- **Memory efficient**: Lazy loading and explicit memory management
- **Easy to use**: Simple API with minimal setup

## Installation

### From PyPI
```bash
pip install controlnet_dwpose
```

### From source
```bash
git clone https://github.com/kapong/controlnet_dwpose.git
cd controlnet_dwpose
pip install -e .
```

### Dependencies
Install required dependencies:
```bash
pip install -r requirements.txt
```

## Quick Start

### Basic Usage

```python
import cv2
import numpy as np
from dwpose import DWposeDetector

# Initialize detector
detector = DWposeDetector(
    model_det="yolox_l.onnx",
    model_pose="dw-ll_ucoco_384.onnx",
    device='cpu'  # or 'cuda'
)

# Load image
image = cv2.imread('example/02.jpeg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Detect poses
pose_result = detector(image)

# Access results
bodies = pose_result['bodies']
hands = pose_result['hands']
faces = pose_result['faces']

# Clean up memory when done
detector.release_memory()
```

### Processing Images

```python
from dwpose.preprocess import get_image_pose

# Get pose visualization
pose_image = get_image_pose(detector, image)
```

### Processing Videos

```python
from dwpose.preprocess import get_video_pose

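# Process a video; poses are rescaled against `reference_image`,
# an RGB image you load yourself (for example `image` from Basic Usage)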
pose_sequence = get_video_pose(
    dwprocessor=detector,
    video_path="path/to/video.mp4",
    ref_image=reference_image,
    sample_stride=1
)
```
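
To make the effect of `sample_stride` concrete, here is a minimal sketch that assumes the parameter means "keep every k-th frame", which is the usual convention. The package may additionally adjust the stride for the video frame rate, so treat this as an illustration only. `sampled_frame_indices` is a hypothetical helper, not part of the library.

```python
def sampled_frame_indices(num_frames, sample_stride=1):
    """Indices kept when taking every `sample_stride`-th frame.

    Hypothetical helper; assumes a plain fixed-stride sampling scheme.
    """
    if sample_stride < 1:
        raise ValueError("sample_stride must be >= 1")
    return list(range(0, num_frames, sample_stride))


print(sampled_frame_indices(10, 1))  # every frame of a 10-frame clip
print(sampled_frame_indices(10, 3))  # [0, 3, 6, 9]
```

A larger stride reduces the number of frames that go through detection, at the cost of coarser temporal resolution.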

## Model Requirements

You need to download the ONNX models:

1. **Detection model**: `yolox_l.onnx` - YOLOX-L for human detection
2. **Pose model**: `dw-ll_ucoco_384.onnx` - DWPose for keypoint estimation

### Download Models

Install gdown for downloading from Google Drive:
```bash
pip install gdown
```

Download the required models:
```python
import gdown

# Download pose estimation model
gdown.download('https://drive.google.com/uc?id=12L8E2oAgZy4VACGSK9RaZBZrfgx7VTA2', 'dw-ll_ucoco_384.onnx', quiet=False)

# Download detection model
gdown.download('https://drive.google.com/uc?id=1w9pXC8tT0p9ndMN-CArp1__b2GbzewWI', 'yolox_l.onnx', quiet=False)
```

Or from command line:
```bash
gdown 'https://drive.google.com/uc?id=12L8E2oAgZy4VACGSK9RaZBZrfgx7VTA2' -O dw-ll_ucoco_384.onnx
gdown 'https://drive.google.com/uc?id=1w9pXC8tT0p9ndMN-CArp1__b2GbzewWI' -O yolox_l.onnx
```
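
Before constructing the detector, it can help to confirm that both files are actually in place. The following is a minimal sketch using only the standard library. `missing_models` is a hypothetical helper, and the assumption that both models sit in one directory under the file names used above is ours, not a requirement of the package.

```python
from pathlib import Path

# File names as used in the download commands above.
REQUIRED_MODELS = ("yolox_l.onnx", "dw-ll_ucoco_384.onnx")


def missing_models(model_dir=".", required=REQUIRED_MODELS):
    """Return the required model file names not found in `model_dir`."""
    base = Path(model_dir)
    return [name for name in required if not (base / name).is_file()]


missing = missing_models(".")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found.")
```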

## Output Format

The pose detection returns a dictionary with:

- `bodies`: Body keypoints with shape `(N, 18, 2)` for N detected persons
- `hands`: Hand keypoints with shape `(N, 42, 2)` (both hands combined)
- `faces`: Face keypoints with shape `(N, 68, 2)`
- `*_score`: Confidence scores for each keypoint type

Coordinates are normalized (0-1) relative to image dimensions.
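
Since the coordinates are normalized, drawing or cropping requires scaling them back to pixels. The sketch below uses synthetic arrays with the shapes listed above rather than real detector output. `to_pixels` and `split_hands` are hypothetical helpers. It assumes the last axis is ordered `(x, y)` and that the 42 hand points are stored as two consecutive groups of 21; neither assumption is confirmed here.

```python
import numpy as np


def to_pixels(keypoints, width, height):
    """Scale normalized (x, y) keypoints of shape (..., 2) to pixel units."""
    scale = np.array([width, height], dtype=np.float32)
    return np.asarray(keypoints, dtype=np.float32) * scale


def split_hands(hands):
    """Split (N, 42, 2) into two (N, 21, 2) arrays, one per hand.

    Assumption: the first 21 points are one hand, the last 21 the other.
    """
    hands = np.asarray(hands)
    return hands[:, :21], hands[:, 21:]


# Synthetic data standing in for detector output (one person).
bodies = np.full((1, 18, 2), 0.5, dtype=np.float32)
hands = np.zeros((1, 42, 2), dtype=np.float32)

body_px = to_pixels(bodies, width=640, height=480)
first_hand, second_hand = split_hands(hands)
print(body_px[0, 0])                         # [320. 240.]
print(first_hand.shape, second_hand.shape)   # (1, 21, 2) (1, 21, 2)
```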

## Requirements

- Python >= 3.10
- numpy
- opencv-python
- onnxruntime-gpu (or onnxruntime for CPU-only)
- torch
- matplotlib
- decord
- tqdm

## Attribution

This code is extracted and packaged from the ControlNeXt project:

- **ControlNeXt Repository**: [dvlab-research/ControlNeXt](https://github.com/dvlab-research/ControlNeXt)
- **Source Directory**: [ControlNeXt-SVD-v2/dwpose](https://github.com/dvlab-research/ControlNeXt/tree/main/ControlNeXt-SVD-v2/dwpose)
- **ControlNeXt Authors**: Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia

### Original DWPose

The DWPose implementation used by Peng et al. in ControlNeXt is based on the original DWPose research:

- **Original DWPose Repository**: [IDEA-Research/DWPose](https://github.com/IDEA-Research/DWPose)
- **Original DWPose Authors**: Zhendong Yang, Ailing Zeng, Chun Yuan, Yu Li
- **Paper**: "Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023)

### Modifications

This repository contains the following modifications from the original ControlNeXt implementation:

- **Standalone packaging**: Restructured as an independent Python package with proper setup.py
- **API improvements**: Enhanced function signatures for better usability
- **Documentation**: Added comprehensive README, examples, and installation instructions
- **Dependency management**: Proper requirements.txt and package dependencies
- **Model download**: Integrated Google Drive download links for pre-trained models
- **Memory management**: Improved memory cleanup and resource handling

### Citation

If you use this code, please cite both the ControlNeXt paper and the original DWPose paper:

```bibtex
@article{peng2024controlnext,
  title={ControlNeXt: Powerful and Efficient Control for Image and Video Generation},
  author={Peng, Bohao and Wang, Jian and Zhang, Yuechen and Li, Wenbo and Yang, Ming-Chang and Jia, Jiaya},
  journal={arXiv preprint arXiv:2408.06070},
  year={2024}
}

@inproceedings{yang2023effective,
  title={Effective Whole-body Pose Estimation with Two-stages Distillation},
  author={Yang, Zhendong and Zeng, Ailing and Yuan, Chun and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={4210--4219},
  year={2023}
}
```

## License

This project is licensed under the Apache License 2.0. See the `LICENSE` file and the original [ControlNeXt repository](https://github.com/dvlab-research/ControlNeXt) for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Issues

If you encounter any problems, please open an issue on the [GitHub repository](https://github.com/kapong/controlnet_dwpose/issues).
