Metadata-Version: 2.1
Name: midv500models
Version: 0.0.1
Summary: Document segmentation.
Home-page: https://github.com/ternaus/midv-500-models
Author: Vladimir Iglovikov
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/markdown
Requires-Dist: albumentations
Requires-Dist: iglovikov-helper-functions
Requires-Dist: pytorch-lightning
Requires-Dist: torch
Provides-Extra: test
Requires-Dist: pytest ; extra == 'test'

# midv-500-models
The repository contains a model for binary semantic segmentation of documents.

![](https://habrastorage.org/webt/gy/-t/xn/gy-txnzezlnurcwwlv7q5vs77x4.jpeg)

* **Left**: input.
* **Center**: prediction.
* **Right**: overlay of the image and predicted mask.
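An overlay like the one on the right can be produced by alpha-blending a color into the image wherever the predicted mask is set; a minimal NumPy sketch (the function name and the green tint are illustrative choices, not part of this repository):

```python
import numpy as np

def overlay_mask(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a green tint into the image wherever the binary mask is non-zero."""
    color = np.array([0, 255, 0], dtype=np.float64)  # green overlay
    out = image.astype(np.float64)
    idx = mask.astype(bool)
    out[idx] = (1 - alpha) * out[idx] + alpha * color
    return out.astype(np.uint8)

# Toy example: a 2x2 gray image with the top row masked as "document".
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[255, 255], [0, 0]], dtype=np.uint8)
blended = overlay_mask(image, mask, alpha=0.5)
```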

For more details: [Example notebook](Example.ipynb)

## Dataset
The model is trained on [MIDV-500: A Dataset for Identity Documents Analysis and Recognition on Mobile Devices in Video Stream](https://arxiv.org/abs/1807.05786).

### Preparation

Download the dataset from the FTP server:
```bash
wget -r ftp://smartengines.com/midv-500/
```

Unpack the dataset
```bash
cd smartengines.com/midv-500/dataset/
unzip \*.zip
```

The resulting folder structure will be

```bash
smartengines.com
    midv-500
        dataset
            01_alb_id
                ground_truth
                    CA
                        CA01_01.json
                    ...
                images
                    CA
                        CA01_01.tif
                    ...
                ...
            ...
        ...
    ...
```
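Each frame in an `images/` subfolder has a ground-truth annotation with the same stem in the matching `ground_truth/` subfolder. The pairing can be sketched with `pathlib` (the matching logic is an assumption based on the layout above, not code from this repository):

```python
import tempfile
from pathlib import Path

def pair_files(dataset_dir: Path):
    """Yield (image, annotation) pairs by matching file stems across subfolders."""
    pairs = []
    for image_path in sorted(dataset_dir.glob("*/images/*/*.tif")):
        doc_dir = image_path.parents[2]  # e.g. 01_alb_id
        annotation_path = (doc_dir / "ground_truth" / image_path.parent.name
                           / f"{image_path.stem}.json")
        if annotation_path.exists():
            pairs.append((image_path, annotation_path))
    return pairs

# Toy demonstration on a temporary directory mirroring the layout above.
root = Path(tempfile.mkdtemp())
(root / "01_alb_id" / "images" / "CA").mkdir(parents=True)
(root / "01_alb_id" / "ground_truth" / "CA").mkdir(parents=True)
(root / "01_alb_id" / "images" / "CA" / "CA01_01.tif").touch()
(root / "01_alb_id" / "ground_truth" / "CA" / "CA01_01.json").touch()
pairs = pair_files(root)
```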

To preprocess the data use the script
```bash
python midv500models/preprocess_data.py -i <input_folder> \
                                        -o <output_folder>
```

where `input_folder` is the folder with the unpacked dataset. The output folder will have the structure:

```bash
images
    CA01_01.jpg
    ...
masks
    CA01_01.png
```

The target binary masks take values \[0, 255\], where 0 is the background and 255 is the document.
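When loading these masks for training, they usually need to be converted from {0, 255} to a {0, 1} target; a minimal NumPy sketch:

```python
import numpy as np

# Toy mask as produced by the preprocessing step: 0 = background, 255 = document.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Threshold to {0, 1} for use as a binary training target.
binary_target = (mask > 127).astype(np.uint8)
```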

## Training

```bash
python midv500models/train.py -c midv500models/configs/2020-05-19.yaml \
                              -i <path to train>
```

## Inference

```bash
python midv500models/inference.py -c midv500models/configs/2020-05-19.yaml \
                                  -i <path to images> \
                                  -o <path to save predictions> \
                                  -w <path to weights>
```
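A binary segmentation network outputs one logit per pixel; turning logits into a {0, 255} mask like the ones above is a sigmoid followed by a threshold. A sketch of that post-processing step (not the repository's exact inference code):

```python
import numpy as np

def logits_to_mask(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Map per-pixel logits to a {0, 255} uint8 mask via sigmoid + threshold."""
    probabilities = 1.0 / (1.0 + np.exp(-logits))
    return np.where(probabilities > threshold, 255, 0).astype(np.uint8)

# Toy example: strongly negative logits -> background, positive -> document.
logits = np.array([[-4.0, 3.0], [2.0, -1.0]])
mask = logits_to_mask(logits)
```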

## Example notebook
[Example notebook](Example.ipynb)

## Weights
Unet with a Resnet34 backbone: [Config](midv500models/configs/2020-05-19.yaml), [Weights](Unet_Resnet34.pth)


