Metadata-Version: 2.4
Name: pypxml
Version: 3.2.0
Summary: A python library for parsing, converting and modifying PageXML files. 
Author-email: Janik Haitz <jahtz.dev@proton.me>
License: Apache 2.0
Project-URL: Repository, https://github.com/jahtz/pypxml
Keywords: PageXML,XML,OCR,optical character recognition
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: lxml~=5.3.1
Requires-Dist: rich_click~=1.8.8
Dynamic: license-file

# PyPXML
A Python library for parsing, converting, and modifying PageXML files.

## Setup
>[!NOTE]
>Requires Python `>=3.11`.

### Install from PyPI
```shell
pip install pypxml
```

### Install from source
1. Clone the repository: `git clone https://github.com/jahtz/pypxml`
2. Install the package: `cd pypxml && pip install .`

## API
PyPXML provides a feature-rich Python API for working with PageXML files.

See the full [documentation](docs/DOCUMENTATION.md).
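
For readers new to the format, the sketch below shows the kind of structure a PageXML file contains: regions, text lines, and their transcriptions. It deliberately uses only Python's standard `xml.etree.ElementTree` and does not use PyPXML's API, which is described in the documentation linked above. The sample document, the `list_lines` helper, and the 2019-07-15 schema namespace are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written PageXML document (illustrative only).
SAMPLE = """\
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="0001.png" imageWidth="1200" imageHeight="1800">
    <TextRegion id="r1" type="paragraph">
      <Coords points="10,10 500,10 500,200 10,200"/>
      <TextLine id="l1">
        <Coords points="10,10 500,10 500,60 10,60"/>
        <TextEquiv><Unicode>Hello PageXML</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def list_lines(xml_text: str) -> list[tuple[str, str, str]]:
    """Return (region id, line id, text) for every text line in a PageXML string.

    Hypothetical helper for illustration; not part of PyPXML.
    """
    root = ET.fromstring(xml_text)
    # Read the namespace from the root tag so other schema versions also work.
    uri = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    ns = {"pc": uri}

    lines = []
    for region in root.iterfind(".//pc:TextRegion", ns):
        # Only the lines that are direct children of this region.
        for line in region.iterfind("pc:TextLine", ns):
            text = line.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=ns)
            lines.append((region.get("id", ""), line.get("id", ""), text))
    return lines


if __name__ == "__main__":
    for region_id, line_id, text in list_lines(SAMPLE):
        print(region_id, line_id, text)
```

Handling namespaces, schema versions, and element attributes by hand like this becomes tedious quickly, which is the kind of work a dedicated PageXML library is meant to take over.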

## CLI
```bash
pypxml --help
```
  
## ZPD
Developed at the [Centre for Philology and Digitality](https://www.uni-wuerzburg.de/en/zpd/) (ZPD), [University of Würzburg](https://www.uni-wuerzburg.de/en/).
