Metadata-Version: 2.4
Name: docx_parser
Version: 1.0.3
Summary: parse all contents of a docx file with python-docx
Home-page: https://github.com/suqingdong/docx_parser
Author: suqingdong
Author-email: suqingdong1114@gmail.com
License: BSD License
Classifier: Development Status :: 5 - Production/Stable
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Software Development :: Libraries
Description-Content-Type: text/markdown
Requires-Dist: click
Requires-Dist: pillow
Requires-Dist: python-docx
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: requires-dist
Dynamic: summary

![PyPI](https://img.shields.io/pypi/v/docx_parser)
![GitHub last commit](https://img.shields.io/github/last-commit/suqingdong/docx_parser)

## Parse all contents of a docx file with `python-docx`

### Installation
```bash
python3 -m pip install docx-parser
```

### Features
- `paragraph`: a text paragraph, with its `style_id`
- `multipart`: a paragraph that contains an image or a hyperlink
- `table`: table data, with `merged_cells`

### Examples
- Command line
```bash
docx_parser --help

# parse images as files
docx_parser tests/demo.docx -D tests/media -o tests/out.file.jl

# parse images as base64 strings
docx_parser tests/demo.docx -A base64 -o tests/out.base64.jl
```
- Python
```python
from docx_parser import DocumentParser

infile = 'tests/demo.docx'
doc = DocumentParser(infile)
for _type, item in doc.parse():
    print(_type, item)
```
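
Since `doc.parse()` is iterated as `(_type, item)` pairs, it can be handy to collect the items by type before processing them. The following is a minimal sketch, not part of the package: `group_by_type` is a hypothetical helper, and the `sample` list is a made-up stand-in for real parser output, whose item contents may look different.

```python
from collections import defaultdict


def group_by_type(pairs):
    """Collect items from an iterable of (type, item) pairs into lists keyed by type.

    Hypothetical helper, not provided by docx_parser.
    """
    grouped = defaultdict(list)
    for _type, item in pairs:
        grouped[_type].append(item)
    return dict(grouped)


# Made-up stand-in for `DocumentParser(infile).parse()`;
# the real item structure is an assumption here.
sample = [
    ('paragraph', {'text': 'Hello'}),
    ('table', {'data': [['a', 'b']]}),
    ('paragraph', {'text': 'World'}),
]

grouped = group_by_type(sample)
for _type, items in grouped.items():
    print(_type, len(items))
```

With a real document, `group_by_type(doc.parse())` should work the same way, provided `parse()` yields two-element pairs as in the example above.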
---

### ToDo
- parse text styles: color, background color, font, bold, italic, ...
- parse paragraph format
