Metadata-Version: 2.4
Name: any-document-extractor
Version: 0.1.1
Summary: A Python library for extracting text content from any document format.
Home-page: 
Author: yeqing
Author-email: 215777@qq.com
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: python-docx>=1.2.0
Requires-Dist: python-multipart>=0.0.20
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pdfminer.six
Requires-Dist: camelot-py>=1.0.9
Requires-Dist: python-pptx>=1.0.2
Dynamic: author
Dynamic: author-email
Dynamic: description
Dynamic: description-content-type
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Any Document Extractor

A Python library for extracting text content from any document format.

## Features

- Supports multiple document formats (PPTX, DOCX, PDF, XLSX)
- Returns clean extracted text

## Installation

```bash
pip install any-document-extractor
```

## Usage
Basic usage example:

```python
from anydocumentextractor import DocumentExtractor


def main(fp: str):
    """Extract and return the content of the document at `fp`."""
    extractor = DocumentExtractor(fp)
    return extractor.extract()


if __name__ == '__main__':
    fp = 'text.docx'  # Can be any supported document
    content = main(fp)
    print(content)
```
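
To process several files at once, the basic call above can be wrapped in a small helper. The following is a minimal sketch, assuming only the `DocumentExtractor(path).extract()` call shown in the basic example. `extract_many`, `SUPPORTED_SUFFIXES`, and the `extract` parameter are hypothetical names introduced for illustration and are not part of the library. The suffix set mirrors the Features list, and because the exception types raised by the library are not documented here, the sketch catches `Exception` broadly.

```python
from pathlib import Path

# Assumption: suffixes mirror the Features list (PPTX, DOCX, PDF, XLSX).
SUPPORTED_SUFFIXES = {".pptx", ".docx", ".pdf", ".xlsx"}


def _default_extract(path: str):
    # Imported lazily so the helper itself can be imported
    # even when the package is not installed.
    from anydocumentextractor import DocumentExtractor
    return DocumentExtractor(path).extract()


def extract_many(paths, extract=_default_extract):
    """Return {path: content}; unsupported suffixes are skipped, failures map to None."""
    results = {}
    for p in paths:
        path = Path(p)
        if path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue  # skip files this sketch does not treat as supported
        try:
            results[str(path)] = extract(str(path))
        except Exception as exc:  # broad on purpose: error types are an unknown here
            print(f"Failed to extract {path}: {exc}")
            results[str(path)] = None
    return results
```

Passing a custom `extract` callable makes the helper easy to test without real documents; calling `extract_many(["report.docx", "slides.pptx"])` with the default uses the library itself.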

## Supported Formats
- Microsoft Office: PPTX, DOCX, XLSX
- OpenDocument: ODT, ODP
- PDF documents
- Plain text files
- And more...

## License
MIT License - Free for commercial and personal use.


