Metadata-Version: 2.1
Name: pyhtmltext
Version: 0.1
Summary: Useful tool for extracting text and sentences from HTML
Home-page: https://github.com/MaksimJames/pyhtmltext
Author: Maksim Prilepsky
Author-email: maksimprilepsky@yandex.ru
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE.md


# pyhtmltext

pyhtmltext is a useful and flexible tool for extracting text from HTML.

# Help
See [documentation](docs/USAGE.md) for more details.

# Installation
```
  pip install pyhtmltext
```

# Simple usage
```
  from pyhtmltext import Extractor


  html_string = '''<h2 class="widget-title"><span aria-hidden="true" class="icon-get-started"></span>Getting Started</h2><p>Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!</p>'''

  extractor = Extractor(html=html_string)

  # Extract the whole text from the HTML, with text blocks joined by a separator
  extractor.extract_text()
  #> "Getting Started|separator|Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!"

  # Extract a list of sentences from the HTML
  extractor.extract_sentences()
  #> ['Getting Started', "Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages.", 'The following pages are a useful first step to get on your way writing programs with Python!']
```
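
The text returned by `extract_text()` in the example above contains a `|separator|` marker between blocks. Here is a minimal sketch of splitting such a string back into blocks with plain Python. It assumes the marker is the literal string `|separator|`, as the sample output suggests. `split_blocks` is a hypothetical helper written for this illustration and is not part of pyhtmltext.

```python
# Assumption: the marker between blocks is the literal string "|separator|",
# as it appears in the sample output above.
SEPARATOR = "|separator|"


def split_blocks(text, separator=SEPARATOR):
    """Hypothetical helper: split extracted text into non-empty, stripped blocks."""
    return [part.strip() for part in text.split(separator) if part.strip()]


# Sample string copied from the extract_text() output shown above.
extracted = (
    "Getting Started|separator|Python can be easy to pick up whether you're "
    "a first time programmer or you're experienced with other languages. "
    "The following pages are a useful first step to get on your way writing "
    "programs with Python!"
)

blocks = split_blocks(extracted)
print(blocks[0])    # the heading block
print(len(blocks))  # number of blocks found
```

If the library lets you configure the separator, pass the configured value as the `separator` argument. See the documentation linked under Help for the options that are actually supported.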
