Metadata-Version: 1.1
Name: Pyarser
Version: 0.1.0
Summary: A Nifty HTML Parser written in Python
Home-page: https://github.com/jweinst1/Pyarser
Author: Joshua Weinstein
Author-email: jweinst1@berkeley.edu
License: UNKNOWN
Download-URL: https://github.com/jweinst1/Pyrser/tarball/0.1
Description: Pyarser is a simple, straightforward HTML parser that lets you easily harvest the text
          inside an HTML document, given a link to the website. Examples:
        
          get_site_HTML(link): returns a string of HTML content from a link
        
          get_site_text(link): returns a string of text from a link. This string has all the HTML tags <> removed, along
          with their contents.
        
          search_by_phrase(phrase, link): returns the fragments of text from a link that contain phrase as one continuous string.
        
          search_for_words(words, link): returns the fragments of text from a link that contain ANY of the strings in words.
        
          word_count(link): counts the number of text words from a link.
        
          get_HTML_tags(link): returns a list of the tags used in an HTML document from a link.
        
          HTML_to_TXT(link, name): writes a TXT file with the text content from a link. All HTML brackets and tags are removed.
        
Keywords: data_science, web, data, harvesting
Platform: UNKNOWN
