Metadata-Version: 2.1
Name: webarchiver
Version: 0.19.0
Summary: Python tool that allows you to take multiple full page screenshots of web pages without ads.
Home-page: https://github.com/Knuckles-Team/webarchiver
Author: Audel Rouhi
Author-email: knucklessg1@gmail.com
License: Unlicense
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: Public Domain
Classifier: Environment :: Console
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: Pillow (>=9.3.0)
Requires-Dist: beautifulsoup4 (>=4.11.2)
Requires-Dist: piexif (>=1.1.3)
Requires-Dist: selenium (>=4.7.2)
Requires-Dist: webdriver-manager (>=3.8.5)

# Webarchiver
*Version: 0.19.0*

A Python tool for taking full-page screenshots of web pages without ads.

Supports batching: list multiple links in a text file, or pass comma-separated links on the command line.
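
To illustrate the two input styles, here is a minimal sketch that merges URLs from a text file and from a comma-separated string. It assumes the file holds one URL per line, which this README does not state. `collect_urls` is a hypothetical helper written for this example, not part of webarchiver's API.

```python
from pathlib import Path
from typing import List, Optional


def collect_urls(links: Optional[str] = None, file_path: Optional[str] = None) -> List[str]:
    """Merge URLs from a text file and a comma-separated string (hypothetical helper)."""
    urls: List[str] = []
    if file_path:
        # Assumption: one URL per line; blank lines are skipped.
        for line in Path(file_path).read_text(encoding="utf-8").splitlines():
            if line.strip():
                urls.append(line.strip())
    if links:
        # Same shape as the value passed to --links: "URL1,URL2,URL3".
        urls.extend(part.strip() for part in links.split(",") if part.strip())
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(urls))
```

Deduplicating here is a design choice for the sketch; whether webarchiver itself removes duplicate links is not documented in this README.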

### Requirements:

One of the following browsers, or a Selenoid server:

- Chrome/Chromium browser
- Firefox
- Selenoid Server

### Usage:
| Short Flag | Long Flag    | Description                                                |
|------------|--------------|------------------------------------------------------------|
| -h         | --help       | See Usage                                                  |
| -b         | --browser    | Specify browser: Chrome / Firefox / Selenoid               |
| -c         | --clean      | Convert mobile sites to regular sites                      |
| -d         | --directory  | Location where the images will be saved                    |
|            | --dpi        | DPI for the image                                          |
| -e         | --executor   | Execution environment: Local / Selenoid Host\|Selenoid URL |
| -f         | --file       | Text file to read the URL(s) from                          |
| -l         | --links      | Comma separated URL(s)                                     |
| -i         | --image-type | Save images as PNG or JPEG                                 |
| -p         | --processes  | Number of processes to run concurrently                    |
| -s         | --scrape     | Scrape URL(s) by Downloading                               |
| -u         | --url-filter | Filter URL(s) that contain this string                     |
| -z         | --zoom       | The zoom to use on the browser                             |


### Example:
```bash
webarchiver -c -f <links_file.txt> -l "<URL1,URL2,URL3>" -i 'jpeg' -d ~/Downloads -z 100 --dpi 1 --browser "Firefox"
```

```bash
webarchiver -c -f <links_file.txt> -l "<URL1,URL2,URL3>" -i 'png' -d ~/Downloads -z 100 --dpi 1 --executor "selenoid|http://selenoid.com/wd/hub" --browser "Chrome"
```

```bash
webarchiver -s -f <links_file.txt> -l "<URL1,URL2,URL3>"
```
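
If you drive the tool from a Python script, a sketch like the following can assemble the argument list from the long flags in the usage table. `build_command` is a hypothetical helper written for this example, not something webarchiver provides, and the default values are taken from the examples above.

```python
from pathlib import Path
from typing import List, Sequence


def build_command(urls: Sequence[str], directory: str,
                  image_type: str = "png", browser: str = "Firefox") -> List[str]:
    """Assemble a webarchiver argument list (hypothetical helper)."""
    return [
        "webarchiver",
        "--links", ",".join(urls),  # comma-separated, as --links expects
        "--image-type", image_type,
        # Expand "~" here: no shell is involved when an argument list is used.
        "--directory", str(Path(directory).expanduser()),
        "--browser", browser,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which requires webarchiver and a supported browser to be installed. Passing a list rather than a single string avoids shell quoting issues with URLs that contain `&` or `?`.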

#### Install Instructions
Install the Python package from PyPI:

```bash
python -m pip install webarchiver
```
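
To confirm the install from Python, one option is to query the package metadata with the standard library's `importlib.metadata` (available since Python 3.8). `installed_version` is a hypothetical helper name used only for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "webarchiver") -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if webarchiver is installed, otherwise None.
print(installed_version())
```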

#### Build Instructions
Build the Python package and upload it to PyPI:

```bash
# Make the scripts executable
sudo chmod +x ./*.py
# Install the package locally
pip install .
# Build the wheel
python setup.py bdist_wheel --universal
# Upload to Test PyPI
twine upload --repository-url https://test.pypi.org/legacy/ dist/* --verbose -u "Username" -p "Password"
# Upload to production PyPI
twine upload dist/* --verbose -u "Username" -p "Password"
```


