Metadata-Version: 2.3
Name: datamarket
Version: 0.8.7
Summary: Utilities that integrate advanced scraping knowledge into just one library.
License: GPL-3.0-or-later
Author: DataMarket
Author-email: techsupport@datamarket.es
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Provides-Extra: aws
Provides-Extra: azure-storage-blob
Provides-Extra: boto3
Provides-Extra: chompjs
Provides-Extra: click
Provides-Extra: clickhouse-driver
Provides-Extra: datetime
Provides-Extra: demjson3
Provides-Extra: dnspython
Provides-Extra: drive
Provides-Extra: duckduckgo-search
Provides-Extra: fake-useragent
Provides-Extra: geoalchemy2
Provides-Extra: geopandas
Provides-Extra: geopy
Provides-Extra: google-api-python-client
Provides-Extra: google-auth-httplib2
Provides-Extra: google-auth-oauthlib
Provides-Extra: html2text
Provides-Extra: httpx
Provides-Extra: json5
Provides-Extra: lxml
Provides-Extra: nodriver
Provides-Extra: openpyxl
Provides-Extra: pandas
Provides-Extra: pandera
Provides-Extra: peerdb
Provides-Extra: pillow
Provides-Extra: playwright
Provides-Extra: playwright-stealth
Provides-Extra: proxy
Provides-Extra: pyarrow
Provides-Extra: pydrive2
Provides-Extra: pymupdf
Provides-Extra: pysocks
Provides-Extra: pyspark
Provides-Extra: pytest
Provides-Extra: rapidfuzz
Provides-Extra: retry
Provides-Extra: shapely
Provides-Extra: soda-core-mysql
Provides-Extra: soda-core-postgres
Provides-Extra: stem
Provides-Extra: tqdm
Provides-Extra: undetected-chromedriver
Provides-Extra: unidecode
Provides-Extra: xmltodict
Requires-Dist: SQLAlchemy (==2.0.36)
Requires-Dist: azure-storage-blob (==12.23.1) ; extra == "azure-storage-blob"
Requires-Dist: beautifulsoup4 (==4.12.3)
Requires-Dist: boto3 (==1.35.53) ; extra == "boto3" or extra == "aws" or extra == "peerdb"
Requires-Dist: chompjs (==1.3.0) ; extra == "chompjs"
Requires-Dist: click (==8.1.7) ; extra == "click"
Requires-Dist: clickhouse-driver (==0.2.9) ; extra == "clickhouse-driver" or extra == "peerdb"
Requires-Dist: croniter (==3.0.4)
Requires-Dist: datetime (==5.5) ; extra == "datetime"
Requires-Dist: demjson3 (==3.0.6) ; extra == "demjson3"
Requires-Dist: dnspython (==2.7.0) ; extra == "dnspython"
Requires-Dist: duckduckgo-search (==6.2.11b1) ; extra == "duckduckgo-search"
Requires-Dist: dynaconf (==3.2.6)
Requires-Dist: fake-useragent (==1.5.1) ; extra == "fake-useragent"
Requires-Dist: geoalchemy2 (==0.15.2) ; extra == "geoalchemy2"
Requires-Dist: geopandas (==1.0.1) ; extra == "geopandas"
Requires-Dist: geopy (==2.4.1) ; extra == "geopy"
Requires-Dist: google-api-python-client (==2.151.0) ; extra == "google-api-python-client"
Requires-Dist: google-auth-httplib2 (==0.2.0) ; extra == "google-auth-httplib2"
Requires-Dist: google-auth-oauthlib (==1.2.1) ; extra == "google-auth-oauthlib"
Requires-Dist: html2text (==2024.2.26) ; extra == "html2text"
Requires-Dist: httpx[http2] (==0.28.1) ; extra == "httpx"
Requires-Dist: jinja2 (==3.1.5)
Requires-Dist: json5 (==0.9.25) ; extra == "json5"
Requires-Dist: lxml[html-clean] (==5.3.0) ; extra == "lxml"
Requires-Dist: nodriver (==0.38.post1) ; extra == "nodriver"
Requires-Dist: openpyxl (==3.1.5) ; extra == "openpyxl"
Requires-Dist: pandas (==2.2.3) ; extra == "pandas"
Requires-Dist: pandera (==0.20.4) ; extra == "pandera"
Requires-Dist: pendulum (==3.0.0)
Requires-Dist: pillow (==11.0.0) ; extra == "pillow"
Requires-Dist: playwright (==1.47.0) ; extra == "playwright"
Requires-Dist: playwright-stealth (==1.0.6) ; extra == "playwright-stealth"
Requires-Dist: pre-commit (==4.0.1)
Requires-Dist: psycopg2-binary (==2.9.10)
Requires-Dist: pyarrow (==17.0.0) ; extra == "pyarrow"
Requires-Dist: pydrive2 (==1.20.0) ; extra == "pydrive2" or extra == "drive"
Requires-Dist: pymupdf (==1.24.13) ; extra == "pymupdf"
Requires-Dist: pysocks (==1.7.1) ; extra == "pysocks"
Requires-Dist: pyspark (==3.5.3) ; extra == "pyspark"
Requires-Dist: pytest (==8.3.3) ; extra == "pytest"
Requires-Dist: rapidfuzz (==3.10.1) ; extra == "rapidfuzz"
Requires-Dist: requests (==2.32.3)
Requires-Dist: retry (==0.9.2) ; extra == "retry"
Requires-Dist: shapely (==2.0.6) ; extra == "shapely"
Requires-Dist: soda-core-mysql (==3.4.4) ; extra == "soda-core-mysql"
Requires-Dist: soda-core-postgres (==3.4.1) ; extra == "soda-core-postgres"
Requires-Dist: stem (==1.8.2) ; extra == "stem" or extra == "proxy"
Requires-Dist: tenacity (==9.0.0)
Requires-Dist: tqdm (==4.66.6) ; extra == "tqdm"
Requires-Dist: typer (==0.12.5)
Requires-Dist: undetected-chromedriver (==3.5.5) ; extra == "undetected-chromedriver"
Requires-Dist: unidecode (==1.3.8) ; extra == "unidecode"
Requires-Dist: xmltodict (==0.14.2) ; extra == "xmltodict"
Project-URL: Documentation, https://github.com/Data-Market/datamarket
Project-URL: Homepage, https://datamarket.es
Project-URL: Repository, https://github.com/Data-Market/datamarket
Description-Content-Type: text/markdown

# DataMarket scraping core

------------------------------------------------------
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


Utilities that integrate advanced scraping knowledge into just one library.

## Installation

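To install this library in your Python environment:

`pip install datamarket`

Optional integrations are published as extras and are not installed by default. To include them, name the extras in brackets, for example `pip install "datamarket[proxy,drive]"`.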

## Documentation

This library provides functionality for the following areas:

- **Databases**: insert records and run queries through SQLAlchemy on any database it supports.
- **Proxies**: a wide range of functions to perform HTTP requests through custom proxies or the Tor network.
- **Tinybird**: a Python client for the Tinybird API.
- **Drive**: functions to authenticate with Google Drive and to upload or delete files.
- **FTP**: functions to authenticate with an FTP, SFTP or FTPS server and to upload or delete files.
- **Selenium**: wrappers around the main Selenium functions.
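
As an illustration of the proxies entry above, the following is a minimal sketch of how a proxy mapping for `requests` can be built, either for a custom proxy or for a local Tor client. It is not this library's API: `build_proxies`, the host name and the credentials are hypothetical names chosen for the example, and it assumes Tor is listening on its default SOCKS port, 9050.

```python
from typing import Optional
from urllib.parse import quote


def build_proxies(
    host: str,
    port: int,
    scheme: str = "http",
    user: Optional[str] = None,
    password: Optional[str] = None,
) -> dict:
    """Return a mapping in the shape `requests` accepts for its `proxies` argument.

    Hypothetical helper for illustration only; it is not part of datamarket.
    """
    auth = ""
    if user is not None:
        # Percent-encode credentials so characters such as "@" or ":" stay unambiguous.
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = f"{scheme}://{auth}{host}:{port}"
    # Route both plain and TLS traffic through the same proxy.
    return {"http": url, "https": url}


# A custom authenticated HTTP proxy (hypothetical host and credentials).
custom = build_proxies("proxy.example.com", 8080, user="scraper", password="p@ss")

# A local Tor client (assumed default SOCKS port 9050).
# "socks5h" asks the proxy to resolve hostnames, so DNS lookups also go through Tor.
tor = build_proxies("127.0.0.1", 9050, scheme="socks5h")
```

Either mapping can then be passed to a request, for example `requests.get(url, proxies=tor)`. SOCKS schemes additionally require PySocks, which this package lists under the `pysocks` extra.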

