Metadata-Version: 2.1
Name: datamarket
Version: 0.7.7
Summary: Utilities that integrate advanced scraping knowledge into just one library.
Home-page: https://datamarket.es
License: GPL-3.0-or-later
Author: DataMarket
Author-email: techsupport@datamarket.es
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Provides-Extra: aws
Provides-Extra: boto3
Provides-Extra: chompjs
Provides-Extra: click
Provides-Extra: clickhouse-driver
Provides-Extra: croniter
Provides-Extra: demjson3
Provides-Extra: drive
Provides-Extra: duckduckgo-search
Provides-Extra: fake-useragent
Provides-Extra: geopandas
Provides-Extra: geopy
Provides-Extra: html2text
Provides-Extra: json5
Provides-Extra: lxml
Provides-Extra: nodriver
Provides-Extra: pandas
Provides-Extra: pandera
Provides-Extra: peerdb
Provides-Extra: pendulum
Provides-Extra: pillow
Provides-Extra: playwright
Provides-Extra: playwright-stealth
Provides-Extra: proxy
Provides-Extra: pyarrow
Provides-Extra: pydrive2
Provides-Extra: pymupdf
Provides-Extra: pysocks
Provides-Extra: pyspark
Provides-Extra: pytest
Provides-Extra: rapidfuzz
Provides-Extra: retry
Provides-Extra: shapely
Provides-Extra: soda-core-postgres
Provides-Extra: stem
Provides-Extra: tqdm
Provides-Extra: undetected-chromedriver
Provides-Extra: unidecode
Provides-Extra: xmltodict
Requires-Dist: SQLAlchemy (==2.0.36)
Requires-Dist: beautifulsoup4 (==4.12.3)
Requires-Dist: boto3 (==1.35.53) ; extra == "boto3" or extra == "aws" or extra == "peerdb"
Requires-Dist: chompjs (==1.3.0) ; extra == "chompjs"
Requires-Dist: click (==8.1.7) ; extra == "click"
Requires-Dist: clickhouse-driver (==0.2.9) ; extra == "clickhouse-driver" or extra == "peerdb"
Requires-Dist: croniter (==3.0.4) ; extra == "croniter"
Requires-Dist: demjson3 (==3.0.6) ; extra == "demjson3"
Requires-Dist: duckduckgo-search (==6.3.3) ; extra == "duckduckgo-search"
Requires-Dist: fake-useragent (==1.5.1) ; extra == "fake-useragent"
Requires-Dist: geopandas (==1.0.1) ; extra == "geopandas"
Requires-Dist: geopy (==2.4.1) ; extra == "geopy"
Requires-Dist: html2text (==2024.2.26) ; extra == "html2text"
Requires-Dist: json5 (==0.9.25) ; extra == "json5"
Requires-Dist: lxml (==5.3.0) ; extra == "lxml"
Requires-Dist: nodriver (==0.37) ; extra == "nodriver"
Requires-Dist: pandas (==2.2.3) ; extra == "pandas"
Requires-Dist: pandera (==0.20.4) ; extra == "pandera"
Requires-Dist: pendulum (==3.0.0) ; extra == "pendulum"
Requires-Dist: pillow (==11.0.0) ; extra == "pillow"
Requires-Dist: playwright (==1.47.0) ; extra == "playwright"
Requires-Dist: playwright-stealth (==1.0.6) ; extra == "playwright-stealth"
Requires-Dist: pre-commit (==4.0.1)
Requires-Dist: psycopg2-binary (==2.9.10)
Requires-Dist: pyarrow (==17.0.0) ; extra == "pyarrow"
Requires-Dist: pydrive2 (==1.20.0) ; extra == "pydrive2" or extra == "drive"
Requires-Dist: pymupdf (==1.24.13) ; extra == "pymupdf"
Requires-Dist: pysocks (==1.7.1) ; extra == "pysocks"
Requires-Dist: pyspark (==3.5.3) ; extra == "pyspark"
Requires-Dist: pytest (==8.3.3) ; extra == "pytest"
Requires-Dist: rapidfuzz (==3.10.1) ; extra == "rapidfuzz"
Requires-Dist: requests (==2.32.3)
Requires-Dist: retry (==0.9.2) ; extra == "retry"
Requires-Dist: shapely (==2.0.6) ; extra == "shapely"
Requires-Dist: soda-core-postgres (==3.4.1) ; extra == "soda-core-postgres"
Requires-Dist: stem (==1.8.2) ; extra == "stem" or extra == "proxy"
Requires-Dist: tenacity (==9.0.0)
Requires-Dist: tqdm (==4.66.6) ; extra == "tqdm"
Requires-Dist: typer (==0.12.5)
Requires-Dist: undetected-chromedriver (==3.5.5) ; extra == "undetected-chromedriver"
Requires-Dist: unidecode (==1.3.8) ; extra == "unidecode"
Requires-Dist: xmltodict (==0.14.2) ; extra == "xmltodict"
Project-URL: Documentation, https://github.com/Data-Market/datamarket
Project-URL: Repository, https://github.com/Data-Market/datamarket
Description-Content-Type: text/markdown

# DataMarket scraping core

------------------------------------------------------
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


Utilities that integrate advanced scraping knowledge into a single library.

## Installation

To install this library in your Python environment:

`pip install datamarket`
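
The package metadata declares optional dependency groups ("extras") such as `proxy` and `drive`, so a command along the lines of `pip install "datamarket[proxy,drive]"` should pull in the matching optional packages. The sketch below uses only the standard library to list the extras of an installed distribution and to build such a requirement string. The helper names `available_extras` and `requirement_with_extras` are hypothetical and are not part of this library.

```python
from importlib import metadata


def available_extras(dist_name: str = "datamarket") -> list[str]:
    """Return the extras declared by an installed distribution, or [] if it is absent."""
    try:
        info = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # "Provides-Extra" can appear many times, so collect every occurrence.
    return sorted(info.get_all("Provides-Extra") or [])


def requirement_with_extras(dist_name: str, extras: list[str]) -> str:
    """Build a pip requirement string such as 'datamarket[drive,proxy]'."""
    if not extras:
        return dist_name
    # Deduplicate and sort so the result is stable.
    return f"{dist_name}[{','.join(sorted(set(extras)))}]"


if __name__ == "__main__":
    # Hypothetical helper: lists the extras if datamarket is installed.
    print(available_extras())
    print(requirement_with_extras("datamarket", ["proxy", "drive"]))
```

The string returned by `requirement_with_extras` can be passed to `pip install` (quoted, so the shell does not interpret the square brackets).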

## Documentation

This library provides built-in functionality for the following areas:

- **Databases**: insert records and run queries through SQLAlchemy against any database it supports.
- **Proxies**: a wide range of functions to perform HTTP requests through custom proxies or the Tor network.
- **Tinybird**: a Python client for the Tinybird API.
- **Drive**: functions to authenticate with Google Drive and to upload or delete files.
- **FTP**: functions to authenticate with an FTP, SFTP or FTPS server and to upload or delete files.
- **Selenium**: a wrapper for the main Selenium functions.

