Metadata-Version: 2.3
Name: s3pathlib
Version: 2.3.4
Summary: s3pathlib is a Python package that provides a Pythonic, object-oriented programming (OOP) interface for manipulating AWS S3 objects and directories. The API is similar to the pathlib standard library and intuitive to use.
License: Apache-2.0
Author: Sanhe Hu
Author-email: husanhe@email.com
Maintainer: Sanhe Hu
Maintainer-email: husanhe@email.com
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Provides-Extra: auto
Provides-Extra: dev
Provides-Extra: doc
Provides-Extra: test
Requires-Dist: Sphinx (>=7.4.7,<8.0.0) ; extra == "doc"
Requires-Dist: boto3 (>=1.33.0,<2.0.0)
Requires-Dist: boto3-stubs[s3,sts] (>=1.33.0,<2.0.0) ; extra == "dev"
Requires-Dist: boto_session_manager (>=1.5.1,<2.0.0)
Requires-Dist: build (>=1.2.1,<2.0.0) ; extra == "dev"
Requires-Dist: decorator (>=5.0.5,<6.0.0) ; extra == "test"
Requires-Dist: docfly (==3.0.0) ; extra == "doc"
Requires-Dist: func_args (>=0.1.1,<2.0.0)
Requires-Dist: furo (==2024.8.6) ; extra == "doc"
Requires-Dist: ipython (>=8.18.1,<8.19.0) ; extra == "doc"
Requires-Dist: iterproxy (>=0.3.1,<1.0.0)
Requires-Dist: moto (>=5.0.28,<6.0.0) ; extra == "test"
Requires-Dist: nbsphinx (>=0.8.12,<1.0.0) ; extra == "doc"
Requires-Dist: pathlib_mate (>=1.0.1,<2.0.0)
Requires-Dist: pygments (>=2.18.0,<3.0.0) ; extra == "doc"
Requires-Dist: pytest (>=8.2.2,<9.0.0) ; extra == "test"
Requires-Dist: pytest-cov (>=6.0.0,<7.0.0) ; extra == "test"
Requires-Dist: rich (>=13.8.1,<14.0.0) ; extra == "dev"
Requires-Dist: rich (>=13.8.1,<14.0.0) ; extra == "test"
Requires-Dist: rstobj (==1.2.1) ; extra == "doc"
Requires-Dist: smart_open (>=5.1.0,<8.0.0)
Requires-Dist: sphinx-copybutton (>=0.5.2,<1.0.0) ; extra == "doc"
Requires-Dist: sphinx-design (>=0.6.1,<1.0.0) ; extra == "doc"
Requires-Dist: sphinx-jinja (>=2.0.2,<3.0.0) ; extra == "doc"
Requires-Dist: twine (>=6.0.0,<7.0.0) ; extra == "dev"
Requires-Dist: wheel (>=0.45.0,<1.0.0) ; extra == "dev"
Project-URL: Changelog, https://github.com/MacHu-GWU/s3pathlib-project/blob/main/release-history.rst
Project-URL: Documentation, https://s3pathlib.readthedocs.io/en/latest/
Project-URL: Download, https://pypi.org/pypi/s3pathlib#files
Project-URL: Homepage, https://github.com/MacHu-GWU/s3pathlib-project
Project-URL: Issues, https://github.com/MacHu-GWU/s3pathlib-project/issues
Project-URL: Repository, https://github.com/MacHu-GWU/s3pathlib-project
Description-Content-Type: text/x-rst

.. image:: https://readthedocs.org/projects/s3pathlib/badge/?version=latest
    :target: https://s3pathlib.readthedocs.io/en/latest/
    :alt: Documentation Status

.. image:: https://github.com/MacHu-GWU/s3pathlib-project/actions/workflows/main.yml/badge.svg
    :target: https://github.com/MacHu-GWU/s3pathlib-project/actions?query=workflow:CI

.. image:: https://codecov.io/gh/MacHu-GWU/s3pathlib-project/branch/main/graph/badge.svg
    :target: https://codecov.io/gh/MacHu-GWU/s3pathlib-project

.. image:: https://img.shields.io/pypi/v/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/pypi/l/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/pypi/pyversions/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib
    
.. image:: https://img.shields.io/pypi/dm/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/badge/✍️_Release_History!--None.svg?style=social&logo=github
    :target: https://github.com/MacHu-GWU/s3pathlib-project/blob/main/release-history.rst

.. image:: https://img.shields.io/badge/⭐_Star_me_on_GitHub!--None.svg?style=social&logo=github
    :target: https://github.com/aws-samples/s3pathlib-project

------

.. image:: https://img.shields.io/badge/Link-API-blue.svg
    :target: https://s3pathlib.readthedocs.io/en/latest/py-modindex.html

.. image:: https://img.shields.io/badge/Link-Source_Code-blue.svg
    :target: https://s3pathlib.readthedocs.io/en/latest/py-modindex.html

.. image:: https://img.shields.io/badge/Link-Submit_Issue-blue.svg
    :target: https://github.com/aws-samples/s3pathlib-project/issues

.. image:: https://img.shields.io/badge/Link-Request_Feature-blue.svg
    :target: https://github.com/aws-samples/s3pathlib-project/issues

.. image:: https://img.shields.io/badge/Link-Download-blue.svg
    :target: https://pypi.org/pypi/s3pathlib#files


Welcome to ``s3pathlib`` Documentation
==============================================================================
`s3pathlib <https://s3pathlib.readthedocs.io/en/latest/>`_ is a Python package that offers an object-oriented programming (OOP) interface to work with AWS S3 objects and directories. Its API is designed to be similar to the standard library `pathlib <https://docs.python.org/3/library/pathlib.html>`_ and is user-friendly. The package also `supports versioning <https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html>`_ in AWS S3.

.. note::

    You may not be viewing the full documentation. The `full documentation is here <https://s3pathlib.readthedocs.io/en/latest/>`_.


Quick Start
------------------------------------------------------------------------------
.. note::

    A `comprehensive guide covering features and best practices is available here <https://s3pathlib.readthedocs.io/en/latest/#comprehensive-guide>`_.

**Import the library, declare an S3Path object**

.. code-block:: python

    # import
    >>> from s3pathlib import S3Path

    # construct from string, auto join parts
    >>> p = S3Path("bucket", "folder", "file.txt")
    # construct from S3 URI works too
    >>> p = S3Path("s3://bucket/folder/file.txt")
    # construct from S3 ARN works too
    >>> p = S3Path("arn:aws:s3:::bucket/folder/file.txt")
    >>> p.bucket
    'bucket'
    >>> p.key
    'folder/file.txt'
    >>> p.uri
    's3://bucket/folder/file.txt'
    >>> p.console_url # click to preview it in AWS console
    'https://s3.console.aws.amazon.com/s3/object/bucket?prefix=folder/file.txt'
    >>> p.arn
    'arn:aws:s3:::bucket/folder/file.txt'
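
The three constructor forms above all resolve to the same bucket and key. As a rough illustration of that mapping (not s3pathlib's actual implementation), the sketch below splits an S3 URI or ARN with plain string handling. ``split_s3_location`` is a hypothetical helper name introduced only for this example.

.. code-block:: python

    def split_s3_location(location: str) -> tuple[str, str]:
        """Return (bucket, key) for an ``s3://`` URI or an ``arn:aws:s3:::`` ARN."""
        for prefix in ("s3://", "arn:aws:s3:::"):
            if location.startswith(prefix):
                # Everything before the first "/" is the bucket, the rest is the key.
                bucket, _, key = location[len(prefix):].partition("/")
                return bucket, key
        raise ValueError(f"not an S3 URI or ARN: {location!r}")


    uri_parts = split_s3_location("s3://bucket/folder/file.txt")
    arn_parts = split_s3_location("arn:aws:s3:::bucket/folder/file.txt")
    print(uri_parts)  # ('bucket', 'folder/file.txt')
    print(arn_parts)  # ('bucket', 'folder/file.txt')

In practice you would rely on ``S3Path`` itself for this; the sketch only shows why the URI and ARN forms are interchangeable.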

**Talk to AWS S3 and get some information**

.. code-block:: python

    # s3pathlib maintains a "context" object that holds the AWS authentication information;
    # you just need to build your own boto session object and attach it to the context
    >>> import boto3
    >>> from s3pathlib import context
    >>> context.attach_boto_session(
    ...     boto3.session.Session(
    ...         region_name="us-east-1",
    ...         profile_name="my_aws_profile",
    ...     )
    ... )

    >>> p = S3Path("bucket", "folder", "file.txt")
    >>> p.write_text("a lot of data ...")
    >>> p.etag
    '3e20b77868d1a39a587e280b99cec4a8'
    >>> p.size
    56789000
    >>> p.size_for_human
    '54.16 MB'

    # folders work too, you just need a trailing "/" to mark the path as a folder
    >>> p = S3Path("bucket", "datalake/")
    >>> p.count_objects()
    7164 # number of files under this prefix
    >>> p.calculate_total_size()
    (7164, 236483701963) # 7164 objects, 220.24 GB
    >>> p.calculate_total_size(for_human=True)
    (7164, '220.24 GB') # 7164 objects, 220.24 GB
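
The folder total above is consistent with binary units (1 KB = 1024 bytes). Here is a minimal sketch of that conversion, assuming two-decimal rounding. ``format_size`` is a hypothetical helper, not the function s3pathlib uses internally.

.. code-block:: python

    def format_size(n_bytes: int) -> str:
        """Format a byte count with binary units (1 KB = 1024 B), two decimals."""
        size = float(n_bytes)
        for unit in ("B", "KB", "MB", "GB", "TB"):
            # Stop at the first unit that keeps the number below 1024,
            # or at the largest unit this sketch knows about.
            if size < 1024 or unit == "TB":
                return f"{size:.2f} {unit}"
            size /= 1024


    print(format_size(236483701963))  # 220.24 GB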

**Manipulate Folder in S3**

The native S3 write APIs (operations that change the state of S3) work only at the object level, and the `list_objects <https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects_v2>`_ API returns at most 1000 objects per call. Manipulating objects recursively therefore takes extra effort. ``s3pathlib`` **CAN SAVE YOUR LIFE**.

.. code-block:: python

    # declare an S3 folder
    >>> p = S3Path("bucket", "github", "repos", "my-repo/")

    # upload all Python files from /my-repo to s3://bucket/github/repos/my-repo/
    >>> p.upload_dir("/my-repo", pattern="**/*.py", overwrite=False)

    # copy the entire S3 folder to another S3 folder
    >>> p2 = S3Path("bucket", "github", "repos", "another-repo/")
    >>> p.copy_to(p2, overwrite=True)

    # delete all objects in the folder, recursively, to clean up your test bucket
    >>> p.delete()
    >>> p2.delete()
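
To make the 1000-object page limit concrete, the sketch below walks a paginated listing with a continuation token, which is the loop you would otherwise write by hand. ``fake_list_page`` is a hypothetical in-memory stand-in for a real list call so that the example runs without AWS credentials. It is not part of boto3 or s3pathlib.

.. code-block:: python

    from typing import Iterator, Optional

    PAGE_SIZE = 1000  # S3 list calls return at most 1000 keys per response
    ALL_KEYS = [f"datalake/part-{i:05d}.csv" for i in range(2500)]


    def fake_list_page(token: Optional[int] = None) -> tuple[list[str], Optional[int]]:
        """Hypothetical stand-in for one list call: returns (keys, next_token)."""
        start = token or 0
        end = start + PAGE_SIZE
        next_token = end if end < len(ALL_KEYS) else None
        return ALL_KEYS[start:end], next_token


    def iter_all_keys() -> Iterator[str]:
        """Follow continuation tokens until the listing is exhausted."""
        token = None
        while True:
            keys, token = fake_list_page(token)
            yield from keys
            if token is None:
                break


    all_keys = list(iter_all_keys())
    print(len(all_keys))  # 2500, gathered from three pages

Recursive operations such as copying or deleting a folder need this loop plus one request per object, which is the bookkeeping the folder methods above take care of.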

**S3 Path Filter**

Ever wanted to filter S3 objects by their attributes, such as dirname, basename, file extension, etag, size, or modified time? It should be simple in Python:

.. code-block:: python

    >>> s3bkt = S3Path("bucket") # assume you have lots of files in this bucket
    >>> iterproxy = s3bkt.iter_objects().filter(
    ...     S3Path.size >= 10_000_000, S3Path.ext == ".csv" # add filter
    ... )

    >>> iterproxy.one() # fetch one
    S3Path('s3://bucket/larger-than-10MB-1.csv')

    >>> iterproxy.many(3) # fetch three
    [
        S3Path('s3://bucket/larger-than-10MB-1.csv'),
        S3Path('s3://bucket/larger-than-10MB-2.csv'),
        S3Path('s3://bucket/larger-than-10MB-3.csv'),
    ]

    >>> for p in iterproxy: # iterate over the rest
    ...     print(p)
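
Conceptually, a filter chain like this can be evaluated lazily: objects are tested one at a time as they are pulled from the listing, so fetching a few matches does not require scanning the whole bucket. The sketch below imitates that idea with plain generators. ``ObjectInfo`` and the sample data are hypothetical and stand in for real S3 metadata. This is not how ``iterproxy`` is implemented.

.. code-block:: python

    from itertools import islice
    from typing import Iterator, NamedTuple


    class ObjectInfo(NamedTuple):
        key: str
        size: int


    def iter_objects() -> Iterator[ObjectInfo]:
        """Hypothetical listing: yields small and large files of mixed types."""
        for i in range(1, 101):
            yield ObjectInfo(f"small-{i}.csv", 1_000)
            yield ObjectInfo(f"larger-than-10MB-{i}.csv", 10_000_000 + i)
            yield ObjectInfo(f"larger-than-10MB-{i}.json", 20_000_000)


    # A generator expression applies the predicates lazily, one object at a time.
    matches = (
        obj
        for obj in iter_objects()
        if obj.size >= 10_000_000 and obj.key.endswith(".csv")
    )

    first = next(matches)                  # pull a single match
    next_three = list(islice(matches, 3))  # pull the next three matches
    rest = list(matches)                   # consume whatever is left
    print(first.key, len(next_three), len(rest))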


**File Like Object for Simple IO**

``S3Path`` is a file-like object. It supports ``open`` and the context manager syntax out of the box. Here are a few highlights:

.. code-block:: python

    # Stream big file by line
    >>> p = S3Path("bucket", "log.txt")
    >>> with p.open("r") as f:
    ...     for line in f:
    ...         print(line)  # do whatever you want with each line

    # JSON io
    >>> import json
    >>> p = S3Path("bucket", "config.json")
    >>> with p.open("w") as f:
    ...     json.dump({"password": "mypass"}, f)

    # pandas IO
    >>> import pandas as pd
    >>> p = S3Path("bucket", "dataset.csv")
    >>> df = pd.DataFrame(...)
    >>> with p.open("w") as f:
    ...     df.to_csv(f)
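
Because these handles behave like ordinary text files, code written against the file protocol works unchanged. The sketch below uses ``io.StringIO`` as a local stand-in for an opened S3 object (an assumption made so the example runs offline) and round-trips JSON through it. ``save_config`` and ``load_config`` are hypothetical helper names.

.. code-block:: python

    import io
    import json


    def save_config(f, config: dict) -> None:
        """Write config as JSON to any writable text file-like object."""
        json.dump(config, f)


    def load_config(f) -> dict:
        """Read JSON back from any readable text file-like object."""
        return json.load(f)


    buffer = io.StringIO()  # stand-in for a handle returned by ``open("w")``
    save_config(buffer, {"region": "us-east-1"})
    buffer.seek(0)  # rewind so the same buffer can be read back
    loaded = load_config(buffer)
    print(loaded)  # {'region': 'us-east-1'}

Keeping the helpers ignorant of where the handle came from means the same functions can be tested locally and pointed at S3 later.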

Now that you have a basic understanding of s3pathlib, let's read the `full document <https://s3pathlib.readthedocs.io/en/latest/#comprehensive-guide>`_ to explore its capabilities in greater depth.


Getting Help
------------------------------------------------------------------------------
Please use the ``python-s3pathlib`` tag on Stack Overflow to get help.

Submit an ``I want help`` issue ticket on `GitHub Issues <https://github.com/aws-samples/s3pathlib-project/issues/new/choose>`_.


Contributing
------------------------------------------------------------------------------
Please see the `Contribution Guidelines <https://github.com/aws-samples/s3pathlib-project/blob/main/CONTRIBUTING.rst>`_.


Copyright
------------------------------------------------------------------------------
s3pathlib is an open source project. See the `LICENSE <https://github.com/aws-samples/s3pathlib-project/blob/main/LICENSE>`_ file for more information.

