Metadata-Version: 2.1
Name: s3pathlib
Version: 2.2.1
Summary: Objective Oriented Interface for AWS S3, similar to pathlib.
Home-page: https://github.com/aws-samples/s3pathlib-project
Download-URL: https://pypi.python.org/pypi/s3pathlib/2.2.1#downloads
Author: Sanhe Hu
Author-email: husanhe@gmail.com
Maintainer: Sanhe Hu
Maintainer-email: sanhehu@amazon.com
License: Apache License 2.0
Platform: Windows
Platform: MacOS
Platform: Unix
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: MacOS
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.7
License-File: LICENSE
License-File: NOTICE
License-File: AUTHORS.rst
Requires-Dist: boto3
Requires-Dist: iterproxy<1.0.0,>=0.2.1
Requires-Dist: func_args<1.0.0,>=0.1.1
Requires-Dist: pathlib_mate<2.0.0,>=1.0.1
Requires-Dist: boto_session_manager<2.0.0,>=1.5.1
Requires-Dist: smart_open<7.0.0,>=5.1.0
Requires-Dist: cached-property>=1.5.2; python_version < "3.8"
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: pytest-cov; extra == "tests"
Requires-Dist: decorator; extra == "tests"
Requires-Dist: moto<5.0.0,>=4.2.14; extra == "tests"
Requires-Dist: rich; extra == "tests"
Provides-Extra: docs
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: sphinx-inline-tabs; extra == "docs"
Requires-Dist: sphinx-jinja; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Requires-Dist: sphinx_design; extra == "docs"
Requires-Dist: nbsphinx; extra == "docs"
Requires-Dist: readthedocs-sphinx-search; extra == "docs"
Requires-Dist: rstobj==0.0.7; extra == "docs"
Requires-Dist: docfly==1.0.2; extra == "docs"
Requires-Dist: furo; extra == "docs"
Requires-Dist: IPython; extra == "docs"

.. image:: https://readthedocs.org/projects/s3pathlib/badge/?version=latest
    :target: https://s3pathlib.readthedocs.io/en/latest/
    :alt: Documentation Status

.. image:: https://github.com/aws-samples/s3pathlib-project/workflows/CI/badge.svg
    :target: https://github.com/aws-samples/s3pathlib-project/actions?query=workflow:CI

.. image:: https://img.shields.io/badge/codecov-100%25-brightgreen
    :target: https://github.com/aws-samples/s3pathlib-project/actions?query=workflow:CI

.. image:: https://img.shields.io/pypi/v/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/pypi/l/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/pypi/pyversions/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib
    
.. image:: https://img.shields.io/pypi/dm/s3pathlib.svg
    :target: https://pypi.python.org/pypi/s3pathlib

.. image:: https://img.shields.io/badge/STAR_Me_on_GitHub!--None.svg?style=social
    :target: https://github.com/aws-samples/s3pathlib-project

------

.. image:: https://img.shields.io/badge/Link-Document-orange.svg
    :target: https://s3pathlib.readthedocs.io/en/latest/

.. image:: https://img.shields.io/badge/Link-API-blue.svg
    :target: https://s3pathlib.readthedocs.io/en/latest/py-modindex.html

.. image:: https://img.shields.io/badge/Link-Source_Code-blue.svg
    :target: https://s3pathlib.readthedocs.io/en/latest/py-modindex.html

.. image:: https://img.shields.io/badge/Link-Submit_Issue-blue.svg
    :target: https://github.com/aws-samples/s3pathlib-project/issues

.. image:: https://img.shields.io/badge/Link-Request_Feature-blue.svg
    :target: https://github.com/aws-samples/s3pathlib-project/issues

.. image:: https://img.shields.io/badge/Link-Download-blue.svg
    :target: https://pypi.org/pypi/s3pathlib#files


Welcome to ``s3pathlib`` Documentation
==============================================================================
`s3pathlib <https://s3pathlib.readthedocs.io/en/latest/>`_ is a Python package that offers an object-oriented programming (OOP) interface to work with AWS S3 objects and directories. Its API is designed to be similar to the standard library `pathlib <https://docs.python.org/3/library/pathlib.html>`_ and is user-friendly. The package also `supports versioning <https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html>`_ in AWS S3.

.. note::

    You may not be viewing the full document, `FULL DOCUMENT IS HERE <https://s3pathlib.readthedocs.io/en/latest/>`_


Quick Start
------------------------------------------------------------------------------
.. note::

    `COMPREHENSIVE DOCUMENT guide / features / best practice can be found at HERE <https://s3pathlib.readthedocs.io/en/latest/#comprehensive-guide>`_

**Import the library, declare an S3Path object**

.. code-block:: python

    # import
    >>> from s3pathlib import S3Path

    # construct from string, auto join parts
    >>> p = S3Path("bucket", "folder", "file.txt")
    # construct from S3 URI works too
    >>> p = S3Path("s3://bucket/folder/file.txt")
    # construct from S3 ARN works too
    >>> p = S3Path("arn:aws:s3:::bucket/folder/file.txt")
    >>> p.bucket
    'bucket'
    >>> p.key
    'folder/file.txt'
    >>> p.uri
    's3://bucket/folder/file.txt'
    >>> p.console_url # click to preview it in AWS console
    'https://s3.console.aws.amazon.com/s3/object/bucket?prefix=folder/file.txt'
    >>> p.arn
    'arn:aws:s3:::bucket/folder/file.txt'

**Talk to AWS S3 and get some information**

.. code-block:: python

    # s3pathlib maintains a "context" object that holds the AWS authentication information
    # you just need to build your own boto session object and attach to it
    >>> import boto3
    >>> from s3pathlib import context
    >>> context.attach_boto_session(
    ...     boto3.session.Session(
    ...         region_name="us-east-1",
    ...         profile_name="my_aws_profile",
    ...     )
    ... )

    >>> p = S3Path("bucket", "folder", "file.txt")
    >>> p.write_text("a lot of data ...")
    >>> p.etag
    '3e20b77868d1a39a587e280b99cec4a8'
    >>> p.size
    56789000
    >>> p.size_for_human
    '51.16 MB'

    # folder works too, you just need to use a tailing "/" to identify that
    >>> p = S3Path("bucket", "datalake/")
    >>> p.count_objects()
    7164 # number of files under this prefix
    >>> p.calculate_total_size()
    (7164, 236483701963) # 7164 objects, 220.24 GB
    >>> p.calculate_total_size(for_human=True)
    (7164, '220.24 GB') # 7164 objects, 220.24 GB

**Manipulate Folder in S3**

Native S3 Write API (those operation that change the state of S3) only operate on object level. And the `list_objects <https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects_v2>`_ API returns 1000 objects at a time. You need additional effort to manipulate objects recursively. ``s3pathlib`` **CAN SAVE YOUR LIFE**

.. code-block:: python

    # create a S3 folder
    >>> p = S3Path("bucket", "github", "repos", "my-repo/")

    # upload all python file from /my-github-repo to s3://bucket/github/repos/my-repo/
    >>> p.upload_dir("/my-repo", pattern="**/*.py", overwrite=False)

    # copy entire s3 folder to another s3 folder
    >>> p2 = S3Path("bucket", "github", "repos", "another-repo/")
    >>> p1.copy_to(p2, overwrite=True)

    # delete all objects in the folder, recursively, to clean up your test bucket
    >>> p.delete()
    >>> p2.delete()

**S3 Path Filter**

Ever think of filter S3 object by it's attributes like: dirname, basename, file extension, etag, size, modified time? It is supposed to be simple in Python:

.. code-block:: python

    >>> s3bkt = S3Path("bucket") # assume you have a lots of files in this bucket
    >>> iterproxy = s3bkt.iter_objects().filter(
    ...     S3Path.size >= 10_000_000, S3Path.ext == ".csv" # add filter
    ... )

    >>> iterproxy.one() # fetch one
    S3Path('s3://bucket/larger-than-10MB-1.csv')

    >>> iterproxy.many(3) # fetch three
    [
        S3Path('s3://bucket/larger-than-10MB-1.csv'),
        S3Path('s3://bucket/larger-than-10MB-2.csv'),
        S3Path('s3://bucket/larger-than-10MB-3.csv'),
    ]

    >>> for p in iterproxy: # iter the rest
    ...     print(p)


**File Like Object for Simple IO**

``S3Path`` is file-like object. It support ``open`` and context manager syntax out of the box. Here are only some highlight examples:

.. code-block:: python

    # Stream big file by line
    >>> p = S3Path("bucket", "log.txt")
    >>> with p.open("r") as f:
    ...     for line in f:
    ...         do what every you want

    # JSON io
    >>> import json
    >>> p = S3Path("bucket", "config.json")
    >>> with p.open("w") as f:
    ...     json.dump({"password": "mypass"}, f)

    # pandas IO
    >>> import pandas as pd
    >>> p = S3Path("bucket", "dataset.csv")
    >>> df = pd.DataFrame(...)
    >>> with p.open("w") as f:
    ...     df.to_csv(f)

Now that you have a basic understanding of s3pathlib, let's read the `full document <https://s3pathlib.readthedocs.io/en/latest/#comprehensive-guide>`_ to explore its capabilities in greater depth.


Getting Help
------------------------------------------------------------------------------
Please use the ``python-s3pathlib`` tag on Stack Overflow to get help.

Submit a ``I want help`` issue tickets on `GitHub Issues <https://github.com/aws-samples/s3pathlib-project/issues/new/choose>`_


Contributing
------------------------------------------------------------------------------
Please see the `Contribution Guidelines <https://github.com/aws-samples/s3pathlib-project/blob/main/CONTRIBUTING.rst>`_.


Copyright
------------------------------------------------------------------------------
s3pathlib is an open source project. See the `LICENSE <https://github.com/aws-samples/s3pathlib-project/blob/main/LICENSE>`_ file for more information.
