Metadata-Version: 2.4
Name: ibis-framework
Version: 10.5.1.dev21
Summary: The portable Python dataframe library
Project-URL: Homepage, https://ibis-project.org
Project-URL: Chat, https://ibis-project.zulipchat.com
Project-URL: Repository, https://github.com/ibis-project/ibis
Project-URL: Documentation, https://ibis-project.org
Project-URL: Issues, https://github.com/ibis-project/ibis/issues
Project-URL: Changelog, https://ibis-project.org/release_notes
Author-email: Ibis Maintainers <maintainers@ibis-project.org>
Maintainer-email: Ibis Maintainers <maintainers@ibis-project.org>
License: Apache-2.0
License-File: LICENSE.txt
Keywords: bigquery,clickhouse,database,datafusion,duckdb,expressions,impala,mssql,mysql,pandas,polars,postgresql,pyarrow,pyspark,snowflake,sql,sqlite,trino
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: SQL
Classifier: Topic :: Database :: Front-Ends
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Software Development :: User Interfaces
Requires-Python: >=3.9
Requires-Dist: atpublic>=2.3
Requires-Dist: parsy>=2
Requires-Dist: python-dateutil>=2.8.2
Requires-Dist: sqlglot>=23.4
Requires-Dist: toolz>=0.11
Requires-Dist: typing-extensions>=4.3.0
Requires-Dist: tzdata>=2022.7
Provides-Extra: athena
Requires-Dist: fsspec[s3]; extra == 'athena'
Requires-Dist: numpy<3,>=1.23.2; extra == 'athena'
Requires-Dist: packaging>=21.3; extra == 'athena'
Requires-Dist: pandas<3,>=1.5.3; extra == 'athena'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'athena'
Requires-Dist: pyarrow>=10.0.1; extra == 'athena'
Requires-Dist: pyathena[arrow,pandas]>=3.11.0; extra == 'athena'
Requires-Dist: rich>=12.4.4; extra == 'athena'
Provides-Extra: bigquery
Requires-Dist: db-dtypes>=0.3; extra == 'bigquery'
Requires-Dist: google-cloud-bigquery-storage>=2; extra == 'bigquery'
Requires-Dist: google-cloud-bigquery>=3; extra == 'bigquery'
Requires-Dist: numpy<3,>=1.23.2; extra == 'bigquery'
Requires-Dist: pandas-gbq>=0.26.1; extra == 'bigquery'
Requires-Dist: pandas<3,>=1.5.3; extra == 'bigquery'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'bigquery'
Requires-Dist: pyarrow>=10.0.1; extra == 'bigquery'
Requires-Dist: pydata-google-auth>=1.4.0; extra == 'bigquery'
Requires-Dist: rich>=12.4.4; extra == 'bigquery'
Provides-Extra: clickhouse
Requires-Dist: clickhouse-connect[arrow,numpy,pandas]>=0.5.23; extra == 'clickhouse'
Requires-Dist: numpy<3,>=1.23.2; extra == 'clickhouse'
Requires-Dist: pandas<3,>=1.5.3; extra == 'clickhouse'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'clickhouse'
Requires-Dist: pyarrow>=10.0.1; extra == 'clickhouse'
Requires-Dist: rich>=12.4.4; extra == 'clickhouse'
Provides-Extra: databricks
Requires-Dist: databricks-sql-connector-core>=4; extra == 'databricks'
Requires-Dist: numpy<3,>=1.23.2; extra == 'databricks'
Requires-Dist: pandas<3,>=1.5.3; extra == 'databricks'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'databricks'
Requires-Dist: pyarrow>=10.0.1; extra == 'databricks'
Requires-Dist: rich>=12.4.4; extra == 'databricks'
Provides-Extra: datafusion
Requires-Dist: datafusion>=0.6; extra == 'datafusion'
Requires-Dist: numpy<3,>=1.23.2; extra == 'datafusion'
Requires-Dist: pandas<3,>=1.5.3; extra == 'datafusion'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'datafusion'
Requires-Dist: pyarrow>=10.0.1; extra == 'datafusion'
Requires-Dist: rich>=12.4.4; extra == 'datafusion'
Provides-Extra: decompiler
Requires-Dist: black>=22.1.0; extra == 'decompiler'
Provides-Extra: deltalake
Requires-Dist: deltalake>=0.9.0; extra == 'deltalake'
Provides-Extra: druid
Requires-Dist: numpy<3,>=1.23.2; extra == 'druid'
Requires-Dist: pandas<3,>=1.5.3; extra == 'druid'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'druid'
Requires-Dist: pyarrow>=10.0.1; extra == 'druid'
Requires-Dist: pydruid>=0.6.7; extra == 'druid'
Requires-Dist: rich>=12.4.4; extra == 'druid'
Provides-Extra: duckdb
Requires-Dist: duckdb>=0.10.3; extra == 'duckdb'
Requires-Dist: numpy<3,>=1.23.2; extra == 'duckdb'
Requires-Dist: packaging>=21.3; extra == 'duckdb'
Requires-Dist: pandas<3,>=1.5.3; extra == 'duckdb'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'duckdb'
Requires-Dist: pyarrow>=10.0.1; extra == 'duckdb'
Requires-Dist: rich>=12.4.4; extra == 'duckdb'
Provides-Extra: examples
Requires-Dist: pins[gcs]>=0.8.3; extra == 'examples'
Provides-Extra: exasol
Requires-Dist: numpy<3,>=1.23.2; extra == 'exasol'
Requires-Dist: pandas<3,>=1.5.3; extra == 'exasol'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'exasol'
Requires-Dist: pyarrow>=10.0.1; extra == 'exasol'
Requires-Dist: pyexasol>=0.25.2; extra == 'exasol'
Requires-Dist: rich>=12.4.4; extra == 'exasol'
Provides-Extra: flink
Requires-Dist: numpy<3,>=1.23.2; extra == 'flink'
Requires-Dist: pandas<3,>=1.5.3; extra == 'flink'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'flink'
Requires-Dist: pyarrow>=10.0.1; extra == 'flink'
Requires-Dist: rich>=12.4.4; extra == 'flink'
Provides-Extra: geospatial
Requires-Dist: geoarrow-types>=0.2; extra == 'geospatial'
Requires-Dist: geopandas>=0.6; extra == 'geospatial'
Requires-Dist: pyproj>=3.3.0; extra == 'geospatial'
Requires-Dist: shapely>=2; extra == 'geospatial'
Provides-Extra: impala
Requires-Dist: impyla>=0.17; extra == 'impala'
Requires-Dist: numpy<3,>=1.23.2; extra == 'impala'
Requires-Dist: pandas<3,>=1.5.3; extra == 'impala'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'impala'
Requires-Dist: pyarrow>=10.0.1; extra == 'impala'
Requires-Dist: rich>=12.4.4; extra == 'impala'
Provides-Extra: mssql
Requires-Dist: numpy<3,>=1.23.2; extra == 'mssql'
Requires-Dist: pandas<3,>=1.5.3; extra == 'mssql'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'mssql'
Requires-Dist: pyarrow>=10.0.1; extra == 'mssql'
Requires-Dist: pyodbc>=4.0.39; extra == 'mssql'
Requires-Dist: rich>=12.4.4; extra == 'mssql'
Provides-Extra: mysql
Requires-Dist: mysqlclient>=2.2.4; extra == 'mysql'
Requires-Dist: numpy<3,>=1.23.2; extra == 'mysql'
Requires-Dist: pandas<3,>=1.5.3; extra == 'mysql'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'mysql'
Requires-Dist: pyarrow>=10.0.1; extra == 'mysql'
Requires-Dist: rich>=12.4.4; extra == 'mysql'
Provides-Extra: oracle
Requires-Dist: numpy<3,>=1.23.2; extra == 'oracle'
Requires-Dist: oracledb>=1.3.1; extra == 'oracle'
Requires-Dist: packaging>=21.3; extra == 'oracle'
Requires-Dist: pandas<3,>=1.5.3; extra == 'oracle'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'oracle'
Requires-Dist: pyarrow>=10.0.1; extra == 'oracle'
Requires-Dist: rich>=12.4.4; extra == 'oracle'
Provides-Extra: polars
Requires-Dist: numpy<3,>=1.23.2; extra == 'polars'
Requires-Dist: packaging>=21.3; extra == 'polars'
Requires-Dist: pandas<3,>=1.5.3; extra == 'polars'
Requires-Dist: polars>=1; extra == 'polars'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'polars'
Requires-Dist: pyarrow>=10.0.1; extra == 'polars'
Requires-Dist: rich>=12.4.4; extra == 'polars'
Provides-Extra: postgres
Requires-Dist: numpy<3,>=1.23.2; extra == 'postgres'
Requires-Dist: pandas<3,>=1.5.3; extra == 'postgres'
Requires-Dist: psycopg[binary]>=3.2.0; extra == 'postgres'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'postgres'
Requires-Dist: pyarrow>=10.0.1; extra == 'postgres'
Requires-Dist: rich>=12.4.4; extra == 'postgres'
Provides-Extra: pyspark
Requires-Dist: numpy<3,>=1.23.2; extra == 'pyspark'
Requires-Dist: packaging>=21.3; extra == 'pyspark'
Requires-Dist: pandas<3,>=1.5.3; extra == 'pyspark'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'pyspark'
Requires-Dist: pyarrow>=10.0.1; extra == 'pyspark'
Requires-Dist: pyspark>=3.3.3; extra == 'pyspark'
Requires-Dist: rich>=12.4.4; extra == 'pyspark'
Provides-Extra: risingwave
Requires-Dist: numpy<3,>=1.23.2; extra == 'risingwave'
Requires-Dist: pandas<3,>=1.5.3; extra == 'risingwave'
Requires-Dist: psycopg2>=2.8.4; extra == 'risingwave'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'risingwave'
Requires-Dist: pyarrow>=10.0.1; extra == 'risingwave'
Requires-Dist: rich>=12.4.4; extra == 'risingwave'
Provides-Extra: snowflake
Requires-Dist: numpy<3,>=1.23.2; extra == 'snowflake'
Requires-Dist: pandas<3,>=1.5.3; extra == 'snowflake'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'snowflake'
Requires-Dist: pyarrow>=10.0.1; extra == 'snowflake'
Requires-Dist: rich>=12.4.4; extra == 'snowflake'
Requires-Dist: snowflake-connector-python!=3.3.0b1,>=3.0.2; extra == 'snowflake'
Provides-Extra: sqlite
Requires-Dist: numpy<3,>=1.23.2; extra == 'sqlite'
Requires-Dist: pandas<3,>=1.5.3; extra == 'sqlite'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'sqlite'
Requires-Dist: pyarrow>=10.0.1; extra == 'sqlite'
Requires-Dist: regex>=2021.7.6; extra == 'sqlite'
Requires-Dist: rich>=12.4.4; extra == 'sqlite'
Provides-Extra: trino
Requires-Dist: numpy<3,>=1.23.2; extra == 'trino'
Requires-Dist: pandas<3,>=1.5.3; extra == 'trino'
Requires-Dist: pyarrow-hotfix>=0.4; extra == 'trino'
Requires-Dist: pyarrow>=10.0.1; extra == 'trino'
Requires-Dist: rich>=12.4.4; extra == 'trino'
Requires-Dist: trino>=0.321; extra == 'trino'
Provides-Extra: visualization
Requires-Dist: graphviz>=0.16; extra == 'visualization'
Description-Content-Type: text/markdown

# Ibis

[![Documentation status](https://img.shields.io/badge/docs-docs.ibis--project.org-blue.svg)](http://ibis-project.org)
[![Project chat](https://img.shields.io/badge/zulip-join_chat-purple.svg?logo=zulip)](https://ibis-project.zulipchat.com)
[![Anaconda badge](https://anaconda.org/conda-forge/ibis-framework/badges/version.svg)](https://anaconda.org/conda-forge/ibis-framework)
[![PyPI](https://img.shields.io/pypi/v/ibis-framework.svg)](https://pypi.org/project/ibis-framework)
[![Build status](https://github.com/ibis-project/ibis/actions/workflows/ibis-main.yml/badge.svg)](https://github.com/ibis-project/ibis/actions/workflows/ibis-main.yml?query=branch%3Amain)
[![Build status](https://github.com/ibis-project/ibis/actions/workflows/ibis-backends.yml/badge.svg)](https://github.com/ibis-project/ibis/actions/workflows/ibis-backends.yml?query=branch%3Amain)
[![Codecov branch](https://img.shields.io/codecov/c/github/ibis-project/ibis/main.svg)](https://codecov.io/gh/ibis-project/ibis)

## What is Ibis?

Ibis is the portable Python dataframe library:

- Fast local dataframes (via DuckDB by default)
- Lazy dataframe expressions
- Interactive mode for iterative data exploration
- [Compose Python dataframe and SQL code](#python--sql-better-together)
- Use the same dataframe API for [nearly 20 backends](#backends)
- Iterate locally and deploy remotely by [changing a single line of code](#portability)

See the documentation on ["Why Ibis?"](https://ibis-project.org/why) to learn more.

## Getting started

You can `pip install` Ibis with a backend and example data:

```bash
pip install 'ibis-framework[duckdb,examples]'
```

> 💡 **Tip**
>
> See the [installation guide](https://ibis-project.org/install) for more installation options.

Then use Ibis:

```python
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> g = t.group_by("species", "island").agg(count=t.count()).order_by("count")
>>> g
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘
```

> 💡 **Tip**
>
> See the [getting started tutorial](https://ibis-project.org/tutorials/basics) for a full introduction to Ibis.

## Python + SQL: better together

For most backends, Ibis works by compiling its dataframe expressions into SQL:

```python
>>> ibis.to_sql(g)
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC
```

You can mix SQL and Python code:

```python
>>> a = t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
>>> a
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Dream     │    56 │
│ Gentoo    │ Biscoe    │   124 │
│ Chinstrap │ Dream     │    68 │
└───────────┴───────────┴───────┘
>>> b = a.order_by("count")
>>> b
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘
```

This allows you to combine the flexibility of Python with the scale and performance of modern SQL.

## Backends

Ibis supports nearly 20 backends:

- [Apache DataFusion](https://ibis-project.org/backends/datafusion/)
- [Apache Druid](https://ibis-project.org/backends/druid/)
- [Apache Flink](https://ibis-project.org/backends/flink)
- [Apache Impala](https://ibis-project.org/backends/impala/)
- [Apache PySpark](https://ibis-project.org/backends/pyspark/)
- [BigQuery](https://ibis-project.org/backends/bigquery/)
- [ClickHouse](https://ibis-project.org/backends/clickhouse/)
- [DuckDB](https://ibis-project.org/backends/duckdb/)
- [Exasol](https://ibis-project.org/backends/exasol)
- [MySQL](https://ibis-project.org/backends/mysql/)
- [Oracle](https://ibis-project.org/backends/oracle/)
- [Polars](https://ibis-project.org/backends/polars/)
- [PostgreSQL](https://ibis-project.org/backends/postgresql/)
- [RisingWave](https://ibis-project.org/backends/risingwave/)
- [SQL Server](https://ibis-project.org/backends/mssql/)
- [SQLite](https://ibis-project.org/backends/sqlite/)
- [Snowflake](https://ibis-project.org/backends/snowflake)
- [Theseus](https://voltrondata.com/start)
- [Trino](https://ibis-project.org/backends/trino/)

## How it works

Most Python dataframes are tightly coupled to their execution engine. And many databases only support SQL, with no Python API. Ibis solves this problem by providing a common API for data manipulation in Python, and compiling that API into the backend’s native language. This means you can learn a single API and use it across any supported backend (execution engine).

Ibis broadly supports two types of backend:

1. SQL-generating backends
2. DataFrame-generating backends

![Ibis backend types](./docs/images/backends.png)

## Portability

To use different backends, you can set the backend Ibis uses:

```python
>>> ibis.set_backend("duckdb")
>>> ibis.set_backend("polars")
>>> ibis.set_backend("datafusion")
```

Typically, you'll create a connection object:

```python
>>> con = ibis.duckdb.connect()
>>> con = ibis.polars.connect()
>>> con = ibis.datafusion.connect()
```

And work with tables in that backend:

```python
>>> con.list_tables()
['penguins']
>>> t = con.table("penguins")
```

You can also read from common file formats like CSV or Apache Parquet:

```python
>>> t = con.read_csv("penguins.csv")
>>> t = con.read_parquet("penguins.parquet")
```

This allows you to iterate locally and deploy remotely by changing a single line of code.

> 💡 **Tip**
>
> Check out [the blog on backend agnostic arrays](https://ibis-project.org/posts/backend-agnostic-arrays/) for one example using the same code across DuckDB and BigQuery.

## Community and contributing

Ibis is an open source project and welcomes contributions from anyone in the community.

- Read [the contributing guide](https://github.com/ibis-project/ibis/blob/main/docs/CONTRIBUTING.md).
- We care about keeping the community welcoming for all. Check out [the code of conduct](https://github.com/ibis-project/ibis/blob/main/CODE_OF_CONDUCT.md).
- The Ibis project is open sourced under the [Apache License](https://github.com/ibis-project/ibis/blob/main/LICENSE.txt).

Join our community by interacting on GitHub or chatting with us on [Zulip](https://ibis-project.zulipchat.com/).

For more information visit https://ibis-project.org/.

## Governance

The Ibis project is an [independently governed](https://github.com/ibis-project/governance/blob/main/governance.md) open source community project to build and maintain the portable Python dataframe library. Ibis has contributors across a range of data companies and institutions.
