kedro.contrib.io
Description
This module contains functionality that we might consider moving into the
kedro.io module (e.g. additional AbstractDataSet implementations and
extensions or alternatives to the DataCatalog).
Data catalog wrapper
kedro.contrib.io.catalog_with_default.DataCatalogWithDefault([…])
    A DataCatalog with a default DataSet implementation for any data set which is not registered in the catalog.
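The fallback behaviour described above can be sketched in plain Python. This is an illustrative pattern only, not kedro's actual implementation; the class names, constructor arguments, and the `default_factory` parameter here are hypothetical:

```python
class CatalogWithDefault:
    """Sketch of a catalog with a default (hypothetical, not kedro's API):
    look up a registered data set by name, falling back to a factory
    for any name that is not registered."""

    def __init__(self, data_sets, default_factory):
        self._data_sets = dict(data_sets)
        # Called with the missing name to build a data set on the fly.
        self._default_factory = default_factory

    def load(self, name):
        data_set = self._data_sets.get(name) or self._default_factory(name)
        return data_set.load()


class InMemoryDataSet:
    """Trivial data set used only to exercise the sketch."""

    def __init__(self, data):
        self._data = data

    def load(self):
        return self._data


catalog = CatalogWithDefault(
    {"cars": InMemoryDataSet([1, 2, 3])},
    default_factory=lambda name: InMemoryDataSet(f"default for {name}"),
)
print(catalog.load("cars"))   # registered entry
print(catalog.load("boats"))  # unregistered: served by the default factory
```

The point of the pattern is that pipeline code can load any named data set without first declaring it, at the cost of every typo silently resolving to the default.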
DataSets
kedro.contrib.io.azure.CSVBlobDataSet(…[, …])
    CSVBlobDataSet loads and saves CSV files in Microsoft’s Azure blob storage.
kedro.contrib.io.azure.JSONBlobDataSet(…)
    JSONBlobDataSet loads and saves JSON (line-delimited) files in Microsoft’s Azure blob storage.
kedro.contrib.io.bioinformatics.BioSequenceLocalDataSet(…)
    BioSequenceLocalDataSet loads and saves data to a sequence file.
kedro.contrib.io.cached.CachedDataSet(dataset)
    CachedDataSet is a dataset wrapper which caches the loaded and saved data in memory, so that the user avoids I/O operations with slow storage media.
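The caching-wrapper idea can be sketched as follows. This is a minimal illustration of the pattern, not kedro's code; the helper class and its attributes are hypothetical:

```python
class CachedDataSet:
    """Pattern sketch (not kedro's implementation): delegates to an
    underlying data set but keeps the last loaded or saved value in
    memory, so repeated loads skip slow storage."""

    def __init__(self, dataset):
        self._dataset = dataset
        self._cache = None
        self._cached = False

    def load(self):
        if not self._cached:
            self._cache = self._dataset.load()
            self._cached = True
        return self._cache

    def save(self, data):
        # Cache on save too, so a later load never touches storage.
        self._cache = data
        self._cached = True
        self._dataset.save(data)


class CountingDataSet:
    """Backing store that counts load calls, to show the cache working."""

    def __init__(self, data):
        self._data = data
        self.loads = 0

    def load(self):
        self.loads += 1
        return self._data

    def save(self, data):
        self._data = data


backing = CountingDataSet({"a": 1})
cached = CachedDataSet(backing)
cached.load()
cached.load()
print(backing.loads)  # 1: the second load was served from memory
```

A wrapper like this trades memory for I/O, so it pays off mainly when the underlying storage is slow and the data fits comfortably in RAM.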
kedro.contrib.io.feather.FeatherLocalDataSet(…)
    FeatherLocalDataSet loads and saves data to a local feather file.
kedro.contrib.io.matplotlib.MatplotlibWriter(…)
    MatplotlibWriter saves matplotlib objects as image files.
kedro.contrib.io.parquet.ParquetS3DataSet(…)
    ParquetS3DataSet loads and saves data to a file in S3.
kedro.contrib.io.pyspark.SparkDataSet(filepath)
    SparkDataSet loads and saves Spark data frames.
kedro.contrib.io.pyspark.SparkJDBCDataSet(…)
    SparkJDBCDataSet loads data from a database table reachable via a JDBC URL and connection properties, and saves the content of a PySpark DataFrame to an external database table via JDBC.
kedro.contrib.io.yaml_local.YAMLLocalDataSet(…)
    YAMLLocalDataSet loads and saves data to a local YAML file using PyYAML.
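A local YAML data set of this kind can be sketched with PyYAML directly. This is a hypothetical re-implementation for illustration, not kedro's code, and it assumes PyYAML is installed; only `yaml.safe_load` and `yaml.safe_dump` from PyYAML are used:

```python
import tempfile
from pathlib import Path

import yaml  # PyYAML (third-party dependency)


class YAMLLocalDataSet:
    """Sketch (not kedro's implementation): load/save a Python object
    to a local YAML file using PyYAML's safe loader and dumper."""

    def __init__(self, filepath):
        self._filepath = Path(filepath)

    def load(self):
        with self._filepath.open("r") as f:
            return yaml.safe_load(f)

    def save(self, data):
        with self._filepath.open("w") as f:
            yaml.safe_dump(data, f)


with tempfile.TemporaryDirectory() as tmp:
    data_set = YAMLLocalDataSet(Path(tmp) / "params.yml")
    data_set.save({"train": {"epochs": 10}})
    result = data_set.load()  # round-trips the dict through the file
    print(result)
```

Using the safe loader/dumper restricts serialisation to plain YAML types (mappings, sequences, scalars), which is the usual choice for configuration-style data.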