Metadata-Version: 2.1
Name: kesh-utils
Version: 0.1.3
Summary: Kesh Utils for Data science/EDA/Data preparation 
Home-page: https://github.com/KeshavShetty/ds
Author: Keshav Shetty
Author-email: keshavshetty@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Description-Content-Type: text/markdown

# Chart + Util = Chartil

During the EDA/data preparation stage, I use a few fixed chart types to analyse the relationships among various features.
Some are simple charts, such as univariate plots, and some are complex 3D plots or plots of more than three features.

Over time it became hard to maintain all the relevant code without repeating it.
Instead, I developed a simple, single API for plotting various types of relationships, which hides the technical/code details from the data science task and approach.

Using this approach, I need just one API call:

    from KUtils.eda import chartil

    chartil.plot(dataframe, [list of columns])
    # or
    chartil.plot(dataframe, [list of columns], optional_settings={...})


Demo code:

# Load UCI dataset

Download the Heart Disease dataset [from here](https://archive.ics.uci.edu/ml/datasets/Heart+Disease/).

    import pandas as pd
    from KUtils.eda import chartil

    heart_disease_df = pd.read_csv('../input/uci/heart.csv')

    # Bin age into labelled ranges and map the sex codes to readable labels
    heart_disease_df['age_bin'] = pd.cut(heart_disease_df['age'], [0, 32, 40, 50, 60, 70, 100], labels=['<32', '33-40', '41-50', '51-60', '61-70', '71+'])
    heart_disease_df['sex'] = heart_disease_df['sex'].map({1: 'Male', 0: 'Female'})
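
`pd.cut` uses right-inclusive intervals by default, so with the bin edges above an age of exactly 32 lands in the first bin even though its label reads `<32`. The short check below uses a few made-up ages (not values from the dataset) to show where the boundaries fall:

```python
import pandas as pd

# Hypothetical ages chosen to sit on and around the bin edges
ages = pd.Series([29, 32, 33, 40, 41, 70, 71])

age_bin = pd.cut(
    ages,
    [0, 32, 40, 50, 60, 70, 100],
    labels=['<32', '33-40', '41-50', '51-60', '61-70', '71+'],
)

# Convert the categorical result to plain strings for easy inspection
binned = age_bin.astype(str).tolist()
print(binned)
```

Each upper edge (32, 40, 70) belongs to the bin it closes, which matches the way the labels are written for every bin except the first.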

# Heatmap
    chartil.plot(heart_disease_df, heart_disease_df.columns)  # Send all column names

![Heatmap Numerical](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/heatmap1.png)

    chartil.plot(heart_disease_df, heart_disease_df.columns, optional_settings={'include_categorical': True})

![Heatmap With categorical](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/heatmap2.png)

    chartil.plot(heart_disease_df, heart_disease_df.columns, optional_settings={'include_categorical': True, 'sort_by_column': 'trestbps'})

![Heatmap With categorical and ordered by a column](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/heatmap3.png)

# Uni-categorical

    chartil.plot(heart_disease_df, ['target'])  # Barchart as count plot

![Uni Categorical](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/uni_categorical.png)

# Uni-Continuous
    chartil.plot(heart_disease_df, ['age'])  # Boxplot

![Uni boxplot](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/uni_boxplot.png)

    chartil.plot(heart_disease_df, ['age'], chart_type='barchart')  # Force a barchart on a continuous column by auto-creating 10 equal bins

![Uni barchart_forced](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/uni_barchart_forced.png)

    chartil.plot(heart_disease_df, ['age'], chart_type='barchart', optional_settings={'no_of_bins': 5})  # Create a custom number of bins

![Uni uni_barchart_forced_custom_bin_size](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/uni_barchart_forced_custom_bin_size.png)

    chartil.plot(heart_disease_df, ['age'], chart_type='distplot')

![Uni distplot](https://raw.githubusercontent.com/KeshavShetty/ds/blob/master/Roughbook/misc_resources/uni_distplot.png)
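
The forced barchart above is described as creating equal bins on a continuous column. How chartil does this internally is not shown here. As a rough sketch of the idea, assuming equal-width bins, `pd.cut` with an integer `bins` argument gives a comparable result. The ages below are made up for illustration, and `no_of_bins` simply mirrors the option name used above:

```python
import pandas as pd

# Hypothetical ages, not taken from the dataset
ages = pd.Series([29, 35, 41, 44, 52, 57, 58, 63, 67, 77])

no_of_bins = 5

# An integer `bins` splits the observed range into equal-width intervals
binned = pd.cut(ages, bins=no_of_bins)

# Count rows per interval, keeping the intervals in ascending order
counts = binned.value_counts().sort_index()
print(counts)
```

The resulting counts are what a bar chart of the binned column would display, one bar per interval.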

# Uni-categorical with optional_settings
    chartil.plot(heart_disease_df, ['age_bin'])  # Barchart as count plot
    chartil.plot(heart_disease_df, ['age_bin'], optional_settings={'sort_by_value': True})
    chartil.plot(heart_disease_df, ['age_bin'], optional_settings={'sort_by_value': True, 'limit_bars_count_to': 5})
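
The option names `sort_by_value` and `limit_bars_count_to` suggest that bars are ordered by their counts and then cut off after a fixed number. That reading is an assumption, since the exact behaviour is not documented here. Under that assumption, the same numbers can be reproduced in plain pandas with `value_counts` and `head`. The sample values and the limit of 2 are illustrative only:

```python
import pandas as pd

# Hypothetical binned ages
age_bin = pd.Series(['41-50', '51-60', '51-60', '61-70', '51-60', '41-50', '33-40'])

# value_counts sorts by count in descending order by default
bar_heights = age_bin.value_counts()

# Keep only the tallest bars (assumed meaning of limit_bars_count_to)
limit_bars_count_to = 2
top_bars = bar_heights.head(limit_bars_count_to)
print(top_bars)
```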

# Bi Category vs Category (& Univariate Segmented)
    chartil.plot(heart_disease_df, ['sex', 'target'])
    chartil.plot(heart_disease_df, ['sex', 'target'], chart_type='crosstab')
    chartil.plot(heart_disease_df, ['sex', 'target'], chart_type='stacked_barchart')
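
For two categorical columns, the numbers behind a crosstab or stacked bar chart are a contingency table. The sketch below builds one with `pd.crosstab` on a tiny made-up frame. It illustrates the underlying table only and makes no claim about how chartil computes or draws it:

```python
import pandas as pd

# Hypothetical rows using the same column names as the demo
sample = pd.DataFrame({
    'sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
    'target': [1, 0, 1, 1, 1],
})

# Raw counts of each (sex, target) combination
counts = pd.crosstab(sample['sex'], sample['target'])

# Row-wise shares, the usual input for a 100% stacked bar chart
row_share = pd.crosstab(sample['sex'], sample['target'], normalize='index')

print(counts)
print(row_share)
```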

# Bi Continuous vs Continuous
    chartil.plot(heart_disease_df, ['chol', 'thalach'])  # Scatter plot

# Bi Continuous vs Category
    chartil.plot(heart_disease_df, ['thalach', 'sex'])  # Grouped box plot (segmented univariate)
    chartil.plot(heart_disease_df, ['thalach', 'sex'], chart_type='distplot')  # Distplot

# Multi 3 Continuous
    chartil.plot(heart_disease_df, ['chol', 'thalach', 'trestbps'])  # Colored 3D scatter plot

# Multi 3 Categorical
    chartil.plot(heart_disease_df, ['age_bin', 'sex', 'target'])  # Paired barchart

# Multi 2 Continuous, 1 Category
    chartil.plot(heart_disease_df, ['chol', 'thalach', 'target'])  # Scatter plot with colored groups

![Grouped Scatter plot](https://raw.githubusercontent.com/KeshavShetty/ds/master/Roughbook/misc_resources/group_scatter_plot.png)

# Multi 1 Continuous, 2 Category
    chartil.plot(heart_disease_df, ['thalach', 'sex', 'target'])  # Grouped boxplot
    chartil.plot(heart_disease_df, ['thalach', 'sex', 'target'], chart_type='violinplot')  # Grouped violin plot

# Multi 3 Continuous, 1 category
    chartil.plot(heart_disease_df, ['chol', 'thalach', 'trestbps', 'target'])  # Group color highlighted 3D plot

# Multi 2 Continuous, 3 Category
    chartil.plot(heart_disease_df, ['sex', 'cp', 'target', 'thalach', 'trestbps'])  # Paired scatter plot
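
The demo calls suggest that `plot` picks a default chart from how many of the requested columns are continuous and how many are categorical. The sketch below captures that idea as a lookup table. It is not chartil's implementation: `DEFAULT_CHARTS`, `is_continuous`, `pick_default_chart` and the `max_categories` threshold are all hypothetical names and heuristics, and the table only lists combinations whose default chart is named in the demo comments.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# (continuous columns, categorical columns) -> default chart,
# copied from the comments in the demo calls
DEFAULT_CHARTS = {
    (1, 0): 'boxplot',
    (0, 1): 'barchart (count plot)',
    (2, 0): 'scatter plot',
    (1, 1): 'grouped boxplot',
    (3, 0): 'colored 3D scatter plot',
    (0, 3): 'paired barchart',
    (2, 1): 'scatter plot with colored groups',
    (1, 2): 'grouped boxplot',
    (3, 1): 'group color highlighted 3D plot',
}


def is_continuous(series, max_categories=10):
    """Assumed heuristic: numeric with more distinct values than max_categories."""
    return is_numeric_dtype(series) and series.nunique() > max_categories


def pick_default_chart(df, columns, max_categories=10):
    """Return the default chart name for the columns, or None if not listed."""
    n_continuous = sum(is_continuous(df[c], max_categories) for c in columns)
    n_categorical = len(columns) - n_continuous
    return DEFAULT_CHARTS.get((n_continuous, n_categorical))


# Small made-up frame: two columns with 12 distinct numbers, two with 2 values
demo_df = pd.DataFrame({
    'thalach': list(range(100, 112)),
    'chol': list(range(200, 224, 2)),
    'sex': ['Male', 'Female'] * 6,
    'target': [0, 1, 1] * 4,
})

print(pick_default_chart(demo_df, ['thalach', 'sex']))
print(pick_default_chart(demo_df, ['chol', 'thalach', 'target']))
print(pick_default_chart(demo_df, ['target']))
```

Treating a low-cardinality numeric column such as `target` as categorical is what lets a 0/1 label be drawn as a count plot rather than a boxplot. The threshold of 10 is an arbitrary choice for this sketch.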



