Metadata-Version: 2.1
Name: econkit
Version: 0.4.1.9.8
Summary: Advanced Econometric Analysis Tools
Home-page: https://www.stefanstavrianos.eu/
Author: Stefanos Stavrianos
Author-email: contact@stefanstavrianos.eu
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE


# EconKit

`econkit` is a Python library that provides various statistical and econometric analysis tools, including descriptive statistics, correlation matrices, and tests for stationarity and autocorrelation.

## Installation

Ensure you have the required packages installed:
```bash
pip install pandas numpy scipy statsmodels tabulate yfinance requests
```

## Functions

### Descriptive Statistics

#### `descriptives(data)`

Computes descriptive statistics for each numeric column in a DataFrame.

**Parameters:**
- `data`: `pandas.DataFrame` containing the data to be analyzed.

**Returns:**
- None. Prints a summary table of the descriptive statistics.

**Example Usage:**
```python
import pandas as pd
from econkit import econometrics as ec

df = pd.read_csv('your_data.csv')
ec.descriptives(df)
```
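
To make the kind of output concrete, here is a minimal sketch of a comparable summary built directly with pandas. It is not `econkit`'s implementation: the helper name `describe_numeric` and the choice of statistics are assumptions for illustration only.

```python
import pandas as pd


def describe_numeric(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row of summary statistics per numeric column."""
    numeric = data.select_dtypes(include="number")  # ignore text columns
    return pd.DataFrame({
        "mean": numeric.mean(),
        "median": numeric.median(),
        "std": numeric.std(),        # sample standard deviation (ddof=1)
        "min": numeric.min(),
        "max": numeric.max(),
        "skew": numeric.skew(),
        "kurtosis": numeric.kurt(),  # excess kurtosis
    })


df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 4.0, 6.0, 10.0],
    "label": list("abcde"),          # non-numeric, skipped by the helper
})
summary = describe_numeric(df)
print(summary.round(3))
```

Unlike `descriptives`, which prints its table, this sketch returns a DataFrame so the values can be reused.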

### Correlation Matrix

#### `correlation(df, method='Pearson', p=False)`

Calculates and prints the correlation matrix for the numeric columns of the provided DataFrame, optionally followed by the matrix of p-values. Supports the Pearson, Spearman, and Kendall correlation methods.

**Parameters:**
- `df`: `pandas.DataFrame` containing the data to be analyzed.
- `method`: `str` (optional). Method of correlation ('Pearson', 'Spearman', or 'Kendall'). Default is 'Pearson'.
- `p`: `bool` (optional). If True, p-value matrix is also printed; if False, only the correlation matrix is printed. Default is False.

**Returns:**
- None. Prints the correlation matrix and optionally the p-value matrix.

**Example Usage:**
```python
import pandas as pd
from econkit import econometrics as ec

df = pd.read_csv('your_data.csv')
ec.correlation(df, method='Spearman', p=True)
```
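
For readers who want the underlying numbers rather than printed output, the following sketch computes a correlation matrix and a matching p-value matrix with `scipy.stats`. It is an illustration under assumptions, not `econkit`'s code: the helper `corr_with_pvalues` is hypothetical, and dropping rows with missing values is a choice made here.

```python
from itertools import combinations

import pandas as pd
from scipy import stats

# Map the method names used above to the corresponding SciPy tests.
TESTS = {
    "Pearson": stats.pearsonr,
    "Spearman": stats.spearmanr,
    "Kendall": stats.kendalltau,
}


def corr_with_pvalues(df: pd.DataFrame, method: str = "Pearson"):
    """Hypothetical helper: return (correlation matrix, p-value matrix)."""
    numeric = df.select_dtypes(include="number").dropna()  # assumption: drop NaN rows
    cols = numeric.columns
    corr = pd.DataFrame(1.0, index=cols, columns=cols)   # diagonal: perfect correlation
    pvals = pd.DataFrame(0.0, index=cols, columns=cols)  # diagonal: p-value of 0
    test = TESTS[method]
    for a, b in combinations(cols, 2):
        r, p = test(numeric[a], numeric[b])
        corr.loc[a, b] = corr.loc[b, a] = r
        pvals.loc[a, b] = pvals.loc[b, a] = p
    return corr, pvals


df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 1, 4, 3, 6, 5],
    "z": [6, 5, 4, 3, 2, 1],
})
corr, pvals = corr_with_pvalues(df, method="Spearman")
print(corr.round(3))
print(pvals.round(3))
```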

### Augmented Dickey-Fuller (ADF) Test

#### `adf(dataframe, maxlag=None, regression='c', autolag='AIC', handle_na='drop')`

Performs the ADF test on each column in the DataFrame and prints a summary table.

**Parameters:**
- `dataframe`: `pandas.DataFrame` containing the data to be tested.
- `maxlag`: `int` (optional). Maximum number of lags to use. Default is None.
- `regression`: `str` {'c', 'ct', 'ctt', 'nc'} (optional). Type of regression trend. Default is 'c'.
- `autolag`: `str` (optional). Method to use when automatically determining the lag length ('AIC', 'BIC', 't-stat'). Default is 'AIC'.
- `handle_na`: `str` {'drop', 'fill'} (optional). How to handle missing values. Default is 'drop'.

**Returns:**
- None. Prints a summary table of the ADF test results.

**Example Usage:**
```python
import pandas as pd
from econkit import econometrics as ec

df = pd.read_csv('your_data.csv')
ec.adf(df, regression='ct', autolag='BIC')
```

### KPSS Test

#### `kpss(dataframe, regression='c', nlags='auto', handle_na='drop')`

Performs the KPSS test on each column in the DataFrame and prints a summary table.

**Parameters:**
- `dataframe`: `pandas.DataFrame` containing the data to be tested.
- `regression`: `str` {'c', 'ct'} (optional). Type of regression trend. Default is 'c'.
- `nlags`: `str` or `int` (optional). Number of lags to use. Default is 'auto'.
- `handle_na`: `str` {'drop', 'fill'} (optional). How to handle missing values. Default is 'drop'.

**Returns:**
- None. Prints a summary table of the KPSS test results.

**Example Usage:**
```python
import pandas as pd
from econkit import econometrics as ec

df = pd.read_csv('your_data.csv')
ec.kpss(df, regression='ct', nlags='auto')
```

### Durbin-Watson Test

#### `dw(data)`

Performs the Durbin-Watson autocorrelation test and Ljung-Box test for each column of the dataset.

**Parameters:**
- `data`: `pandas.DataFrame` where each column is a time series.

**Returns:**
- None. Prints a summary table of the Durbin-Watson and Ljung-Box test results.

**Example Usage:**
```python
import pandas as pd
from econkit import econometrics as ec

df = pd.read_csv('your_data.csv')
ec.dw(df)
```
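
The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by their sum of squares. Values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation. A minimal NumPy sketch follows. Treating each column's deviations from its mean as the residuals is an assumption made here, and `durbin_watson_stat` is a hypothetical helper, not `econkit`'s code.

```python
import numpy as np
import pandas as pd


def durbin_watson_stat(series) -> float:
    """Hypothetical helper: Durbin-Watson statistic of a demeaned series."""
    e = np.asarray(series, dtype=float)
    e = e - e.mean()  # assumption: residuals are deviations from the mean
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


df = pd.DataFrame({
    "alternating": [1.0, -1.0] * 50,         # strong negative autocorrelation
    "trend": np.arange(100, dtype=float),    # strong positive autocorrelation
})
dw = df.apply(durbin_watson_stat)
print(dw)
```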

### Financial Data Retrieval

#### `data(ticker_symbol, start_date, end_date, interval)`

Downloads financial data from Yahoo Finance and calculates returns at the requested interval.

**Parameters:**
- `ticker_symbol`: `str`. The stock ticker symbol.
- `start_date`: `str`. Start date in 'dd-mm-yyyy' format.
- `end_date`: `str`. End date in 'dd-mm-yyyy' format.
- `interval`: `str`. Data interval (e.g., '1d', '1wk', '1mo').

**Returns:**
- `pandas.DataFrame` containing the stock data and calculated returns.

**Example Usage:**
```python
from econkit import finance as f

start = '01-06-2024'
end = '07-06-2024'
interval = '1m'  # one-minute bars; named `interval` to avoid shadowing the built-in `int`

SP500 = f.data('^GSPC', start, end, interval)

SP500.head()
```
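
Two details of this function can be illustrated without a network call. First, the dates are given as 'dd-mm-yyyy', whereas `yfinance` itself works with 'YYYY-MM-DD' strings. Second, returns are derived from consecutive prices. The sketch below shows one way to do both. It assumes simple (percentage) returns, since the source does not say whether `econkit` uses simple or log returns, and `to_iso` is a hypothetical helper.

```python
from datetime import datetime

import pandas as pd


def to_iso(date_str: str) -> str:
    """Hypothetical helper: convert 'dd-mm-yyyy' to 'YYYY-MM-DD'."""
    return datetime.strptime(date_str, "%d-%m-%Y").strftime("%Y-%m-%d")


print(to_iso("01-06-2024"))

# Made-up closing prices, used only to show the return calculation.
prices = pd.Series([100.0, 102.0, 99.96], name="Close")
returns = prices.pct_change()  # simple returns: p_t / p_{t-1} - 1 (an assumption)
print(returns)
```

The first return is `NaN` because there is no earlier price to compare against.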

## Usage Notes

- Ensure your data is clean and properly formatted before using these functions.
- `adf` and `kpss` accept a `handle_na` parameter (`'drop'` or `'fill'`) that controls how missing values are treated; the other functions do not expose this option.
- For time series analysis, ensure your data is indexed by date.

For more details, refer to the function docstrings or the examples provided above.
