Metadata-Version: 2.4
Name: datasense-eda
Version: 0.1.0
Summary: Datasense: An Explainable EDA library for automated exploratory data analysis, outlier detection, feature importance, and visualization.
Home-page: https://github.com/Akash-Sare03/datasense
Author: Akash Sare
Author-email: Akash Sare <akashsare03@gmail.com>
License: MIT
Project-URL: Bug Tracker, https://github.com/Akash-Sare03/datasense/issues
Project-URL: Source Code, https://github.com/Akash-Sare03/datasense
Project-URL: Documentation, https://github.com/Akash-Sare03/datasense#readme
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Visualization
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: pandas>=1.2.0
Requires-Dist: numpy>=1.19.0
Requires-Dist: scipy>=1.6.0
Requires-Dist: scikit-learn>=0.24.0
Requires-Dist: statsmodels>=0.12.0
Requires-Dist: seaborn>=0.11.0
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: ipython>=7.0.0
Requires-Dist: tabulate
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# datasense 📊

[![Python Tests](https://github.com/Akash-Sare03/datasense/actions/workflows/python-test.yml/badge.svg)](https://github.com/Akash-Sare03/datasense/actions/workflows/python-test.yml)


A Python library for automated exploratory data analysis (EDA), data cleaning, and visualization. 
Built for beginners and analysts to quickly understand and preprocess datasets.

---

## ✨ Features

- **Dataset Summary**: Overview of shape, dtypes, missing values, duplicates.  
- **Missing Value Handling**: Detect and impute missing values (mean, median, mode, constant, drop).  
- **Outlier Detection**: Identify outliers using Z-score or IQR methods.  
- **Feature Importance**: Calculate and visualize feature importance for regression/classification.  
- **Time-Series Analysis**: Decomposition, rolling statistics, and trend detection.  
- **Visualizations**: Histograms, boxplots, count plots, correlation matrices, scatter plots, pair plots.  
- **Automated Recommendations**: Get actionable insights for data preprocessing.  
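
For orientation, the checks named under Dataset Summary correspond to a handful of plain pandas calls. The snippet below is only a sketch of that idea with a made-up `df_demo` frame; it is not how `summarize_dataset()` is implemented, and the `overview` dictionary layout is an assumption for illustration.

```python
import pandas as pd

# Hypothetical toy data: one missing price and one fully duplicated row.
df_demo = pd.DataFrame({
    "price": [100.0, 250.0, None, 250.0],
    "city": ["Pune", "Mumbai", "Delhi", "Mumbai"],
})

# Collect the same facts the feature list mentions: shape, dtypes,
# missing values per column, and the number of duplicated rows.
overview = {
    "shape": df_demo.shape,
    "dtypes": df_demo.dtypes.astype(str).to_dict(),
    "missing": df_demo.isna().sum().to_dict(),
    "duplicate_rows": int(df_demo.duplicated().sum()),
}

for key, value in overview.items():
    print(f"{key}: {value}")
```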

---

## 🚀 Why Datasense?

- ⚡ One-line automated EDA for quick dataset understanding.  
- 🧹 Built-in cleaning and preprocessing to save time.  
- 📊 Visual + tabular insights, ready for analysis or ML pipelines.  
- 🔧 Beginner-friendly but powerful enough for production workflows.  

---

## Installation

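```bash
# Install the dependencies from a local clone
git clone https://github.com/Akash-Sare03/datasense.git
cd datasense
pip install -r requirements.txt

# Or install from PyPI (published as datasense-eda, imported as datasense)
pip install datasense-eda
```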
---

## Quick Start

```python
import pandas as pd
from datasense import analyze

df = pd.read_csv("your_data.csv")

# Run a full EDA
analyze(df, target_col="price")
```

---

## Usage Examples

### Basic EDA Report
```python
from datasense import analyze

analyze(df, target_col="price", outlier_method="zscore")
```
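
To make the `outlier_method` choice more concrete, here is a rough sketch of the two textbook rules the feature list names (Z-score and IQR), written with plain pandas. The helper names `zscore_outliers` and `iqr_outliers`, the threshold of 3, and the 1.5 multiplier are illustrative assumptions; datasense may use different defaults internally.

```python
import pandas as pd

def zscore_outliers(s: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return a boolean mask that is True where |z-score| exceeds threshold."""
    z = (s - s.mean()) / s.std(ddof=0)  # population standard deviation
    return z.abs() > threshold

def iqr_outliers(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask that is True outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)

# Hypothetical sample: eleven typical prices and one extreme value.
prices = pd.Series([10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95], name="price")

print(prices[zscore_outliers(prices)])
print(prices[iqr_outliers(prices)])
```

The Z-score rule assumes roughly bell-shaped data and is itself influenced by extreme values, while the IQR rule relies only on quartiles, so it tends to be the safer pick for skewed columns.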

### Handle Missing Values
```python
from datasense import handle_missing_values

df_clean, report = handle_missing_values(df, method="mean")
```
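
As a point of reference, the imputation strategies listed in the features (mean, median, mode, drop) can be sketched in plain pandas as follows. This is not the internals of `handle_missing_values()`; the `df_demo` frame and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical toy data with one gap in each column.
df_demo = pd.DataFrame({
    "age": [25.0, None, 35.0, 40.0],
    "city": ["Pune", "Mumbai", None, "Pune"],
})

# Numeric column: fill the gap with the column mean
# (swap .mean() for .median() to get median imputation).
mean_filled = df_demo["age"].fillna(df_demo["age"].mean())

# Categorical column: fill the gap with the most frequent value (mode).
mode_filled = df_demo["city"].fillna(df_demo["city"].mode().iloc[0])

# Drop strategy: remove every row that contains at least one missing value.
dropped = df_demo.dropna()

print(mean_filled.tolist())
print(mode_filled.tolist())
print(len(dropped))
```

Mean imputation suits numeric columns only, so a categorical column usually needs the mode or a constant instead.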

### Feature Importance
```python
from datasense import feature_importance_calculate

fi_df, markdown_report, task_type = feature_importance_calculate(df, target_col="target", top_n=10)
print(markdown_report)
```

### Time-Series Analysis
```python
from datasense import analyze_timeseries

analyze_timeseries(df, date_col="date", target_col="sales", freq="D")
```
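
The rolling statistics mentioned in the feature list can be illustrated with pandas alone. The sketch below uses invented daily sales, an assumed 3-day window, and a deliberately crude trend check; it does not reproduce what `analyze_timeseries()` computes.

```python
import pandas as pd

# Hypothetical daily sales for ten consecutive days.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
sales = pd.Series(
    [10, 12, 11, 13, 15, 14, 16, 18, 17, 19],
    index=dates, name="sales", dtype="float64",
)

# 3-day rolling mean and standard deviation; the first two entries are NaN
# because a full window is not yet available.
rolling_mean = sales.rolling(window=3).mean()
rolling_std = sales.rolling(window=3).std()

# Crude trend check: compare the last smoothed value with the first one.
smoothed = rolling_mean.dropna()
trend = "upward" if smoothed.iloc[-1] > smoothed.iloc[0] else "flat or downward"

print(rolling_mean.round(2).tolist())
print(trend)
```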

---

## Example Notebooks

See practical examples and full workflows in the included Jupyter notebooks:

- [Basic EDA Example](notebooks/Datasense_Library_Test_1.ipynb)
- [Time-Series Example](notebooks/Datasense_Library_Test_2.ipynb)

---

## API Reference

### Main Functions
- `analyze()`: Generate a full EDA report.
- `summarize_dataset()`: Dataset overview.
- `handle_missing_values()`: Impute or remove missing values.
- `detect_outliers()`: Find outliers using Z-score or IQR.
- `feature_importance_calculate()`: Compute feature importance.
- `analyze_timeseries()`: Decompose and plot time-series data.
- `visualize()`: Auto-generate plots for numeric/categorical features.

---

## Contributing

1. Fork the project.
2. Create a feature branch: `git checkout -b feature-name`
3. Commit changes: `git commit -m 'Add feature'`
4. Push to the branch: `git push origin feature-name`
5. Open a pull request.

---

## License

This project is licensed under the MIT License. See [LICENSE](LICENSE.txt) for details.

---

## Deployment

This package is available on PyPI. Install it via:

```bash
pip install datasense-eda
```

To build the source and wheel distributions and upload a new release to PyPI (maintainers only):

```bash
python setup.py sdist bdist_wheel
twine upload dist/*
```

---

## Support

If you have any questions or issues, please open an issue [here](https://github.com/Akash-Sare03/datasense/issues).

