Metadata-Version: 2.3
Name: mlcompare
Version: 1.0.0
Summary: Quickly compare machine learning models across libraries and datasets.
Project-URL: Homepage, https://github.com/MitchMedeiros/mlcompare
Author: Mitchell Medeiros
License: MIT License
        
        Copyright (c) 2024 Mitchell Medeiros
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE.txt
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: huggingface-hub>=0.21.0
Requires-Dist: kaggle>=1.0.0
Requires-Dist: lightgbm>=4.0.0
Requires-Dist: numpy<2.0.0,>=1.23.5
Requires-Dist: openml>=0.13.0
Requires-Dist: pandas[pyarrow]>=2.0.0
Requires-Dist: plotly>=4.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: scikit-learn>=1.5.1
Requires-Dist: torch>=2.0.0
Requires-Dist: xgboost>=2.1.0
Provides-Extra: dev
Requires-Dist: black>=24.0.0; extra == 'dev'
Requires-Dist: ipykernel>=6.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.5.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: matplotlib>=3.9.1; extra == 'docs'
Requires-Dist: pydata-sphinx-theme==0.15.4; extra == 'docs'
Requires-Dist: python-dotenv==1.0.1; extra == 'docs'
Requires-Dist: sphinx-autodoc-typehints==2.2.3; extra == 'docs'
Requires-Dist: sphinx-copybutton==0.5.2; extra == 'docs'
Requires-Dist: sphinx-design==0.6.0; extra == 'docs'
Requires-Dist: sphinx-sitemap==2.6.0; extra == 'docs'
Requires-Dist: sphinx==7.3.7; extra == 'docs'
Requires-Dist: sphinxawesome-theme==5.2.0; extra == 'docs'
Requires-Dist: sphinxext-opengraph==0.9.1; extra == 'docs'
Description-Content-Type: text/markdown

<p align="center">
    <img src="https://d1nheu3uhuz51e.cloudfront.net/mlcompare/logo_text_1k.png" width="425" alt="MLCompare Logo">
</p>

<div align="center">
<a href="https://pypi.org/project/mlcompare">
    <img alt="Supported Python Versions" src="https://img.shields.io/pypi/pyversions/Django?logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxLjAxZW0iIGhlaWdodD0iMWVtIiB2aWV3Qm94PSIwIDAgMjU2IDI1NSI%2BPGRlZnM%2BPGxpbmVhckdyYWRpZW50IGlkPSJsb2dvc1B5dGhvbjAiIHgxPSIxMi45NTklIiB4Mj0iNzkuNjM5JSIgeTE9IjEyLjAzOSUiIHkyPSI3OC4yMDElIj48c3RvcCBvZmZzZXQ9IjAlIiBzdG9wLWNvbG9yPSIjMzg3ZWI4Ii8%2BPHN0b3Agb2Zmc2V0PSIxMDAlIiBzdG9wLWNvbG9yPSIjMzY2OTk0Ii8%2BPC9saW5lYXJHcmFkaWVudD48bGluZWFyR3JhZGllbnQgaWQ9ImxvZ29zUHl0aG9uMSIgeDE9IjE5LjEyOCUiIHgyPSI5MC43NDIlIiB5MT0iMjAuNTc5JSIgeTI9Ijg4LjQyOSUiPjxzdG9wIG9mZnNldD0iMCUiIHN0b3AtY29sb3I9IiNmZmUwNTIiLz48c3RvcCBvZmZzZXQ9IjEwMCUiIHN0b3AtY29sb3I9IiNmZmMzMzEiLz48L2xpbmVhckdyYWRpZW50PjwvZGVmcz48cGF0aCBmaWxsPSJ1cmwoI2xvZ29zUHl0aG9uMCkiIGQ9Ik0xMjYuOTE2LjA3MmMtNjQuODMyIDAtNjAuNzg0IDI4LjExNS02MC43ODQgMjguMTE1bC4wNzIgMjkuMTI4aDYxLjg2OHY4Ljc0NUg0MS42MzFTLjE0NSA2MS4zNTUuMTQ1IDEyNi43N2MwIDY1LjQxNyAzNi4yMSA2My4wOTcgMzYuMjEgNjMuMDk3aDIxLjYxdi0zMC4zNTZzLTEuMTY1LTM2LjIxIDM1LjYzMi0zNi4yMWg2MS4zNjJzMzQuNDc1LjU1NyAzNC40NzUtMzMuMzE5VjMzLjk3UzE5NC42Ny4wNzIgMTI2LjkxNi4wNzJNOTIuODAyIDE5LjY2YTExLjEyIDExLjEyIDAgMCAxIDExLjEzIDExLjEzYTExLjEyIDExLjEyIDAgMCAxLTExLjEzIDExLjEzYTExLjEyIDExLjEyIDAgMCAxLTExLjEzLTExLjEzYTExLjEyIDExLjEyIDAgMCAxIDExLjEzLTExLjEzIi8%2BPHBhdGggZmlsbD0idXJsKCNsb2dvc1B5dGhvbjEpIiBkPSJNMTI4Ljc1NyAyNTQuMTI2YzY0LjgzMiAwIDYwLjc4NC0yOC4xMTUgNjAuNzg0LTI4LjExNWwtLjA3Mi0yOS4xMjdIMTI3LjZ2LTguNzQ1aDg2LjQ0MXM0MS40ODYgNC43MDUgNDEuNDg2LTYwLjcxMmMwLTY1LjQxNi0zNi4yMS02My4wOTYtMzYuMjEtNjMuMDk2aC0yMS42MXYzMC4zNTVzMS4xNjUgMzYuMjEtMzUuNjMyIDM2LjIxaC02MS4zNjJzLTM0LjQ3NS0uNTU3LTM0LjQ3NSAzMy4zMnY1Ni4wMTNzLTUuMjM1IDMzLjg5NyA2Mi41MTggMzMuODk3bTM0LjExNC0xOS41ODZhMTEuMTIgMTEuMTIgMCAwIDEtMTEuMTMtMTEuMTNhMTEuMTIgMTEuMTIgMCAwIDEgMTEuMTMtMTEuMTMxYTExLjEyIDExLjEyIDAgMCAxIDExLjEzIDExLjEzYTExLjEyIDExLjEyIDAgMCAxLTExLjEzIDExLjEzIi8%2BPC9zdmc%2B&labelColor=blue&color=yellow">
</a>
<!-- <a href="https://pypi.org/project/mlcompare/">
    <img alt="PyPI - Version" src="https://img.shields.io/pypi/v/polars?logo=pypi&label=PyPi&labelColor=white&color=blue">
</a> -->
<!-- <a href="https://pypi.org/project/mlcompare/">
    <img alt="PyPI - License" src="https://img.shields.io/pypi/l/polars?logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxZW0iIGhlaWdodD0iMWVtIiB2aWV3Qm94PSIwIDAgMTYgMTYiPjxwYXRoIGZpbGw9Im5vbmUiIHN0cm9rZT0iIzk3OTc5NyIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBkPSJNNC41IDEzLjVoN004LjAxIDF2MTIuMDZNMS41IDMuNWgzbDEuNS0xaDRsMS41IDFoM00uNSAxMEwzIDQuNDhMNS41IDEwQzQgMTEgMiAxMSAuNSAxMG0xMCAwTDEzIDQuNDhMMTUuNSAxMGMtMS41IDEtMy41IDEtNSAwIi8%2BPC9zdmc%2B&labelColor=darkred&color=lightgrey">
</a> -->
<!-- <a href="https://mlcompare.readthedocs.io/?badge=latest">
    <img alt="Read the Docs" src="https://img.shields.io/readthedocs/pillow?logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxZW0iIGhlaWdodD0iMWVtIiB2aWV3Qm94PSIwIDAgNTEyIDUxMiI%2BPHBhdGggZmlsbD0iI2Y5ZTdjMCIgZD0iTTQzNy41NjcgNTEySDg4LjAwNGE4LjE4MiA4LjE4MiAwIDAgMS04LjE4Mi04LjE4MlY4LjE4MkE4LjE4MiA4LjE4MiAwIDAgMSA4OC4wMDQgMEgyODguNzlsMTU2Ljk2IDE1Ni45NnYzNDYuODU4YTguMTgzIDguMTgzIDAgMCAxLTguMTgzIDguMTgyIi8%2BPHBhdGggZmlsbD0iI2VhYzA4MyIgZD0ibTI4OC43OSAwbDE1Ni45NiAxNTYuOTZIMzIyLjE1MmMtMTguNDI2IDAtMzMuMzYzLTE0LjkzNy0zMy4zNjMtMzMuMzYzVjB6Ii8%2BPHBhdGggZmlsbD0iIzU5N2I5MSIgZD0iTTIzNS4wNzggOTIuNDAxSDEyNi40NTNjLTYuMTQ3IDAtMTEuMTMtNC45ODMtMTEuMTMtMTEuMTNzNC45ODMtMTEuMTMgMTEuMTMtMTEuMTNoMTA4LjYyNWM2LjE0NyAwIDExLjEzIDQuOTgzIDExLjEzIDExLjEzcy00Ljk4MyAxMS4xMy0xMS4xMyAxMS4xM20xMS4xMyA2MS43MjNjMC02LjE0Ny00Ljk4My0xMS4xMy0xMS4xMy0xMS4xM0gxMjYuNDUzYy02LjE0NyAwLTExLjEzIDQuOTgzLTExLjEzIDExLjEzczQuOTgzIDExLjEzIDExLjEzIDExLjEzaDEwOC42MjVjNi4xNDcgMCAxMS4xMy00Ljk4MyAxMS4xMy0xMS4xM20wIDcyLjg1NGMwLTYuMTQ3LTQuOTgzLTExLjEzLTExLjEzLTExLjEzSDEyNi40NTNjLTYuMTQ3IDAtMTEuMTMgNC45ODMtMTEuMTMgMTEuMTNzNC45ODMgMTEuMTMgMTEuMTMgMTEuMTNoMTA4LjYyNWM2LjE0Ny0uMDAxIDExLjEzLTQuOTgzIDExLjEzLTExLjEzbTk0LjAzOCA3Mi44NTNjMC02LjE0Ni00Ljk4My0xMS4xMy0xMS4xMy0xMS4xM0gxMjYuNDUzYy02LjE0NyAwLTExLjEzIDQuOTgzLTExLjEzIDExLjEzczQuOTgzIDExLjEzIDExLjEzIDExLjEzaDIwMi42NjNjNi4xNDcgMCAxMS4xMy00Ljk4MyAxMS4xMy0xMS4xM20zNy40OTMtNzIuODUzYzAtNi4xNDctNC45ODMtMTEuMTMtMTEuMTMtMTEuMTNoLTc0Ljk4NWMtNi4xNDYgMC0xMS4xMyA0Ljk4My0xMS4xMyAxMS4xM3M0Ljk4MyAxMS4xMyAxMS4xMyAxMS4xM2g3NC45ODVjNi4xNDctLjAwMSAxMS4xMy00Ljk4MyAxMS4xMy0xMS4xM00yOTkuOTIgMzcyLjY4NWMwLTYuMTQ2LTQuOTgzLTExLjEzLTExLjEzLTExLjEzSDEyNi40NTNjLTYuMTQ3IDAtMTEuMTMgNC45ODMtMTEuMTMgMTEuMTNzNC45ODMgMTEuMTMgMTEuMTMgMTEuMTNIMjg4Ljc5YzYuMTQ3LS4wMDEgMTEuMTMtNC45ODQgMTEuMTMtMTEuMTNtNjYuMjEgNzIuODUzYzAtNi4xNDYtNC45ODMtMTEuMTMtMTEuMTMtMTEuMTNIMTI2LjQ1M2MtNi4xNDcgMC0xMS4xMyA0Ljk4My0xMS4xMyAxMS4xM3M0Ljk4MyAxMS4xMyAxMS4xMyAxMS4xM0gzNTVjNi4xNDYgMCAxMS4xM
y00Ljk4MyAxMS4xMy0xMS4xMyIvPjwvc3ZnPg%3D%3D&labelColor=teal">
</a> -->
<!-- <a href="https://github.com/MitchMedeiros/MLCompare/actions/workflows/lint.yml">
    <img alt="GitHub Actions build status (Lint)" src="https://github.com/MitchMedeiros/MLCompare/workflows/Lint/badge.svg">
</a> -->
<!-- <a href="https://github.com/MitchMedeiros/MLCompare/actions/workflows/unit-tests.yml">
    <img alt="GitHub Actions build status (Test Linux and macOS)" src="https://github.com/MitchMedeiros/MLCompare/workflows/unit-tests/badge.svg">
</a> -->
</div>

<!-- 
<a href="https://github.com/MitchMedeiros/MLCompare/actions/workflows/test-docker.yml">
    <img alt="GitHub Actions build status (Test Docker)" src="https://github.com/MitchMedeiros/MLCompare/workflows/Test%20Docker/badge.svg">
</a>
<a href="https://app.codecov.io/gh/MitchMedeiros/MLCompare">
    <img alt="Code coverage" src="https://codecov.io/gh/MitchMedeiros/MLCompare/branch/main/graph/badge.svg">
</a> -->

<br>

<div align="center">
** This library is still in early development. Expect many more features to come :D
</div>

<br>

MLCompare is a Python package for running model comparison pipelines, with the aim of being both simple and flexible. It supports multiple popular ML libraries, retrieval from multiple online dataset repositories, common data processing steps, and results visualization. Additionally, it allows for using your own models and datasets within the pipelines.

<table align="center">
    <tr>
        <th><div align="center">Libraries</div></th>
        <th><div align="center">Datasets</div></th>
        <th><div align="center">Data Processing</div></th>
    </tr>
    <tr>
        <td>
            <ul>
                <!-- <li>PyTorch</li> -->
                <li>Scikit-learn</li>
                <li>XGBoost</li>
                <!-- <li>LightGBM</li> -->
                <!-- <li>User defined models</li> -->
            </ul>
        </td>
        <td>
            <ul>
                <li>Kaggle</li>
                <li>Hugging Face</li>
                <li>OpenML</li>
                <!-- <li>S3</li> -->
                <li>Locally saved</li>
            </ul>
        </td>
        <td>
            <ul>
                <li><b>Encode</b>: One-hot | Label</li>
                <!-- <li><b>Regularize</b>: Standard | Min-Max</li> -->
                <!-- <li><b>NaNs</b>: Drop | ffill | bfill | Averaging</li> -->
                <li><b>Drop Columns</b></li>
            </ul>
        </td>
    </tr>
</table>
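
As a rough illustration of how the two encoding options in the table differ, here is a minimal sketch using plain pandas. This is not MLCompare's internal implementation, and the values are made up; only the `Soil_Type` column name is borrowed from the example further down.

```python
import pandas as pd

# Made-up values for illustration only.
soil = pd.DataFrame({"Soil_Type": ["loam", "sandy", "clay", "loam"]})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(soil, columns=["Soil_Type"])

# Label encoding: a single integer code per category
# (pandas orders the string categories alphabetically here).
labels = soil["Soil_Type"].astype("category").cat.codes

print(one_hot.columns.tolist())
print(labels.tolist())
```

One-hot encoding avoids implying an order between categories at the cost of extra columns, while label encoding keeps a single column.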

<h2>Installing</h2>

<!-- It is recommended to create a new virtual environment. Ex. with Conda:

```sh
    conda create -n compare_env python==3.11.9
    conda activate compare_env
```

Install this library with pip: -->

```console
    pip install mlcompare
```

On macOS, both XGBoost and LightGBM require `libomp`, which can be installed with <a href="https://brew.sh">Homebrew</a>:

```console
    brew install libomp
```

<h2>A Simple Example</h2>

Running a pipeline with multiple models and datasets is done by making a list of dictionaries for each and passing both lists to a pipeline function.

The example below downloads one dataset from OpenML and one from Kaggle, one-hot encodes some of the columns in the Kaggle dataset, and trains and evaluates a Random Forest model and an XGBoost model on each dataset.

```python
import mlcompare

datasets = [
    {
        "type": "openml",
        "id": 8,
        "target": "drinks",
    },
    {
        "type": "kaggle",
        "user": "gorororororo23",
        "dataset": "plant-growth-data-classification",
        "file": "plant_growth_data.csv",
        "target": "Growth_Milestone",
        "onehotEncode": ["Soil_Type", "Water_Frequency", "Fertilizer_Type"],
    }
]

models = [
    {
        "library": "sklearn",
        "name": "RandomForestRegressor",
    },
    {
        "library": "xgboost",
        "name": "XGBRegressor",
        # `max_leaves` is XGBoost's leaf-count parameter (`num_leaves` belongs to LightGBM).
        "params": {"max_leaves": 40, "n_estimators": 200},
    }
]

mlcompare.full_pipeline(datasets, models)
```

For the XGBoost model, we passed our own parameter values through `params` rather than using the defaults.
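
To make the effect of a `params` entry concrete, here is a minimal sketch of the general idea: a dictionary of overrides is unpacked into an estimator's constructor, and anything not listed keeps the library default. The `build_model` helper is hypothetical and is not part of MLCompare's API; a scikit-learn estimator is used so the sketch only needs scikit-learn installed.

```python
from sklearn.ensemble import RandomForestRegressor


def build_model(model_cls, params=None):
    """Hypothetical helper: instantiate an estimator with optional overrides."""
    # Unpacking an empty dict leaves every argument at the library default.
    return model_cls(**(params or {}))


default_model = build_model(RandomForestRegressor)
custom_model = build_model(RandomForestRegressor, {"n_estimators": 200})

print(default_model.n_estimators)  # scikit-learn's default
print(custom_model.n_estimators)   # the value we supplied
```

Only the keys present in the dictionary are overridden, which is why the Random Forest entry in the example above can omit `params` entirely.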
