Metadata-Version: 2.1
Name: treeffuser
Version: 0.1.3
Summary: Probabilistic predictions for tabular data, using diffusion models and decision trees.
Home-page: https://github.com/blei-lab/treeffuser
License: MIT
Author: Nicolas Beltran-Velez
Author-email: nb2838@columbia.edu
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: einops (>=0.8.0,<0.9.0)
Requires-Dist: jaxtyping (>=0.2.19,<0.3.0)
Requires-Dist: lightgbm (>=4.3.0,<5.0.0)
Requires-Dist: ml-collections (>=0.1.1,<0.2.0)
Requires-Dist: numpy (>=1.24,<2.0)
Requires-Dist: scikit-learn (>=1.5.0,<2.0.0)
Requires-Dist: scipy (>=1.13.1,<2.0.0)
Requires-Dist: tqdm (>=4.66.4,<5.0.0)
Project-URL: Repository, https://github.com/blei-lab/treeffuser
Description-Content-Type: text/x-rst

====================
Treeffuser
====================

Treeffuser is an easy-to-use package for probabilistic prediction on tabular data with tree-based diffusion models.
Its goal is to estimate distributions of the form `p(y|x)`, where `x` is a feature vector, `y` is a target vector,
and `p(y|x)` can be arbitrarily complex (e.g., multimodal, heteroskedastic, non-Gaussian, or heavy-tailed).

It is designed to adhere closely to the scikit-learn API and requires minimal user tuning.

Usage Example
-------------

Here's how you can use Treeffuser in your project:

.. code-block:: python

    from treeffuser import Treeffuser
    import numpy as np

    # (n_training, n_features), (n_training, n_targets)
    X, y = ...  # load your data
    # (n_test, n_features)
    X_test = ...  # load your test data

    # Estimate p(y|x) with a tree-based diffusion model
    model = Treeffuser()
    model.fit(X, y)

    # Draw samples y ~ p(y|x) for each test point
    # (n_samples, n_test, n_targets)
    y_samples = model.sample(X_test, n_samples=1000)

    # Compute downstream metrics
    mean = np.mean(y_samples, axis=0)
    std = np.std(y_samples, axis=0)
    median = np.median(y_samples, axis=0)
    quantile = np.quantile(y_samples, q=0, axis=0)
    ... # other metrics

Please refer to the docstrings for more information on the available methods and parameters.
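
Because the samples are returned as an array of shape ``(n_samples, n_test, n_targets)``, other summaries follow directly from NumPy reductions over the first axis. The sketch below computes a 90% central prediction interval and its empirical coverage. It does not call Treeffuser: ``y_samples`` is a synthetic stand-in for the output of ``model.sample``, and ``y_test`` is a hypothetical array of observed targets, both assumed here only for illustration.

.. code-block:: python

    import numpy as np

    # Synthetic stand-in for `model.sample(X_test, n_samples=1000)` (assumption).
    rng = np.random.default_rng(0)
    n_samples, n_test, n_targets = 1000, 5, 1
    y_samples = rng.normal(loc=0.0, scale=1.0, size=(n_samples, n_test, n_targets))

    # Hypothetical observed targets, shape (n_test, n_targets).
    y_test = np.zeros((n_test, n_targets))

    # 90% central prediction interval per test point and target.
    # Passing a list of levels adds a leading axis of length 2, unpacked here.
    lower, upper = np.quantile(y_samples, q=[0.05, 0.95], axis=0)

    # Fraction of observed targets that fall inside their interval.
    coverage = np.mean((y_test >= lower) & (y_test <= upper))
    print(f"Empirical coverage of the 90% interval: {coverage:.2f}")

On real data, a coverage far from the nominal level suggests that the estimated distribution is miscalibrated.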

Installation
------------

You can install Treeffuser via pip from PyPI with the following command::

    pip install treeffuser

You can also install the in-development version with::

    pip install git+https://github.com/blei-lab/treeffuser.git@main
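
To confirm which version ended up in the active environment, the standard library's ``importlib.metadata`` can be queried. This is a minimal sketch; the helper name ``installed_version`` is hypothetical and not part of Treeffuser.

.. code-block:: python

    from importlib.metadata import PackageNotFoundError, version


    def installed_version(package="treeffuser"):
        """Return the installed version string, or None if the package is missing."""
        try:
            return version(package)
        except PackageNotFoundError:
            return None


    print(installed_version())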

