Metadata-Version: 2.1
Name: deepsa
Version: 0.1.2
Summary: A Deep-learning Driven Predictor of Compound Synthesis Accessibility
Home-page: https://github.com/Shihang-Wang-58/DeepSA
Author: Shihang Wang
Author-email: Shihang Wang <wangshh12022@shanghaitech.edu.cn>
Project-URL: Homepage, https://github.com/Shihang-Wang-58/DeepSA
Project-URL: Bug Tracker, https://github.com/Shihang-Wang-58/DeepSA/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE

DeepSA: A Deep-learning Driven Predictor of Compound Synthesis Accessibility
============================================================================

With the continuing development of artificial intelligence, deep generative models are increasingly used for molecule generation. However, most of the new molecules they produce face great challenges in terms of synthetic accessibility.

DeepSA is a deep learning-based tool for predicting the synthetic accessibility of compounds. It helps users evaluate how difficult a molecule is to synthesize, so that more easily synthesizable molecules can be selected for drug discovery and development. DeepSA has a much higher early enrichment rate in discriminating molecules that are difficult to synthesize, which helps users select less expensive molecules for synthesis and reduces the time needed for drug discovery and development.

Installation
------------

Requirements
------------
Dependencies can be installed using the following commands:

.. code-block:: bash

    conda create -n DeepSA python=3.12
    conda activate DeepSA
    pip3 install autogluon==1.2
    # GPU version: PyTorch wheels built for CUDA 12.4
    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
    pip3 install rdkit

Install via pip
---------------

You can install DeepSA directly using pip:

.. code-block:: bash

    pip install deepsa

News
----

* 2024-12: `AutoGluon stopped supporting Python 3.8 <https://github.com/autogluon/autogluon/pull/4512>`_ in October 2024, so we have updated DeepSA to use Python 3.12 and adapted the training and inference scripts to the latest version of AutoGluon. Thanks for your interest in DeepSA!

* 2023-07: DeepSA_v1.0 has been released. Feedback is welcome on the issue tracker!

Data
----
The expanded training and test datasets can be downloaded from https://drive.google.com/drive/folders/1iup6T3Bqyy-uvpdFyP0Of_WQqn-9l62h?usp=sharing

Usage
-----

Python Package Usage
--------------------

1. Predict synthetic accessibility of a single SMILES

.. code-block:: python

    from deepsa import predict_sa

    # Predict a single SMILES
    result = predict_sa("CCO")  # Ethanol
    print(f"Synthetic accessibility score: {result['SA_score']:.4f}")
    print(f"Heavy atom count: {result['HA_num']}")
    print(f"Ring count: {result['Ring_num']}")
    print(f"Ring system count: {result['RingSystem_num']}")
    print(f"Rule of five compliance: {result['rule_of_five']}")

2. Predict synthetic accessibility of multiple SMILES

.. code-block:: python

    import pandas as pd
    from deepsa import predict_sa_from_file

    # Create DataFrame containing SMILES
    smiles_list = ["CCO", "c1ccccc1", "CC(=O)OC1=CC=CC=C1C(=O)O"]
    df = pd.DataFrame({"smiles": smiles_list})

    # Predict and save results
    results = predict_sa_from_file(df, output_path="results.csv")
    print(results[["smiles", "easy", "hard"]])
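
Once the results have been written to ``results.csv``, they can be filtered with the standard library alone. The following is a minimal sketch, assuming the saved file contains the ``smiles`` and ``hard`` columns printed above and that ``hard`` is a probability-like value between 0 and 1. The helper name ``select_easy_to_synthesize`` and the default cutoff of 0.5 are hypothetical and not part of DeepSA:

.. code-block:: python

    import csv


    def select_easy_to_synthesize(results_path, max_hard=0.5):
        """Return (smiles, hard) pairs whose ``hard`` value is at most max_hard.

        Assumes the CSV has ``smiles`` and ``hard`` columns (an assumption about
        the saved file layout). Pairs are sorted from lowest to highest ``hard``.
        """
        selected = []
        with open(results_path, newline="") as handle:
            for row in csv.DictReader(handle):
                hard = float(row["hard"])
                if hard <= max_hard:
                    selected.append((row["smiles"], hard))
        # Lowest "hard" value first
        return sorted(selected, key=lambda pair: pair[1])

For example, ``select_easy_to_synthesize("results.csv", max_hard=0.3)`` keeps only the rows whose ``hard`` value is 0.3 or lower.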

3. Predict from CSV file

.. code-block:: python

    from deepsa import predict_sa_from_file

    # Predict from a CSV file (the file must contain a "smiles" column)
    results = predict_sa_from_file("compounds.csv")
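
If your SMILES strings are held in a Python list, a CSV file in the expected layout can be prepared with the standard library. This is a minimal sketch: the helper name ``write_smiles_csv`` is hypothetical, and the only assumption taken from the example above is that the file needs a ``smiles`` column:

.. code-block:: python

    import csv


    def write_smiles_csv(path, smiles_list):
        """Write SMILES strings to a one-column CSV with a ``smiles`` header.

        Blank entries are skipped and duplicates are dropped, keeping the
        first-seen order. Returns the number of data rows written.
        """
        seen = set()
        rows = []
        for smiles in smiles_list:
            smiles = smiles.strip()
            if smiles and smiles not in seen:
                seen.add(smiles)
                rows.append([smiles])
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["smiles"])  # header row
            writer.writerows(rows)
        return len(rows)

A file written with ``write_smiles_csv("compounds.csv", smiles_list)`` can then be passed to ``predict_sa_from_file`` as shown above.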

4. Command line usage

.. code-block:: bash

    # Predict a single SMILES
    deepsa-predict "CCO"

    # Predict SMILES from a CSV file
    deepsa-predict compounds.csv --output results.csv

Usage For Researchers
---------------------
To train your own model, run the training script from the command line:

.. code-block:: bash

    python DeepSA_training.py <dataset.csv/training.csv:test.csv> DeepSA_model ./data/test_set.list

To use the model we proposed, run the prediction script:

.. code-block:: bash

    python DeepSA_predict.py <input_data.csv> DeepSA_model

Online Server
-------------

We have deployed a pre-trained model on a dedicated server, publicly available at https://bailab.siais.shanghaitech.edu.cn/deepsa, to make it easy for biomedical researchers to use DeepSA in their work.

Users can upload SMILES strings or CSV files to the server and quickly obtain the predicted results.

Citation
--------
If you find this repository useful in your research, please consider citing our paper: 

Wang, S., Wang, L., Li, F. et al. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 15, 103 (2023). https://doi.org/10.1186/s13321-023-00771-3

Contact
-------
If you have any questions, please feel free to contact Shihang Wang (Email: wangshh12022@shanghaitech.edu.cn) or Lin Wang (Email: wanglin3@shanghaitech.edu.cn). 

Pull requests are very welcome!

Acknowledgments
---------------
We are grateful for the support from the HPC Platform of ShanghaiTech University.

Thank you all for your attention to this work.
