Metadata-Version: 2.1
Name: databuster
Version: 0.1.0
Summary: A comprehensive chemical compound analysis platform for drug discovery research
Author: DataBuster Team
Author-email: team@databuster.org
License: MIT License
        
        Copyright (c) 2024 DataBuster Team
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/username/databuster
Project-URL: Documentation, https://databuster.readthedocs.io/
Project-URL: Repository, https://github.com/username/databuster.git
Keywords: chemistry,drug-discovery,molecular-analysis,chemical-compounds,cheminformatics,databuster
Platform: unix
Platform: linux
Platform: osx
Platform: win32
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: build>=1.2.2.post1
Requires-Dist: cairosvg>=2.7.1
Requires-Dist: chembl-webresource-client>=0.10.9
Requires-Dist: deepchem>=2.8.0
Requires-Dist: molvs>=0.1.1
Requires-Dist: numpy<2
Requires-Dist: pandas>=2.0.0
Requires-Dist: plotly>=5.24.1
Requires-Dist: py3dmol>=2.4.0
Requires-Dist: rdkit>=2.2.3
Requires-Dist: scipy>=1.14.1
Requires-Dist: setuptools>=61.0
Requires-Dist: streamlit>=1.39.0
Requires-Dist: tdc>=0.1.0
Requires-Dist: twine>=6.0.1
Requires-Dist: wheel>=0.45.1
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"

# DataBuster

A chemical compound analysis platform for drug discovery research, built on computational chemistry tools for processing molecular data.

## Overview

DataBuster analyzes chemical compounds for drug discovery research. It pairs a web interface with cheminformatics tooling, and also offers a command-line interface for batch processing and automation.

## Features

- Structure analysis
- Molecular descriptor calculation
- Duplicate detection
- Chirality analysis
- Salt detection
- Structure standardization
- Batch processing
- Command-line interface

## Installation

```bash
pip install databuster
```

## Usage

### Web Interface

Start the web interface:

```bash
streamlit run main.py
```

### Command Line Interface (CLI)

The command-line interface supports batch processing and automation.

#### Basic Usage

1. Analyze compounds with all features:
```bash
python cli.py analyze input.csv --smiles-column "SMILES"
```

2. Run specific analysis types:
```bash
python cli.py analyze input.csv --smiles-column "SMILES" --analysis-types "Duplicate Detection" "Molecular Descriptors"
```

3. Export results to a custom location:
```bash
python cli.py analyze input.csv --smiles-column "SMILES" --output results.csv
```

4. Process with a specific batch size:
```bash
python cli.py analyze input.csv --smiles-column "SMILES" --batch-size 1000
```
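
To make the input format and the effect of `--batch-size` concrete, here is a minimal Python sketch. It is not DataBuster's implementation: it assumes the input is a CSV with a header row that includes the column named by `--smiles-column`, and that batching simply splits the rows into consecutive groups. The `Name` column, the sample compounds, and the helpers `read_smiles` and `iter_batches` are hypothetical names used only for illustration.

```python
import csv
import io
from itertools import islice

# Illustrative input only: the "Name" column and these compounds are assumptions.
SAMPLE_CSV = """Name,SMILES
ethanol,CCO
benzene,c1ccccc1
acetic acid,CC(=O)O
"""


def read_smiles(text, smiles_column="SMILES"):
    """Return the values of the SMILES column from CSV text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    if smiles_column not in (reader.fieldnames or []):
        raise KeyError(f"column {smiles_column!r} not found in CSV header")
    return [row[smiles_column] for row in reader]


def iter_batches(items, batch_size):
    """Yield consecutive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


smiles = read_smiles(SAMPLE_CSV)
batches = list(iter_batches(smiles, batch_size=2))
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {batch}")
```

With three compounds and a batch size of 2, the sketch produces one batch of two SMILES strings followed by one batch of one. How DataBuster itself handles the final partial batch is not stated here.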

#### CLI Options

```
usage: cli.py analyze [-h] --smiles-column SMILES_COLUMN
                     [--analysis-types ANALYSIS_TYPES [ANALYSIS_TYPES ...]]
                     [--output OUTPUT] [--batch-size BATCH_SIZE]
                     [--log-level {DEBUG,INFO,WARNING,ERROR}]
                     input_file

positional arguments:
  input_file            Input CSV file containing compound data

options:
  -h, --help           show this help message and exit
  --smiles-column SMILES_COLUMN
                      Name of the column containing SMILES strings
  --analysis-types ANALYSIS_TYPES [ANALYSIS_TYPES ...]
                      Types of analysis to perform
  --output OUTPUT      Output file path for analysis results
  --batch-size BATCH_SIZE
                      Number of compounds to process in each batch
  --log-level {DEBUG,INFO,WARNING,ERROR}
                      Set the logging level
```
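
For scripting against this interface, it can help to see how the options map to parsed values. The following `argparse` sketch mirrors the help text above; it is a reconstruction from that text, not the contents of `cli.py`. The integer type for `--batch-size` and the absence of default values are assumptions, since the help output does not state them.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented `analyze` help text."""
    parser = argparse.ArgumentParser(prog="cli.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    analyze = subparsers.add_parser("analyze")
    analyze.add_argument(
        "input_file", help="Input CSV file containing compound data"
    )
    analyze.add_argument(
        "--smiles-column",
        required=True,
        help="Name of the column containing SMILES strings",
    )
    analyze.add_argument(
        "--analysis-types", nargs="+", help="Types of analysis to perform"
    )
    analyze.add_argument("--output", help="Output file path for analysis results")
    # Assumption: the batch size is parsed as an integer.
    analyze.add_argument(
        "--batch-size",
        type=int,
        help="Number of compounds to process in each batch",
    )
    analyze.add_argument(
        "--log-level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="Set the logging level",
    )
    return parser


args = build_parser().parse_args(
    [
        "analyze",
        "input.csv",
        "--smiles-column",
        "SMILES",
        "--analysis-types",
        "Duplicate Detection",
        "Molecular Descriptors",
        "--batch-size",
        "1000",
    ]
)
print(args)
```

Because `--analysis-types` accepts one or more values, multi-word names such as `"Duplicate Detection"` must be quoted on the shell so that each arrives as a single argument.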

## Development

### Setup Development Environment

1. Clone the repository:
```bash
git clone https://github.com/username/databuster.git
cd databuster
```

2. Install development dependencies:
```bash
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest tests/
```

## License

This project is licensed under the MIT License; see the LICENSE file for details.
