Metadata-Version: 2.1
Name: ArrowTextClassifier
Version: 1.0.0
Summary: ArrowTextClassifier is a simple text classification tool written in PyTorch that lets you train, summarize, and use text classification models for various tasks.
Home-page: https://github.com/Bhargav230m/ArrowTextClassifier.git
Author: techpowerb
Author-email: technologypower24@gmail.com
Keywords: text classification,natural language processing,NLP,PyTorch,machine learning,deep learning,text summarization,preprocessing,data science,artificial intelligence,dataset,discord
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch==2.2.2
Requires-Dist: torchsummary==1.4.5
Requires-Dist: pandas==2.2.2
Requires-Dist: scikit-learn==1.4.2
Requires-Dist: tqdm==4.66.2
Requires-Dist: numpy==1.26.4

# ArrowTextClassifier

ArrowTextClassifier is a Python package for text classification. It provides functionality to train, summarize, and classify text with models based on a convolutional neural network (CNN) architecture.

## Installation

You can install ArrowTextClassifier via pip:

```bash
pip install ArrowTextClassifier
```

## How it Works

ArrowTextClassifier implements a convolutional neural network (CNN) architecture for text classification. It tokenizes input text, embeds the tokens, applies convolutional filters over the embedded tokens to extract features, and then classifies the text into predefined categories.
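
The pipeline described above can be sketched in a few lines of PyTorch. This is only an illustration of the general technique (tokenize, embed, convolve, pool, classify), not the package's actual implementation: the `tokenize` and `encode` helpers, the `TextCNN` class, the toy vocabulary, and all hyperparameters are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the package's own)."""
    return text.lower().split()


def encode(text, vocab, max_len=8):
    """Map tokens to ids, then truncate or pad with <pad> to a fixed length."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


class TextCNN(nn.Module):
    """Hypothetical minimal CNN text classifier for illustration only."""

    def __init__(self, vocab_size, embed_dim, num_classes,
                 kernel_sizes=(2, 3), num_filters=16):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # One 1D convolution per kernel size, sliding over the token axis.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)           # (batch, embed_dim, seq_len)
        # Max-over-time pooling keeps the strongest response of each filter.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # (batch, num_classes)


# Toy vocabulary and labels, made up for this sketch.
vocab = {"<pad>": 0, "<unk>": 1, "hey": 2, "there!": 3, "you": 4, "suck!": 5}
labels = ["normal", "toxic"]

model = TextCNN(vocab_size=len(vocab), embed_dim=8, num_classes=len(labels))
batch = torch.tensor([encode("Hey there!", vocab), encode("You suck!", vocab)])
logits = model(batch)  # one score per label for each input text
print(logits.shape)    # torch.Size([2, 2])
```

The network here is untrained, so the scores are meaningless until the weights are fitted on labeled data; the sketch only shows how the tensor shapes flow from token ids to per-class scores.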

## Usage

### Training

To train a text classification model, you can utilize the `train_model` method provided by the `Model` class:

```python
from ArrowTextClassifier import Model

model = Model(name="your_model_name")
# `dataset` is your training data in the Parquet format described below.
model.train_model(dataset)
```

#### How to make a dataset

To create your own custom dataset for training, you need a Parquet file in which every record has a `label` field and an `example` field:

*Example Parquet file contents (records shown as JSON for readability)*

```json
{"label":"normal","example":"Hey there!"}
{"label":"normal","example":"Hi!"}
{"label":"toxic","example":"You suck!"}
```

After you have created the Parquet file with the data in the format above, you can provide it as the dataset to start training the model.
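
One way to produce such a file is with pandas, which the package already depends on. The sketch below is a suggestion rather than an official workflow: the helper names `build_dataset` and `save_dataset` and the output file name are hypothetical, and writing Parquet with pandas additionally assumes a Parquet engine such as `pyarrow` or `fastparquet` is installed.

```python
import pandas as pd

REQUIRED_COLUMNS = ["label", "example"]


def build_dataset(rows):
    """Build a DataFrame with exactly the `label` and `example` columns."""
    df = pd.DataFrame(rows, columns=REQUIRED_COLUMNS)
    # A missing key in any row shows up as a null value; reject it early.
    if df.isnull().any().any():
        raise ValueError("every row needs both a label and an example")
    return df


def save_dataset(df, path):
    """Write the DataFrame to a Parquet file (needs pyarrow or fastparquet)."""
    df.to_parquet(path, index=False)


rows = [
    {"label": "normal", "example": "Hey there!"},
    {"label": "normal", "example": "Hi!"},
    {"label": "toxic", "example": "You suck!"},
]

df = build_dataset(rows)
print(list(df.columns), len(df))

# Uncomment to write the file; the file name is only a placeholder.
# save_dataset(df, "dataset.parquet")
```

Checking the label distribution with `df["label"].value_counts()` before training is a cheap way to spot a heavily imbalanced dataset.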

### Summarization

To summarize a trained model, you can use the `summarize` method:

```python
model.summarize(
    model_path="path_to_your_model",
    hyperparams_path="path_to_hyperparameters_file",
    vocabulary_path="path_to_vocabulary_file",
    modelSummary_write_path="path_to_write_model_summary"
)
```

### Classification

To classify text with a trained model, use the `classify` method:

```python
result = model.classify(
    model_path="path_to_your_model",
    hyperparams_path="path_to_hyperparameters_file",
    text="your_input_text",
    vocabulary_path="path_to_vocabulary_file"
)
print(result)
```

## Getting Started

This package provides tools for text classification tasks. You can explore and customize it according to your requirements. Refer to the documentation for detailed usage instructions.

## License

This project is licensed under the MIT License; see the LICENSE file for details.

---

## Contact

For any questions or feedback, please email technologypower24@gmail.com or contact me on Discord (techpowerb).
