Metadata-Version: 2.1
Name: data-toolkit
Version: 0.6.5
Summary: ML & data helper code!
Home-page: https://github.com/johndoe/myapp/
Author: Jakub Langr
Author-email: james.langr@gmail.com
License: (c) Jakub Langr
Platform: UNKNOWN
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: python-Levenshtein
Requires-Dist: github3.py
Requires-Dist: cement
Requires-Dist: humanize
Requires-Dist: jinja2
Requires-Dist: pyyaml
Requires-Dist: colorlog
Requires-Dist: gputil

# ML & data helper code!
Jakub Langr (c) 2021

This is a CLI utility for speeding up basic data engineering and data science tasks.

## Installation

```
$ pip install data-toolkit
```

## Development

This project includes a number of helpers in the `Makefile` to streamline common development tasks.

### Environment Setup

The following demonstrates setting up and working with a development environment:

```
### create a virtualenv for development

$ make virtualenv

$ source env/bin/activate


### run dt cli application

$ dt --help


### run pytest / coverage

$ make test
```


## TODO: Central Data Repository

As a PM, I want to be able to quickly look at all the data we have on S3 with sample images.

* Needs a DB
* Needs a way of sampling
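
One possible approach to the sampling requirement is reservoir sampling, which draws a fixed-size uniform sample from a listing of unknown length in a single pass, without holding every key in memory. The sketch below is only an illustration under assumptions: `reservoir_sample` is a hypothetical helper that is not part of `dt`, and the generated key names stand in for a real S3 listing.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Hypothetical helper for illustration; not part of the dt CLI.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Made-up object keys standing in for an S3 listing (assumption).
keys = (f"images/batch-{n // 100}/img-{n:05d}.jpg" for n in range(1000))
picked = reservoir_sample(keys, k=5, seed=0)
```

Because the helper accepts any iterable, it could be fed a paginated bucket listing directly; passing a `seed` keeps the sample reproducible between runs.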

