Metadata-Version: 2.1
Name: data-check
Version: 0.1.0
Summary: simple data validation
Home-page: https://github.com/andrjas/data_check
Author: Andreas Rjasanow
Author-email: andrjas@gmail.com
License: MIT
Keywords: data validation testing quality
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Other Audience
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Database
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: sqlalchemy (<1.4,>=1.3.22)
Requires-Dist: pandas (<1.2,>=1.1.5)
Requires-Dist: pyyaml (<5.4,>=5.3.1)
Requires-Dist: numpy (==1.19.3)
Requires-Dist: click (<7.2,>=7.1.2)
Provides-Extra: mssql
Requires-Dist: pyodbc (<4.1,>=4.0.30) ; extra == 'mssql'
Provides-Extra: mysql
Requires-Dist: pymysql[rsa] (<0.11,>=0.10.1) ; extra == 'mysql'
Provides-Extra: oracle
Requires-Dist: cx-Oracle (<8.2,>=8.1.0) ; extra == 'oracle'
Provides-Extra: postgres
Requires-Dist: psycopg2-binary (<2.9,>=2.8.6) ; extra == 'postgres'

# data_check

data_check is a simple data validation tool. Write SQL queries and CSV files with the expected result sets, and data_check will run each query and compare its result with the matching CSV file.
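
The idea of comparing a query result with an expected CSV file can be sketched in a few lines of Python. This is only an illustration of the concept, not data_check's actual implementation: the `check_query` function, the `users` table and the in-memory SQLite database are hypothetical, and the choices to ignore row order and to compare all values as strings are assumptions made for this sketch.

```python
import csv
import io
import sqlite3


def check_query(connection, sql, expected_csv):
    """Return True if the rows returned by `sql` match the rows in `expected_csv`."""
    cursor = connection.execute(sql)
    header = [column[0] for column in cursor.description]
    # CSV has no types, so compare every value as a string (assumption of this sketch).
    actual = sorted(tuple(str(value) for value in row) for row in cursor.fetchall())

    reader = csv.reader(io.StringIO(expected_csv))
    expected_header = next(reader)
    # Sorting both sides means row order is ignored (assumption of this sketch).
    expected = sorted(tuple(row) for row in reader)

    return header == expected_header and actual == expected


# Hypothetical in-memory database standing in for a real connection.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER, name TEXT)")
connection.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

sql = "SELECT id, name FROM users"           # would live in some_test.sql
expected_csv = "id,name\n1,alice\n2,bob\n"   # would live in some_test.csv

passed = check_query(connection, sql, expected_csv)
print("PASSED" if passed else "FAILED")
```

In a real project the query and the expected result live in separate `.sql` and `.csv` files, and data_check handles the database connection for you.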

data_check should work with any database supported by [SQLAlchemy](https://docs.sqlalchemy.org/en/13/dialects/). It is currently tested against PostgreSQL, MySQL, SQLite, Oracle and Microsoft SQL Server.

## Quickstart

You need Python 3.6 or above to run data_check. The easiest way to install data_check is via [pipx](https://github.com/pipxproject/pipx):

`pipx install data_check`

The data_check Git repository is also a sample data_check project. Clone the repository, switch to the folder and run data_check:

```
git clone git@github.com:andrjas/data_check.git
cd data_check
data_check
```

This runs the tests in the _checks_ folder using the default connection set in data_check.yml.

See the [documentation](https://andrjas.github.com/data_check) for how to install data_check in different environments with additional database drivers, and for other ways to use data_check.

## Project layout

data_check has a simple layout for projects: a single configuration file and a folder with the test files. You can also organize the test files in subfolders.

    data_check.yml    # The configuration file
    checks/           # Default folder for data tests
        some_test.sql # SQL file with the query to run against the database
        some_test.csv # CSV file with the expected result
        subfolder/    # Tests can be nested in subfolders
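
To make the layout above concrete, here is a minimal sketch of how matching SQL and CSV files could be collected from the checks folder, including subfolders. The `find_checks` helper and the `nested_test` file name are hypothetical and only illustrate the pairing by file name; data_check's own discovery logic may differ.

```python
from pathlib import Path
import tempfile


def find_checks(folder):
    """Yield (sql_file, csv_file) pairs found anywhere below `folder`."""
    for sql_file in sorted(Path(folder).rglob("*.sql")):
        # A test is an SQL file with a CSV file of the same name next to it.
        csv_file = sql_file.with_suffix(".csv")
        if csv_file.exists():
            yield sql_file, csv_file


# Build a throwaway project matching the layout above to try it out.
with tempfile.TemporaryDirectory() as project:
    checks = Path(project) / "checks"
    (checks / "subfolder").mkdir(parents=True)
    for name in ("some_test", "subfolder/nested_test"):
        (checks / f"{name}.sql").write_text("SELECT 1 AS a\n")
        (checks / f"{name}.csv").write_text("a\n1\n")

    # Collect the discovered tests as paths relative to the checks folder.
    found = [sql.relative_to(checks).as_posix() for sql, _ in find_checks(checks)]
    print(found)
```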

## Documentation

See the [documentation](https://andrjas.github.com/data_check) for how to set up data_check, how to create a new project, and more options.


