Metadata-Version: 2.1
Name: codebooks
Version: 0.0.3
Summary: Automatic generation of codebooks from dataframes.
Home-page: https://github.com/mhowison/codebooks
Author: Mark Howison
Author-email: mark@howison.org
License: BSD
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: BSD License
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: htmlmin
Requires-Dist: pandas

# Codebooks

Automatically generate codebooks from dataframes. Includes methods to:
* Infer variable type (as unique key, indicator, categorical, or continuous).
* Summarize values with histograms and KDEs.
* Generate a self-contained HTML report (may be extended to PDF or other formats in the future).
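
As a rough illustration of the first item, the sketch below classifies a single column into the four types listed above. It is not the package's actual algorithm: the function name `infer_variable_type`, the `max_categories` threshold, and the decision rules are all assumptions made for this example, and a column is represented as a plain list (with `None` for missing values) to keep the sketch dependency-free.

```python
from numbers import Real


def infer_variable_type(values, max_categories=20):
    """Guess a column's type from its values (hypothetical heuristic).

    `values` is a list; None marks a missing value. The threshold
    `max_categories` is an illustrative assumption, not a package default.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-missing values")

    distinct = set(present)

    # Exactly two distinct values: treat as an indicator (e.g. 0/1, yes/no).
    if len(distinct) == 2:
        return "indicator"

    # Complete, discrete (int or str), and never repeated: a unique key.
    no_missing = len(present) == len(values)
    all_discrete = all(isinstance(v, (int, str)) for v in present)
    if no_missing and all_discrete and len(distinct) == len(present):
        return "unique key"

    # Few distinct values: categorical, whatever the underlying type.
    if len(distinct) <= max_categories:
        return "categorical"

    # Many distinct numeric values: continuous.
    if all(isinstance(v, Real) for v in present):
        return "continuous"

    # Many distinct non-numeric values with repeats or gaps.
    return "categorical"
```

For example, `infer_variable_type([0, 1, 1, 0])` returns `"indicator"`, while a list of 100 distinct floats returns `"continuous"`.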

Usage:

    codebooks -o output.html input.csv

## Adding variable descriptions

You can supply a CSV file that maps variable names to descriptions with the `--desc` option:

    codebooks --desc descriptions.csv -o output.html input.csv

The CSV file is expected to have two columns: variable name and description.
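
A minimal sketch of producing such a file with Python's standard `csv` module is shown below. The helper name `format_descriptions` and the variable names `age` and `income` are hypothetical. This README does not say whether a header row is expected, so the sketch writes none; treat that as an assumption and check it against your installed version.

```python
import csv
import io


def format_descriptions(descriptions):
    """Render a {variable name: description} dict as two-column CSV text.

    Assumption: no header row is written, because the expected layout is
    only documented as (variable name, description).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name, text in descriptions.items():
        # The csv module quotes fields that contain commas or quotes.
        writer.writerow([name, text])
    return buffer.getvalue()


# Hypothetical variables, for illustration only.
example = format_descriptions({
    "age": "Age in years",
    "income": "Annual income, in USD",
})
```

Writing `example` to `descriptions.csv` gives a file that can be passed to `--desc` as in the command above.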

## License

3-Clause BSD (see LICENSE)

## Tests

The `test/` subdirectory contains a script to generate a synthetic data set, an integration test for the codebooks package, and a benchmark script used to test performance optimizations. You can run these with:

    cd test
    python dataset.py
    codebooks dataset.csv
    python benchmark.py

## Authors

Mark Howison  
[http://mark.howison.org](http://mark.howison.org)
