Metadata-Version: 1.1
Name: pyin
Version: 0.4
Summary: basic streaming text processing
Home-page: https://github.com/geowurster/pyin
Author: Kevin Wurster
Author-email: wursterk@gmail.com
License: New BSD License

Copyright (c) 2015, Kevin D. Wurster
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* The names of its contributors may not be used to endorse or promote products
  derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Description: pyin
        ====
        
        [![Build Status](https://travis-ci.org/geowurster/pyin.svg?branch=master)](https://travis-ci.org/geowurster/pyin) [![Coverage Status](https://coveralls.io/repos/geowurster/pyin/badge.svg?branch=master)](https://coveralls.io/r/geowurster/pyin?branch=master)
        
        Perform Python operations on every line read from `stdin`.  Every line is
        evaluated individually and available via a variable called `line`.
        
        
        Installing
        ----------
        
        Via pip:
        
            $ pip install git+https://github.com/geowurster/pyin.git
        
        From master branch:
            
            $ git clone https://github.com/geowurster/pyin
            $ pip install -e .
        
        
        Examples
        --------
        
        Change newline character in a CSV.
        
            $ more sample-data/csv-with-header.csv | pyin "line.replace('\n', '\r\n')" > output.csv
        
        Extract a BigQuery schema from an existing table and pretty print it:
        
        ```console
        $ bq show --format=json ${DATASET}.${TABLE} | pyin -m json -m pprint "pprint.pformat(json.loads(line)['schema']['fields'])"
        [{u'mode': u'NULLABLE', u'name': u'mmsi', u'type': u'STRING'},
        {u'mode': u'NULLABLE', u'name': u'longitude', u'type': u'FLOAT'},
        {u'mode': u'NULLABLE', u'name': u'latitude', u'type': u'FLOAT'}
        ...]
        ```
        
        Read the first 100K lines of a CSV and write the 
        
        head -100000 ${INFILE} | pyin -r csv.DictReader -m csv "line['Msg type'] == '5'" -n -t -l '' -w newlinejson.Writer -m newlinejson -wm writerow > ~/github/VesselInfo/Data/100K-Sample-Type5.json
        
        
        
        
        Gotchas
        -------
        
        It's easy to completely modify the line content:
        
            $ pyin -i sample-data/csv-with-header.csv "'operation'"
            operationoperationoperationoperationoperationoperation
        
        Forgetting to use `-t` to only get lines that evaluate as `True`:
        
            $ pyin -i LICENSE.txt "'are' in line"
            FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalse
            
            $ pyin -i LICENSE.txt "'are' in line" -t
            modification, are permitted provided that the following conditions are met:
              derived from this software without specific prior written permission.
        
        Specifying JSON:
        
            $ -ro fieldnames='["field1","field2"]'
        
        Get a list of variables available by default to the `operation` argument:
        
            $ cat LICENSE.txt | pyin line -s "print(globals().keys()); exit()"
            ['main', '_str2type', '_STR_TYPES', '__all__', '_os', '__builtins__', '__source__', '__file__', '_click', '_DefaultReader', '_sys', '__package__', '__email__', '__author__', '_PY3', 'pyin', '__name__', '__version__', '__license__', '__doc__', '_DefaultWriter']
        
        The `--reader-option key=val` values are parsed to their Python type but if the user wants to
        specify something like which JSON library to use for a `newlinejson.Reader()`
        instance then they must do that via the `--statement` option:
        
            $ pyin \
                -i measures.json \
                -o ~/dec-type5.json \
                -r newlinejson.Reader \
                -s "newlinejson.core.JSON = ujson" \
                -w newlinejson.Writer \
                -im newlinejson \
                -t "'type' in line and line['type'] is 5" \
                -im ujson
        
        
        
        Developing
        ----------
        
        Install:
        
            $ pip install virtualenv
            $ git clone https://github.com/geowurster/pyin
            $ cd pyin
            $ virtualenv venv
            $ source venv/bin/activate
            $ pip install -r requirements-dev.txt
            $ pip install -e .
        
        Test:
            
            $ nosetests
        
        Coverage:
        
            $ nosetests --with-coverage
        
        Lint:
        
            $ pep8 --max-line-length=120 pyin.py
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Topic :: Text Processing :: Filters
Classifier: Topic :: Text Processing :: General
Classifier: Topic :: Utilities
