pyjstat is a Python module for manipulating JSON-stat formatted data.
This module allows reading and writing the JSON-stat [1] format with Python, using the data frame structures provided by the widely used pandas library [2]. JSON-stat is a simple, lightweight JSON format for data dissemination. pyjstat is inspired by rjstat [3], a library by ajschumacher for reading and writing JSON-stat with R.
pyjstat is written and maintained by Miguel Expósito Martín and is distributed under the Apache 2.0 License (see LICENSE file).
| [1] | http://json-stat.org/ for JSON-stat information |
| [2] | http://pandas.pydata.org for Python Data Analysis Library information |
| [3] | https://github.com/ajschumacher/rjstat for rjstat library information |
Example
Importing a JSON-stat file into a pandas data frame can be done as follows:
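    import json
    # Python 3; on Python 2 use urllib2.urlopen instead
    from urllib.request import urlopen

    import pyjstat

    # Download the sample dataset and decode it into pandas data frames
    results = pyjstat.from_json_stat(json.load(urlopen(
        'http://json-stat.org/samples/oecd-canada.json')))
    print(results)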
Custom JSON encoder class for Numpy data types.
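As a rough illustration of the technique, a custom encoder can subclass the standard library's `json.JSONEncoder` and override `default()` to convert unsupported objects into built-in types. The class name `ArrayLikeEncoder` and the duck-typed `tolist()` check are assumptions made for this sketch, not pyjstat's actual code. NumPy scalars and arrays both expose `tolist()`, so a stand-in class is used here to keep the example free of third-party imports.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Hypothetical encoder: serializes objects that expose tolist()."""

    def default(self, obj):
        # NumPy scalars and arrays provide tolist(), which returns
        # built-in Python numbers or (nested) lists of them.
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Anything else is left to the base class, which raises TypeError.
        return super().default(obj)


class FakeArray:
    """Stand-in for a NumPy array, used only in this sketch."""

    def __init__(self, data):
        self._data = list(data)

    def tolist(self):
        return list(self._data)


encoded = json.dumps({"value": FakeArray([1, 2.5, None])}, cls=ArrayLikeEncoder)
print(encoded)
```

The encoder is passed to `json.dumps` through the `cls` argument, so ordinary dictionaries and lists are still handled by the default machinery.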
Check and validate input params.
| Parameters: | naming (string) – a string containing the naming type (label or id). |
|---|---|
| Returns: | Nothing |
| Raises: | ValueError – if the parameter is not in the allowed list. |
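A check of this kind could look like the sketch below. The function name `check_naming` and the constant `ALLOWED_NAMING` are hypothetical; only the allowed values (label or id) and the `ValueError` come from the description above.

```python
ALLOWED_NAMING = ("label", "id")  # assumed from the parameter description


def check_naming(naming):
    """Raise ValueError unless naming is one of the allowed values."""
    if naming not in ALLOWED_NAMING:
        raise ValueError(
            "naming must be one of {}, got {!r}".format(ALLOWED_NAMING, naming)
        )


check_naming("label")  # valid input: returns None without raising
```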
Decode JSON-stat formatted data into pandas.DataFrame object.
| Parameters: | |
|---|---|
| Returns: | results – list of pandas.DataFrame with imported data. |
| Return type: | list |
Decode JSON-stat dict into pandas.DataFrame object. Helper method that should be called inside from_json_stat().
| Parameters: | |
|---|---|
| Returns: | output – pandas.DataFrame with converted data. |
| Return type: | DataFrame |
Generate row dimension values for a pandas dataframe.
| Parameters: | |
|---|---|
| Yields: | list – list with pandas dataframe column values except for value column |
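The idea of generating the dimension part of each row can be sketched in plain Python. JSON-stat stores values in row-major order, with the last dimension varying fastest, which is the same order `itertools.product` produces. The function name `iter_dimension_rows` and the sample categories and values are made up for illustration and are not taken from pyjstat.

```python
from itertools import product


def iter_dimension_rows(dimensions):
    """Yield one list of category values per observation.

    `dimensions` is a list of category lists, one per dimension, in the
    order in which the dataset declares its dimensions.
    """
    # product() varies the last dimension fastest, matching the
    # row-major ordering of a JSON-stat value array.
    for combination in product(*dimensions):
        yield list(combination)


# Illustrative data: two dimensions with two categories each.
dimensions = [["A", "B"], ["2012", "2013"]]
values = [1.0, 2.0, 3.0, 4.0]

# Append the value column to each generated row.
rows = [row + [value]
        for row, value in zip(iter_dimension_rows(dimensions), values)]
```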
Get index from a given dimension.
| Parameters: | |
|---|---|
| Returns: | dim_index – DataFrame with index-based dimension data. |
| Return type: | pandas.DataFrame |
Get label from a given dimension.
| Parameters: | |
|---|---|
| Returns: | dim_label – DataFrame with label-based dimension data. |
| Return type: | pandas.DataFrame |
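The index and label lookups described above both read the `category` object of a JSON-stat dimension, where `index` may be an array of ids or an object mapping ids to positions, and `label` maps ids to display names. The sketch below returns plain `(id, label)` tuples instead of a DataFrame to stay dependency-free. The function name `category_labels` is hypothetical, and the fallback to the id when no label exists is an assumption of this sketch.

```python
def category_labels(dimension):
    """Return (id, label) pairs for a dimension, ordered by position."""
    category = dimension["category"]
    labels = category.get("label", {})
    index = category.get("index")
    if index is None:
        # Single-category dimensions may omit "index".
        ids = list(labels)
    elif isinstance(index, dict):
        # Object form: each category id maps to its position.
        ids = sorted(index, key=index.get)
    else:
        # Array form: ids are already listed in order.
        ids = list(index)
    # Fall back to the id itself when no label is provided.
    return [(cat_id, labels.get(cat_id, cat_id)) for cat_id in ids]


dimension = {
    "category": {
        "index": {"ES": 1, "CA": 0},
        "label": {"CA": "Canada", "ES": "Spain"},
    }
}
pairs = category_labels(dimension)
```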
Get dimensions from input data.
| Parameters: | |
|---|---|
| Returns: | dimensions – list of pandas data frames with dimension category data; dim_names – list of strings with dimension names. |
| Return type: | list |
Get values from input data.
| Parameters: | |
|---|---|
| Returns: | values – list of dataset values. |
| Return type: | list |
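JSON-stat allows the `value` member to be either an array or, for sparse data, an object keyed by string positions. One way to normalize both forms into a list is sketched below. The helper name `dense_values` and its `size` argument are assumptions for this example and may differ from how pyjstat handles the two forms.

```python
def dense_values(value, size):
    """Return a list of `size` values, using None for missing cells."""
    if isinstance(value, dict):
        # Sparse form: keys are string positions, e.g. {"0": 1.5, "3": 2.0}.
        return [value.get(str(i)) for i in range(size)]
    # Array form: copy it into a plain list.
    return list(value)


sparse = dense_values({"0": 1.5, "3": 2.0}, 4)
dense = dense_values([1, None, 3], 3)
```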
Convert string to unicode depending on the Python version.
| Parameters: | text (string) – a string. |
|---|---|
| Returns: | text – a UTF-8 encoded string. |
| Return type: | string |
Remove trailing zeroes from float values if they can be represented as integers.
| Parameters: | value (object) – a Python object (hopefully a number). |
|---|---|
| Returns: | value – an integer or the same input object, depending on the content of value. |
| Return type: | int, object |
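A minimal sketch of this behaviour, assuming that only `float` inputs are converted and everything else is returned unchanged. The name `drop_trailing_zeroes` is hypothetical.

```python
def drop_trailing_zeroes(value):
    """Return int(value) for whole-number floats, else value unchanged."""
    # float.is_integer() is False for NaN and infinities, so those
    # pass through untouched.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


whole = drop_trailing_zeroes(3.0)       # becomes the integer 3
fraction = drop_trailing_zeroes(2.5)    # stays a float
text = drop_trailing_zeroes("n/a")      # non-numbers are returned as-is
```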
Convert variable to integer or string depending on the case.
| Parameters: | variable (string) – a string containing a real string or an integer. |
|---|---|
| Returns: | variable – an integer or a string, depending on the content of variable. |
| Return type: | int, string |
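The conversion can be sketched with a `try`/`except` around `int()`: strings that parse as integers are converted, and anything else is returned as given. The name `int_or_string` is hypothetical.

```python
def int_or_string(variable):
    """Return variable as an int when possible, else the original string."""
    try:
        return int(variable)
    except ValueError:
        # Not an integer literal (e.g. "Q1"), so keep the string.
        return variable


year = int_or_string("2013")
quarter = int_or_string("Q1")
```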
| Parameters: | |
|---|---|
| Returns: | output – String with JSON-stat object. |
| Return type: | string |
Return unique values in a list in the original order. See: http://www.peterbe.com/plog/uniqifiers-benchmark
| Parameters: | seq (list) – original list. |
|---|---|
| Returns: | list without duplicates preserving original order. |
| Return type: | list |
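One common way to do this is to track the items already seen in a set while building the result list. The sketch below uses the hypothetical name `unique_in_order` and assumes the items are hashable.

```python
def unique_in_order(seq):
    """Return the items of seq without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


unique = unique_in_order(["b", "a", "b", "c", "a"])
```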