Metadata-Version: 2.1
Name: csvwlib
Version: 0.3.2
Summary: Python implementation of CSV on the Web
Home-page: https://github.com/DerwenAI/csvwlib
Author: Aleksander Drozd
Author-email: aleksander.drozd@outlook.com
License: MIT
Project-URL: Bug Tracker, https://github.com/DerwenAI/csvwlib/issues
Project-URL: Source Code, https://github.com/DerwenAI/csvwlib
Keywords: knowledge graph,rdf,controlled vocabulary,csv,tabular data
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Human Machine Interfaces
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: python-dateutil (>=2.6.1)
Requires-Dist: rdflib (>=4.2.2)
Requires-Dist: rdflib-jsonld (>=0.4.0)
Requires-Dist: requests (>=2.20.0)
Requires-Dist: uritemplate (>=3.0.0)
Requires-Dist: language-tags (>=0.4.3)

## About

`csvwlib` is a Python implementation of the W3C 
[CSV on the Web](http://w3c.github.io/csvw/) recommendations.

This enables converting tabular data, and optionally its associated
metadata, to a semantic graph in RDF or JSON-LD format.

Tabular data includes CSV and TSV files, which may originate upstream
from spreadsheets, RDBMS exports, and similar sources.

Requires Python 3.6 or later.


## Installation

```
pip install csvwlib
```


## Usage

The library exposes one class, `CSVWConverter`, which has two methods: `to_json()` and `to_rdf()`.

Both methods have a similar API and accept the following parameters (at least one of `csv_url` and `metadata_url` must be given):

  * `csv_url` - URL of a CSV file; default `None`
  * `metadata_url` - optional URL of a metadata file; default `None`
  * `mode` - conversion mode, either `standard` (the default) or `minimal`

There are three ways of starting the conversion process:

  * pass only `csv_url` - the corresponding metadata will be looked up based on `csv_url`, as described in [Locating Metadata](https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/#locating-metadata)

  * pass both `csv_url` and `metadata_url` - the user-supplied metadata will be used. If the `url` field is set in the metadata, the CSV file will be retrieved from that location, so the passed `csv_url` may be ignored

  * pass only `metadata_url` - the associated CSV files will be retrieved based on the metadata `url` field
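
The default lookup from the Locating Metadata section of the W3C recommendation can be sketched with the standard library alone. The function name `default_metadata_candidates` is hypothetical and is not part of `csvwlib`. The sketch covers only the two default locations (`{+url}-metadata.json` and `csv-metadata.json`) and leaves out `Link` headers and site-wide `/.well-known/csvm` configuration.

```python
from urllib.parse import urldefrag, urljoin


def default_metadata_candidates(csv_url):
    """Return the default metadata URLs to try for a CSV URL, in order.

    Hypothetical helper for illustration; csvwlib may do this differently.
    """
    # The W3C rules apply to the CSV URL with any fragment removed.
    base, _fragment = urldefrag(csv_url)
    return [
        # Template "{+url}-metadata.json": a file-specific metadata document.
        base + "-metadata.json",
        # Template "csv-metadata.json": a directory-wide metadata document,
        # resolved relative to the CSV URL.
        urljoin(base, "csv-metadata.json"),
    ]


for candidate in default_metadata_candidates(
    "http://w3c.github.io/csvw/tests/test001.csv"
):
    print(candidate)
```

A processor would request each candidate in turn and use the first one that exists and describes the CSV file.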


You can also specify the conversion mode, `standard` or `minimal`; the default is `standard`.
From the [W3C documentation](https://www.w3.org/TR/2015/REC-csv2rdf-20151217/):

> **Standard** mode conversion frames the information gleaned from the cells of the tabular data with details of the rows, tables, and a group of tables within which that information is provided.  
> **Minimal** mode conversion includes only the information gleaned from the cells of the tabular data.

Conversion to JSON returns a `dict` object. Conversion to RDF is more involved:
if you pass the `format` parameter, the graph is serialized to that format and returned as a string.
From the `rdflib` docs:

> Format support can be extended with plugins, but "xml", "n3", "turtle", "nt", "pretty-xml", "trix", "trig" and "nquads" are built in.

If you don't specify a format, you receive an `rdflib.Graph` object.
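
Because `to_rdf()` can return either a string or a graph, calling code may want to normalize the result. The helper below is a minimal sketch with a hypothetical name (`ensure_serialized`), not part of `csvwlib`. It assumes that any non-string result is an `rdflib.Graph`, and it accounts for `Graph.serialize()` returning `bytes` on rdflib versions before 6 and `str` from 6 onwards.

```python
def ensure_serialized(result, fmt="turtle"):
    """Return RDF text whether `result` is a string or a graph.

    Hypothetical helper for illustration; not part of csvwlib.
    """
    if isinstance(result, str):
        # A `format` was passed to to_rdf(), so the graph is already text.
        return result
    # Assumption: anything else is an rdflib.Graph.
    serialized = result.serialize(format=fmt)
    # rdflib < 6 returns bytes here; rdflib >= 6 returns str.
    if isinstance(serialized, bytes):
        return serialized.decode("utf-8")
    return serialized
```

For example, `ensure_serialized(CSVWConverter.to_rdf(csv_url))` would yield Turtle text, while a result that was already serialized with `format="nt"` would be passed through unchanged.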


## Examples

Example data and metadata files can be found at
<http://w3c.github.io/csvw/tests/>

Starting with CSV:
```python
from csvwlib import CSVWConverter

CSVWConverter.to_rdf("http://w3c.github.io/csvw/tests/test001.csv", format="ttl")
```

Minimal mode:
```python
from csvwlib import CSVWConverter

CSVWConverter.to_rdf("http://w3c.github.io/csvw/tests/tree-ops.csv", mode="minimal", format="ttl")
```

Starting with metadata only:
```python
from csvwlib import CSVWConverter

CSVWConverter.to_rdf(metadata_url="http://w3c.github.io/csvw/tests/test188-metadata.json", format="ttl")
```

Both CSV and metadata URL specified:
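```python
from csvwlib import CSVWConverter

# The second positional argument is `metadata_url`. The value below repeats the
# CSV URL; substitute the URL of the metadata document that describes the CSV.
CSVWConverter.to_rdf(
    "http://w3c.github.io/csvw/tests/tree-ops.csv",
    "http://w3c.github.io/csvw/tests/tree-ops.csv",
    format="ttl",
)
```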

Starting with metadata (JSON output):
```python
from csvwlib import CSVWConverter

# countries.json is a metadata document, so pass it as `metadata_url`.
CSVWConverter.to_json(metadata_url="http://w3c.github.io/csvw/tests/countries.json")
```

Starting with CSV (JSON output):
```python
from csvwlib import CSVWConverter

CSVWConverter.to_json("http://w3c.github.io/csvw/tests/test001.csv")
```
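
Once converted, the JSON result can be walked like any nested `dict`. The sketch below assumes the result follows the standard-mode layout from the W3C CSV to JSON recommendation (`tables`, then `row`, then `describes`); whether `csvwlib` returns exactly this shape is an assumption here. The helper name `iter_described_objects` is hypothetical, and `sample` is a hand-written stand-in, not real converter output.

```python
def iter_described_objects(converted):
    """Yield (rownum, object) pairs from standard-mode JSON output.

    Hypothetical helper; assumes the W3C standard-mode layout.
    """
    for table in converted.get("tables", []):
        for row in table.get("row", []):
            for described in row.get("describes", []):
                yield row.get("rownum"), described


# Hand-written sample in the assumed layout, not actual csvwlib output.
sample = {
    "tables": [
        {
            "url": "http://example.org/countries.csv",
            "row": [
                {
                    "url": "http://example.org/countries.csv#row=2",
                    "rownum": 1,
                    "describes": [{"countryCode": "AD", "name": "Andorra"}],
                }
            ],
        }
    ]
}

for rownum, obj in iter_described_objects(sample):
    print(rownum, obj)
```

In `minimal` mode the recommendation omits the table and row wrappers, so this helper would not apply to that output.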


## Contributors

Authored by [@Aleksander-Drozd](https://github.com/Aleksander-Drozd)

Maintained by [@DerwenAI](https://github.com/DerwenAI)

