Metadata-Version: 2.1
Name: kgdata
Version: 3.0.0
Summary: Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata)
Home-page: https://github.com/binh-vu/kgdata
License: MIT
Author: Binh Vu
Author-email: binh@toan2.com
Requires-Python: >=3.8,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: beautifulsoup4 (>=4.9.3,<5.0.0)
Requires-Dist: chardet (>=4.0.0,<5.0.0)
Requires-Dist: cityhash (>=0.2.3,<0.3.0)
Requires-Dist: click (>=8.1.3,<9.0.0)
Requires-Dist: fastnumbers (>=3.1.0,<4.0.0)
Requires-Dist: hugedict (>=2.4.2,<3.0.0)
Requires-Dist: loguru (>=0.6.0)
Requires-Dist: lxml (>=4.9.0,<5.0.0)
Requires-Dist: numpy (>=1.22.3,<2.0.0)
Requires-Dist: orjson (>=3.6.8,<4.0.0)
Requires-Dist: parsimonious (>=0.8.1,<0.9.0)
Requires-Dist: pyserini (>=0.17.0,<0.18.0)
Requires-Dist: pyspark (==3.3.0)
Requires-Dist: rdflib (>=6.1.1,<7.0.0)
Requires-Dist: redis (>=3.5.3,<4.0.0)
Requires-Dist: requests (>=2.28.0,<3.0.0)
Requires-Dist: rsoup (>=2.5.1,<3.0.0)
Requires-Dist: ruamel.yaml (>=0.17.9,<0.18.0)
Requires-Dist: sem-desc (>=3.5.2,<4.0.0)
Requires-Dist: six (>=1.16.0,<2.0.0)
Requires-Dist: tqdm (>=4.64.0,<5.0.0)
Requires-Dist: ujson (>=5.1.0,<6.0.0)
Project-URL: Repository, https://github.com/binh-vu/kgdata
Description-Content-Type: text/markdown

# kgdata ![PyPI](https://img.shields.io/pypi/v/kgdata) ![Documentation](https://readthedocs.org/projects/kgdata/badge/?version=latest&style=flat)

KGData is a library for processing dumps of Wikipedia and Wikidata. What it can do:

- Clean up the dumps to ensure the data is consistent (resolve redirects, remove dangling references).
- Create embedded key-value databases to access entities from the dumps.
- Extract the Wikidata ontology.
- Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.
- Create Pyserini indices to search Wikidata’s entities.
- And more.

For full documentation, please see [the website](https://kgdata.readthedocs.io/).

## Installation

From PyPI (using pre-built binaries):

```bash
pip install kgdata
```
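
To confirm that the install succeeded, one option is to query the installed distribution's version. The snippet below is a minimal sketch that uses only the standard library's `importlib.metadata` (available from Python 3.8, the minimum version this package requires). The helper name `installed_version` is hypothetical and is not part of kgdata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not a kgdata API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("kgdata")
    print(f"kgdata {found}" if found else "kgdata is not installed")
```

The check reads the distribution metadata rather than importing the package, so it does not load kgdata's dependencies.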

