Metadata-Version: 2.4
Name: arango_cve_processor
Version: 1.1.9
Summary: A small python script that enriches Vulnerability STIX Objects with other intel
Project-URL: Homepage, https://github.com/muchdogesec/arango_cve_processor
Project-URL: Issues, https://github.com/muchdogesec/arango_cve_processor/issues
Project-URL: dogesec HQ, https://dogesec.com
Author: dogesec
Maintainer: dogesec
License-File: LICENSE
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Requires-Dist: arango-cve-processor
Requires-Dist: jsonschema>=4.22.0; python_version >= '3.8'
Requires-Dist: python-arango>=7.9.1; python_version >= '3.8'
Requires-Dist: requests>=2.32.4; python_version >= '3.7'
Requires-Dist: stix2
Requires-Dist: stix2arango
Requires-Dist: stix2extensions
Requires-Dist: tqdm>=4.66.4; python_version >= '3.7'
Description-Content-Type: text/markdown

# Arango CVE Processor

## Before you get started

Arango CVE Processor is built into [Vulmatch](https://github.com/muchdogesec/vulmatch) which also handles the download of CVE objects (what you need for ACVEP to work). As such, [Vulmatch](https://github.com/muchdogesec/vulmatch) is probably better suited to what you're looking for.

## tl;dr

![](docs/arango_cve_processor.png)

A small script that enriches CVEs to other sources with all data stored as STIX 2.1 objects.

[![arango_cve_processor](https://img.youtube.com/vi/J_LbAzoUpd4/0.jpg)](https://www.youtube.com/watch?v=J_LbAzoUpd4)

[Watch the demo](https://www.youtube.com/watch?v=J_LbAzoUpd4).

## Overview

Here at DOGESEC we work with a lot of CVE data across our products. [cve2stix](https://github.com/muchdogesec/cve2stix) generates core STIX 2.1 Vulnerability objects from CVE data.

However, we have lots of other sources (EPSS, KEV, ATT&CK...) that we want to enrich this data with.

We built Arango CVE Processor to handle the generation and maintenance of these enrichments.

In short, Arango CVE Processor is a script that;

1. reads the ingested CVE STIX data in ArangoDB
2. creates STIX objects to represent the relationships between CVE and other datasets

## Usage

### Install the script

```shell
# clone the latest code
git clone https://github.com/muchdogesec/arango_cve_processor
# create a venv
cd arango_cve_processor
python3 -m venv arango_cve_processor-venv
source arango_cve_processor-venv/bin/activate
# install requirements
pip3 install -r requirements.txt
````

### Configuration options

Arango CVE Processor has various settings that are defined in an `.env` file.

To create a template for the file:

```shell
cp .env.example .env
```

To see more information about how to set the variables, and what they do, read the `.env.markdown` file.

### Run

```shell
python3 arango_cve_processor.py \
    --database DATABASE \
    --relationship RELATIONSHIP \
    --ignore_embedded_relationships BOOLEAN \
    --modified_min DATE \
    --cve_id CVE-NNNN-NNNN CVE-NNNN-NNNN
```

Where;

* `--database` (required): the arangoDB database name where the objects you want to link are found. It must contain the collections `nvd_cve_vertex_collection` and `nvd_cve_edge_collection`
* `--relationship` (optional, dictionary): you can apply updates to certain relationships at run time. Default is all. Note, you should ensure your `database` contains all the required seeded data. User can select from;
  * `cve-cwe`
  * `cve-capec`
  * `cve-attack`
  * `cve-epss`
  * `cve-kev`
  * `cve-vulncheck-kev`
  * `cpematch`
* `--ignore_embedded_relationships` (optional, boolean). Default is `false`. if `true` passed, this will stop any embedded relationships from being generated. This is a stix2arango feature where STIX SROs will also be created for `_ref` and `_refs` properties inside each object (e.g. if `_ref` property = `identity--1234` and SRO between the object with the `_ref` property and `identity--1234` will be created). See stix2arango docs for more detail if required, essentially this a wrapper for the same `--ignore_embedded_relationships` setting implemented by stix2arango
* `--ignore_embedded_relationships_sro` (optional): boolean, if `true` passed, will stop any embedded relationships from being generated from SRO objects (`type` = `relationship`). Default is `false`
* `--ignore_embedded_relationships_smo` (optional): boolean, if `true` passed, will stop any embedded relationships from being generated from SMO objects (`type` = `marking-definition`, `extension-definition`, `language-content`). Default is `false`
* `--modified_min` (optional, date in format `YYYY-MM-DD`). By default arango_cve_processor will consider all CVEs in the database specified with the property `_is_latest==true` (that is; the latest version of the object). Using this flag with a modified time value will further filter the results processed by arango_cve_processor to STIX objects with a `modified` time >= to the value specified. This is useful when you don't want to process data for very old CVEs in the database.
  * NOTE: for `cpematch`, this is the `modified` time reported by the CPE Match API (it has nothing to do with CVE)
* `--created_min` (optional, date in format `YYYY-MM-DD`). Same as `modified_min` but considers `created` date.
  * NOTE: this does not work with `cpematch`
* `--cve_id` (optional, CVE ID): will only process the relationships for the CVE passed, otherwise all CVEs will be considered.
  * NOTE: this does not work with `cpematch`

### Examples

Process CVE -> CWE relationships for all CVEs modified after 2023-01-01 and only created embedded relationships from SDOs and SCOs...

```shell
python3 arango_cve_processor.py \
  --database arango_cve_processor_standard_tests_database \
  --relationship cve-cwe \
  --modified_min 2024-02-01 \
  --ignore_embedded_relationships true \
  --ignore_embedded_relationships_sro true \
  --ignore_embedded_relationships_smo true
```

Get all EPSS scores for CVEs

```shell
python3 arango_cve_processor.py \
  --database arango_cve_processor_standard_tests_database \
  --relationship cve-epss \
  --ignore_embedded_relationships false \
  --ignore_embedded_relationships_sro true \
  --ignore_embedded_relationships_smo true
```

Update all CPE Matches modified after `2024-02-01`

```shell
python3 arango_cve_processor.py \
  --database arango_cve_processor_standard_tests_database \
  --relationship cpematch \
  --modified_min 2024-02-01 \
  --ignore_embedded_relationships false \
  --ignore_embedded_relationships_sro true \
  --ignore_embedded_relationships_smo true
```

## Backfilling data

[stix2arango contains a set of utility scripts that can be used to backfill all the datasources required for this test](https://github.com/muchdogesec/stix2arango/tree/main/utilities).

## How it works

If you would like to know how the logic of this script works in detail, please consult the `/docs` directory.

## Useful supporting tools

* To generate STIX 2.1 extensions: [stix2 Python Lib](https://stix2.readthedocs.io/en/latest/)
* STIX 2.1 specifications for objects: [STIX 2.1 docs](https://docs.oasis-open.org/cti/stix/v2.1/stix-v2.1.html)
* [ArangoDB docs](https://www.arangodb.com/docs/stable/)

## Support

[Minimal support provided via the DOGESEC community](https://community.dogesec.com/).

## License

[Apache 2.0](/LICENSE).