Metadata-Version: 2.4
Name: preclick
Version: 0.1.0
Summary: Async Python client for the PreClick URL security scanning service. Assess target URLs for potential threats and alignment with the user's browsing intent before navigation.
Project-URL: Homepage, https://github.com/cybrlab-ai/preclick-python
Project-URL: Repository, https://github.com/cybrlab-ai/preclick-python
Project-URL: Issues, https://github.com/cybrlab-ai/preclick-python/issues
Author-email: "CybrLab.ai" <contact@cybrlab.ai>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: agent,ai-safety,intent-alignment,llm,mcp-client,phishing-detection,preclick,safe-browsing,security,url-scanner,url-security
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Security
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: mcp>=1.24.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# preclick

> Async Python client for the [PreClick](https://preclick.ai) URL security scanning service. Assess target URLs for potential threats and alignment with the user's browsing intent before navigation.

**Publisher:** [CybrLab.ai](https://cybrlab.ai) | **Service:** [PreClick](https://preclick.ai)

Scan-oriented public API: two primary methods (`scan`, `scan_with_intent`) plus five job methods for explicit control. No protocol vocabulary. No `connect()` required. No polling boilerplate for the common case.

Trial mode currently allows up to 100 requests per day with no API key and no sign-up required. Trial limits, availability, and higher-limit access are subject to change and may be governed by separate hosted-service terms. For higher limits, configure an API key (see [Configuration](#configuration)).

---

## Install

```bash
pip install preclick
```

Requires Python **>= 3.10**.

## Quick start

```python
import asyncio
from preclick import PreClickClient

async def main():
    async with PreClickClient(
        api_key="sk-...",  # optional; trial mode if omitted
    ) as client:
        result = await client.scan("https://example.com")

        print(result["agent_access_directive"])  # ALLOW | DENY | RETRY_LATER | REQUIRE_CREDENTIALS
        print(result["agent_access_reason"])

asyncio.run(main())
```

That's it. The first call auto-connects to the PreClick service and returns the scan result as soon as it's ready (typically 70--80 seconds). The `async with` block cleans up when you're done.

See [`examples/`](./examples/) for runnable scripts: `basic_scan.py`, `intent_scan.py`, `long_running_scan.py`, and `manual_polling.py`.

## Hosted service and data sent

This package is open-source client software for the hosted PreClick service. No scanner logic runs locally when using the default endpoint.

Calls to `scan()` / `start_scan()` send the target URL to `https://preclick.ai/mcp`. Calls to `scan_with_intent()` / `start_scan_with_intent()` send both the target URL and the user intent text. If configured, the API key is sent as an `X-API-Key` header, along with any custom headers passed to the client.

Do not submit secrets, credentials, highly sensitive personal data, regulated data, private/internal URLs, or confidential business information unless you are authorized to do so and have reviewed the applicable service, privacy, retention, and acceptable-use terms for your use case. For hosted-service terms or privacy questions, contact [contact@cybrlab.ai](mailto:contact@cybrlab.ai).

## Intent-aware scanning

When the user has stated their purpose (login, purchase, download, booking, etc.), use `scan_with_intent` so the scanner can evaluate destination alignment in addition to threat signals:

```python
result = await client.scan_with_intent(
    "https://example.com",
    "log in to my bank",
)

print(result["agent_access_directive"])
print(result["intent_alignment"])
# misaligned | no_mismatch_detected | inconclusive | not_provided
```

## Interpreting results

Every completed scan returns a dict with these fields:

| Field                    | Type             | Description                                                             |
|--------------------------|------------------|-------------------------------------------------------------------------|
| `risk_score`             | float (0.0--1.0) | Threat probability                                                      |
| `confidence`             | float (0.0--1.0) | Analysis confidence                                                     |
| `analysis_complete`      | bool             | Whether the analysis finished fully                                     |
| `agent_access_directive` | str              | `ALLOW`, `DENY`, `RETRY_LATER`, or `REQUIRE_CREDENTIALS`                |
| `agent_access_reason`    | str              | Normalized reason code for the directive                                |
| `intent_alignment`       | str              | `misaligned`, `no_mismatch_detected`, `inconclusive`, or `not_provided` |

Use `agent_access_directive` for navigation decisions:

- **`ALLOW`** -- No blocking signal was detected. Continue only according to user confirmation, local policy, and normal security controls.
- **`DENY`** -- Do not navigate. Check `agent_access_reason` for the cause.
- **`RETRY_LATER`** -- Verification could not complete because of a temporary issue. Retry after a delay.
- **`REQUIRE_CREDENTIALS`** -- The target requires authentication. Ask the user how to proceed.
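
As a rough illustration of the list above, the sketch below maps a result dict to an action name. The action names (`navigate`, `block`, `retry`, `ask_user`) and the `next_action` helper are hypothetical and not part of the package; the choice to fail closed on a missing or unrecognized directive is an assumption you may want to adapt to your own policy.

```python
from typing import Any, Mapping


def next_action(result: Mapping[str, Any]) -> str:
    """Map a scan result to an illustrative agent action name.

    The action names are hypothetical; only the directive values
    come from the table above.
    """
    directive = result.get("agent_access_directive")
    if directive == "ALLOW":
        # Still subject to user confirmation and local policy.
        return "navigate"
    if directive == "RETRY_LATER":
        return "retry"
    if directive == "REQUIRE_CREDENTIALS":
        return "ask_user"
    # DENY, plus anything missing or unrecognized, fails closed.
    return "block"
```

Treating unknown values as `block` keeps the caller safe if the directive field is ever absent.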

Scans typically take around 70--80 seconds on current production traffic. Both `scan()` and `scan_with_intent()` handle the polling internally; the default wait window is 10 minutes. Pass `max_wait` to override:

```python
await client.scan("https://example.com", max_wait=120)  # 2 minutes
```

## Timeouts and cancellation

Use `max_wait` to limit how long `scan()` / `scan_with_intent()` / `wait_for_scan()` will wait for a result. When the deadline is reached, a `PreClickError` is raised with `err.scan_id` attached so the already-submitted scan is never orphaned:

```python
from preclick import PreClickClient, PreClickError

async def scan_with_timeout():
    async with PreClickClient() as client:
        try:
            return await client.scan(
                "https://example.com",
                max_wait=30,  # give up after 30s
            )
        except PreClickError as err:
            scan_id = getattr(err, "scan_id", None)
            if not scan_id:
                raise  # no scan ID attached, so there is nothing to resume
            # Resume later without re-submitting
            return await client.wait_for_scan(scan_id)
```

Standard asyncio task cancellation (`task.cancel()`) also works. After a scan is submitted, any error raised by the client -- timeout, cancellation, transport failure -- carries `err.scan_id` so callers can resume with `wait_for_scan(scan_id)`.

> **Note:** Wrapping a scan call in `asyncio.wait_for()` works for cancellation, but the `TimeoutError` it raises does *not* carry `scan_id`. Prefer `max_wait` when you need to recover the scan ID on timeout.

All scan and job methods also accept a `signal` keyword for cooperative cancellation. Pass an `asyncio.Event`; if it is set before submission, no request is sent. If it is set while waiting, `asyncio.CancelledError` is raised and carries `err.scan_id` once a scan ID is known:

```python
import asyncio

cancel = asyncio.Event()

try:
    result = await client.scan("https://example.com", signal=cancel)
except asyncio.CancelledError as err:
    if getattr(err, "scan_id", None):
        result = await client.wait_for_scan(err.scan_id)
```
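
The snippet above does not show who sets the event. One possible arrangement, sketched here, schedules `cancel.set` on the running loop as a stand-in for an external trigger such as a user pressing a stop button. The helper name `scan_until_stopped` and the `delay` parameter are hypothetical; `client` is assumed to expose `scan(url, signal=...)` as described above.

```python
import asyncio


async def scan_until_stopped(client, url, cancel, delay):
    """Run a scan, setting `cancel` after `delay` seconds unless it finishes first.

    The timer is only a stand-in for an external trigger such as a
    stop button or a shutdown hook.
    """
    loop = asyncio.get_running_loop()
    timer = loop.call_later(delay, cancel.set)
    try:
        return await client.scan(url, signal=cancel)
    finally:
        # Harmless if the timer has already fired.
        timer.cancel()
```

If the goal is only a time limit, `max_wait` is likely the simpler choice; the event is more useful when the trigger comes from elsewhere in the program.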

## Long-running scans (explicit control)

For the common case, `scan()` / `scan_with_intent()` is all you need. Use the **job methods** below when you need explicit control over the submission/wait lifecycle -- for example, reporting progress to a UI, integrating with an existing job queue, or persisting a scan ID across processes or workers.

### Submit once, wait later

```python
submission = await client.start_scan("https://example.com")
# ... hand submission["scan_id"] off to another function/worker/process ...
result = await client.wait_for_scan(submission["scan_id"])
print(result["agent_access_directive"])
```

`scan(url)` is literally `start_scan(url)` followed by `wait_for_scan(scan_id)` -- splitting it lets you control the two steps independently.
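
To hand a scan ID to another process, one option is to write it to a small JSON file. This is a minimal sketch using only the standard library; the helper names and the file layout are hypothetical, and a real deployment might use a database or job queue instead.

```python
import json
from pathlib import Path


def save_submission(path: Path, submission: dict) -> None:
    """Persist the fields another process needs to resume the scan."""
    record = {
        "scan_id": submission["scan_id"],
        "created_at": submission.get("created_at"),
    }
    path.write_text(json.dumps(record), encoding="utf-8")


def load_scan_id(path: Path) -> str:
    """Read back the scan ID written by save_submission()."""
    return json.loads(path.read_text(encoding="utf-8"))["scan_id"]
```

A worker could then call `await client.wait_for_scan(load_scan_id(path))` to pick up the scan without re-submitting it.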

### Submit and poll manually

If you need a progress UI or you can't block on a single call, use `get_scan_result()` in your own loop:

```python
import asyncio

submission = await client.start_scan("https://example.com")

while True:
    envelope = await client.get_scan_result(submission["scan_id"])

    if envelope["status"] == "completed":
        if not envelope.get("result"):
            raise RuntimeError("scan completed but result is missing")
        print(envelope["result"]["agent_access_directive"])
        break

    if envelope["status"] != "working":
        raise RuntimeError(f"scan ended with status: {envelope['status']}")

    # Respect the server's recommended poll interval
    hint = envelope.get("retry_after_ms")
    wait_secs = hint / 1000.0 if isinstance(hint, (int, float)) and hint > 0 else 2.0
    await asyncio.sleep(wait_secs)
```

`get_scan_status(scan_id)` is the cheaper status-only variant -- it returns the current status without the result payload.
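
The delay calculation in the loop above can be factored into a helper that works for both envelope types, since result envelopes carry `retry_after_ms` and status envelopes carry `poll_interval_ms`. The helper name, the 2-second default, and the 30-second cap are assumptions for illustration only.

```python
from typing import Any, Mapping


def poll_delay_seconds(
    envelope: Mapping[str, Any],
    default: float = 2.0,
    cap: float = 30.0,
) -> float:
    """Return how long to sleep before the next poll, in seconds."""
    for key in ("retry_after_ms", "poll_interval_ms"):
        hint = envelope.get(key)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(hint, (int, float)) and not isinstance(hint, bool) and hint > 0:
            # The cap is a local safety limit, not a server-defined value.
            return min(hint / 1000.0, cap)
    return default
```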

## API reference

### Primary scan API (convenience wrappers)

```python
# Scan a URL and return the result. Handles submission + polling internally.
await client.scan(url)
await client.scan(url, max_wait=120)
await client.scan(url, signal=cancel_event)

# Intent-aware variant. Use when the user has stated their purpose.
await client.scan_with_intent(url, intent)
await client.scan_with_intent(url, intent, max_wait=120)
await client.scan_with_intent(url, intent, signal=cancel_event)
```

Both return the inner scan result dict directly (see [Return shapes](#return-shapes) below).

### Job methods (explicit control)

```python
# Submit a URL for scanning. Returns a submission envelope with scan_id.
await client.start_scan(url)
await client.start_scan(url, signal=cancel_event)

# Submit with user intent.
await client.start_scan_with_intent(url, intent)
await client.start_scan_with_intent(url, intent, signal=cancel_event)

# Non-blocking status check for a scan.
await client.get_scan_status(scan_id)
await client.get_scan_status(scan_id, signal=cancel_event)

# Non-blocking result fetch. Returns { status, result, retry_after_ms, ... }.
await client.get_scan_result(scan_id)
await client.get_scan_result(scan_id, signal=cancel_event)

# Wait for a scan to complete and return the inner scan result payload.
await client.wait_for_scan(scan_id)
await client.wait_for_scan(scan_id, max_wait=600)
await client.wait_for_scan(scan_id, signal=cancel_event)
```

### Lifecycle

```python
await client.close()          # clean up (alias for disconnect)
client.is_connected            # bool
await client.connect()         # optional; first scan call auto-connects
await client.disconnect()      # same as close(); close() is preferred

# Context manager (recommended)
async with PreClickClient() as client:
    ...
```

## Return shapes

### `scan` / `scan_with_intent` / `wait_for_scan`

All three return the inner scan result directly:

```python
{
    "risk_score": 0.15,          # float 0.0--1.0
    "confidence": 0.92,          # float 0.0--1.0
    "analysis_complete": True,
    "agent_access_directive": "ALLOW",  # ALLOW | DENY | RETRY_LATER | REQUIRE_CREDENTIALS
    "agent_access_reason": "no_immediate_risk_detected",
    "intent_alignment": "not_provided",  # misaligned | no_mismatch_detected | inconclusive | not_provided
}
```
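
Because `ALLOW` is meant to be combined with local policy, a caller might layer its own thresholds on top of the directive. The sketch below is one hypothetical policy; the function name and the `max_risk` / `min_confidence` defaults are illustrative assumptions, not recommended values.

```python
from typing import Any, Mapping


def passes_local_policy(
    result: Mapping[str, Any],
    max_risk: float = 0.3,
    min_confidence: float = 0.7,
) -> bool:
    """Return True only if the result clears a stricter, local policy."""
    if result.get("agent_access_directive") != "ALLOW":
        return False
    if result.get("analysis_complete") is not True:
        return False
    risk = result.get("risk_score")
    confidence = result.get("confidence")
    # Missing or non-numeric scores are treated as a failed check.
    if not isinstance(risk, (int, float)) or not isinstance(confidence, (int, float)):
        return False
    return risk <= max_risk and confidence >= min_confidence
```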

### `start_scan` / `start_scan_with_intent`

```python
{
    "scan_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "working",
    "status_message": "Queued for processing",
    "created_at": "2026-01-18T12:00:00Z",
    "updated_at": "2026-01-18T12:00:00Z",
    "ttl_ms": 720000,
    "poll_interval_ms": 2000,
    "message": "Scan submitted.",
}
```

### `get_scan_status`

```python
{
    "scan_id": "...",
    "status": "working",  # working | completed | failed | cancelled
    "status_message": "...",
    "created_at": "...",
    "updated_at": "...",
    "ttl_ms": 720000,
    "poll_interval_ms": 2000,
}
```

### `get_scan_result` (still running)

```python
{
    "scan_id": "...",
    "status": "working",
    "status_message": "...",
    "result": None,
    "retry_after_ms": 2000,
    "message": "Scan still in progress.",
}
```

### `get_scan_result` (completed)

```python
{
    "scan_id": "...",
    "status": "completed",
    "status_message": "Scan completed successfully",
    "result": {
        "risk_score": 0.15,
        "confidence": 0.92,
        "analysis_complete": True,
        "agent_access_directive": "ALLOW",
        "agent_access_reason": "no_immediate_risk_detected",
        "intent_alignment": "not_provided",
    },
    "retry_after_ms": None,
    "message": "Scan completed successfully.",
}
```

## Configuration

`PreClickClient(**kwargs)` accepts:

| Option            | Type    | Default                      | Description                                                                                                  |
|-------------------|---------|------------------------------|--------------------------------------------------------------------------------------------------------------|
| `api_key`         | `str`   | `None`                       | PreClick API key. Sent as `X-API-Key`. Trial mode (100 req/day) if omitted. Must be non-empty when provided. |
| `endpoint`        | `str`   | `https://preclick.ai/mcp`    | Override the PreClick endpoint URL.                                                                          |
| `client_name`     | `str`   | `preclick-mcp-client-python` | Reported client name.                                                                                        |
| `client_version`  | `str`   | package version              | Reported client version. Defaults to the installed package version.                                          |
| `headers`         | `dict`  | `{}`                         | Extra HTTP headers to send with every request. Values must be strings.                                       |
| `request_timeout` | `float` | `600`                        | Per-request timeout in seconds. Must be a positive number.                                                   |

To obtain an API key for higher limits, contact [contact@cybrlab.ai](mailto:contact@cybrlab.ai).
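
A common way to keep the key out of source code is to build the client options from environment variables. The variable names `PRECLICK_API_KEY` and `PRECLICK_ENDPOINT` below are hypothetical; nothing in this README says the client reads them on its own. Empty values are dropped because `api_key` must be non-empty when provided.

```python
import os
from typing import Any, Dict, Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build PreClickClient keyword arguments from environment variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}

    # Hypothetical variable names; adjust to your own conventions.
    api_key = env.get("PRECLICK_API_KEY", "").strip()
    if api_key:
        kwargs["api_key"] = api_key  # omitted entirely when unset -> trial mode

    endpoint = env.get("PRECLICK_ENDPOINT", "").strip()
    if endpoint:
        kwargs["endpoint"] = endpoint

    return kwargs
```

The result can be passed straight through, for example `PreClickClient(**client_kwargs_from_env())`.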

## Errors

The client raises three error types: the base `PreClickError` and its two subclasses, `PreClickConnectionError` and `PreClickRemoteError`:

```python
import asyncio

from preclick import (
    PreClickError,
    PreClickConnectionError,
    PreClickRemoteError,
)

try:
    result = await client.scan("https://example.com")
except asyncio.CancelledError as err:
    # Caller cancelled the task. If err.scan_id is present, submission
    # had already completed and you can resume later with
    # client.wait_for_scan(err.scan_id).
    raise  # re-raise so task cancellation still propagates
except PreClickConnectionError as err:
    # Transport / connection failure (network, DNS, TLS, handshake).
    # Check err.retryable -- False for malformed-endpoint config errors,
    # True for transient sockets/DNS/TLS failures.
    if err.retryable:
        pass  # safe to retry with backoff
except PreClickRemoteError as err:
    # Remote-side failure: scan failed / cancelled / expired, rate limit,
    # auth failure, etc. Inspect err.code and err.data for context.
    print(err.code, err.data)
except PreClickError as err:
    # Misuse (invalid arguments) or wait_for_scan max_wait exhausted.
    pass
```
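
The "safe to retry with backoff" branch above could be implemented along these lines. This is a generic sketch, not part of the package: `retry_with_backoff`, its parameters, and the jitter scheme are all assumptions. It takes the retry decision as a callable so it does not depend on any particular exception class.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    op: Callable[[], Awaitable[T]],
    is_retryable: Callable[[Exception], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Call `op` until it succeeds, a non-retryable error occurs, or attempts run out."""
    for attempt in range(attempts):
        try:
            return await op()
        except Exception as err:  # asyncio.CancelledError is not caught here
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_retryable(err):
                raise
            # Exponential backoff: base delays of 1x, 2x, 4x, ... plus jitter.
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

With this client, `is_retryable` could be something like `lambda e: isinstance(e, PreClickConnectionError) and e.retryable and getattr(e, "scan_id", None) is None`, so that an error carrying a scan ID is resumed with `wait_for_scan` rather than re-submitted.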

## Troubleshooting

| Symptom                                                       | Cause                               | Fix                                                                                                 |
|---------------------------------------------------------------|-------------------------------------|-----------------------------------------------------------------------------------------------------|
| `PreClickConnectionError: Failed to connect ...`              | Endpoint unreachable                | Check network; verify `curl -I https://preclick.ai/mcp` returns a response                          |
| `PreClickConnectionError: Invalid PreClick endpoint ...`      | Invalid `endpoint` option           | Pass a valid `https://` URL as the `endpoint` option                                                |
| `PreClickRemoteError` with code 401                           | API key required or invalid         | Set a valid `api_key` in client options                                                             |
| `PreClickRemoteError` with code 429                           | Rate limit exceeded                 | Reduce frequency or add an API key for higher limits                                                |
| `PreClickError: wait_for_scan exhausted ...`                  | Scan did not complete within window | Pass a larger `max_wait` to `scan` / `scan_with_intent` / `wait_for_scan`, or use manual polling    |
| Scan takes a long time                                        | Target site is slow or complex      | Wait for completion; scans typically take 70--80 seconds                                            |

## Bundled agent skill

[`skills/preclick/SKILL.md`](./skills/preclick/SKILL.md) is a generic agent skill that instructs LLM agents on **when** to call `scan` vs `scan_with_intent`, and **how** to interpret the response. Drop it into any agent runtime that loads markdown skills to give the agent automatic preflight URL verification behavior.

## Important notice

This package is client software licensed under the Apache License 2.0. Use of the hosted PreClick service may be governed by separate service terms, acceptable-use rules, rate limits, privacy commitments, and law. Do not use the hosted service for unauthorized, unlawful, abusive, or malicious activity.

Before using the hosted service with production, regulated, personal, confidential, or private/internal data, make sure you are authorized to submit that data and have reviewed the applicable service, privacy, retention, and acceptable-use terms. For hosted-service terms or privacy questions, contact [contact@cybrlab.ai](mailto:contact@cybrlab.ai).

Scan results are informational risk signals, not a guarantee that a URL is safe or unsafe. They are not a substitute for user judgment, browser and endpoint security controls, organizational security review, or legal and compliance review.

## Support

- **Email:** [contact@cybrlab.ai](mailto:contact@cybrlab.ai)
- **Publisher:** [CybrLab.ai](https://cybrlab.ai)
- **Service:** [PreClick](https://preclick.ai)

## License

Copyright 2026 CybrLab.ai.

Licensed under the Apache License 2.0. See [`LICENSE`](./LICENSE).
