Metadata-Version: 2.3
Name: mlops-python-sdk
Version: 1.0.6
Summary: MLOps Python SDK for XCloud Service API
License: MIT
Author: mlops
Author-email: mlops@example.com
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: attrs (>=23.2.0)
Requires-Dist: httpx (>=0.27.0,<1.0.0)
Requires-Dist: packaging (>=24.1)
Requires-Dist: python-dateutil (>=2.8.2)
Requires-Dist: typing-extensions (>=4.1.0)
Project-URL: Bug Tracker, https://github.com/xcloud-service/xservice/issues
Project-URL: Homepage, https://mlops.cloud/
Project-URL: Repository, https://github.com/xcloud-service/xservice
Description-Content-Type: text/markdown

# SDK

Software Development Kits for integrating with the XCloud Service API.

> [!NOTE] SDK Support
> SDKs provide type-safe, high-level interfaces for interacting with the platform API. They handle authentication, error handling, and request retries automatically.


## Installation

Install the Python SDK with pip:

```bash
pip install mlops-python-sdk
```

### Configuration

The SDK reads configuration from environment variables by default:

- `MLOPS_API_KEY`: API key (required)
- `MLOPS_DOMAIN`: API domain, e.g. `localhost:8090` or `https://example.com`
- `MLOPS_API_PATH`: API path prefix (default: `/api/v1`)
- `MLOPS_DEBUG`: `true|false` (default: `false`)

Or configure in code:

```python
from mlops import ConnectionConfig, Task

config = ConnectionConfig(
    api_key="xck_...",
    domain="https://example.com",
    api_path="/api/v1",
    debug=False,
)
task = Task(config=config)
```
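
To make the environment-variable defaults above concrete, here is a minimal sketch of how the four `MLOPS_*` variables could be collected into the keyword arguments that `ConnectionConfig` accepts. The helper `settings_from_env` is hypothetical and not part of the SDK; the SDK's own parsing (for example, which strings count as true for `MLOPS_DEBUG`) may differ.

```python
import os
from typing import Mapping, Optional


def settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect MLOPS_* variables into keyword arguments.

    Illustrative helper only; it is not provided by the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("MLOPS_API_KEY")
    if not api_key:
        # The README marks MLOPS_API_KEY as required.
        raise ValueError("MLOPS_API_KEY is required")
    return {
        "api_key": api_key,
        # None when MLOPS_DOMAIN is unset; no default is documented.
        "domain": env.get("MLOPS_DOMAIN"),
        "api_path": env.get("MLOPS_API_PATH", "/api/v1"),
        # Assumption: only the string "true" (case-insensitive) enables debug.
        "debug": env.get("MLOPS_DEBUG", "false").strip().lower() == "true",
    }
```

If the returned keys match your setup, the result can be passed along as `ConnectionConfig(**settings_from_env())`.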

## SDK Usage

### Initialize client

```python
from mlops import Task

task = Task()  # uses environment variables by default
```

### Submit a GPU task

```python
from mlops import Task

task = Task()
resp = task.submit(
    name="gpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="/mnt/minio/images/01ai-registry.cn-shanghai.cr.aliyuncs.com+public+llamafactory+0.9.3.sqsh",
    entry_command="llamafactory-cli train /workspace/config/test_lora.yaml",
    resources={
        "partition": "gpu",
        "nodes": 2,
        "ntasks": 2,
        "cpus_per_task": 2,
        "memory": "4G",
        "time": "01:00:00",
        "gres": "gpu:nvidia_a10:1",
        "qos": "qos_xcloud",
    },
    file_path="/path/to/xservice.zip",  # optional: .zip/.tar.gz/.tgz
)
print(resp.job_id)
```

### Submit a CPU task

```python
from mlops import Task

task = Task()
resp = task.submit(
    name="cpu-task-from-sdk",
    cluster_name="slurm-cn",
    team_id=1,
    image="docker://01ai-registry.cn-shanghai.cr.aliyuncs.com/01-ai/xcs/v2/alpine:3.23.0",
    entry_command="echo hello",
    resources={
        "partition": "cpu",
        "nodes": 1,
        "ntasks": 1,
        "cpus_per_task": 1,
        "memory": "1G",
        "time": "01:00:00",
        "qos": "qos_xcloud",
    },
)
print(resp.job_id)
```

### List tasks

```python
from mlops import Task
from mlops.api.client.models.task_status import TaskStatus

task = Task()
resp = task.list(status=TaskStatus.COMPLETED, cluster_name="slurm-cn", page=1, page_size=20)
print(len(resp.tasks or []))
```
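
Because `list()` takes `page` and `page_size`, collecting every task means walking the pages in order. The sketch below shows one way to do that with a generic, hypothetical helper, `iter_pages`, that takes any callable returning one page of items. It assumes a page shorter than `page_size` marks the end of the results; the API may expose a total count instead, which this sketch does not use.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int, int], List[T]], page_size: int = 20) -> Iterator[T]:
    """Yield items page by page, starting at page 1.

    Illustrative helper only; it is not provided by the SDK.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        # Assumption: a short (or empty) page means there are no more results.
        if len(items) < page_size:
            break
        page += 1
```

With the SDK, `fetch_page` could be a small wrapper such as `lambda page, size: task.list(cluster_name="slurm-cn", page=page, page_size=size).tasks or []`.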

### Get task details

```python
from mlops import Task

task = Task()
task_info = task.get(task_id=12345, cluster_name="slurm-cn")
print(task_info)
```

### Cancel a task

```python
from mlops import Task

task = Task()
task.cancel(task_id=12345, cluster_name="slurm-cn")
```

### Delete a task

```python
from mlops import Task

task = Task()
task.delete(task_id=12345, cluster_name="slurm-cn")
```

**Task Management Methods:**

- `submit()` - Submit a new task with container image and entry command
- `get()` - Get task details by task ID
- `list()` - List tasks with optional filters (status, cluster_name, team_id, user_id)
- `cancel()` - Cancel a running task
- `delete()` - Delete a task record

**Task Status Values:**

```python
from mlops.api.client.models.task_status import TaskStatus

TaskStatus.PENDING      # Task is pending
TaskStatus.QUEUED       # Task is queued
TaskStatus.RUNNING      # Task is running
TaskStatus.COMPLETED    # Task completed successfully
TaskStatus.SUCCEEDED    # Task succeeded
TaskStatus.FAILED       # Task failed
TaskStatus.CANCELLED    # Task was cancelled
TaskStatus.CREATED      # Task was created
```
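
A common use of these status values is to poll a task until it stops changing. The following sketch works on status names as plain strings and takes a caller-supplied `get_status` function, so it does not depend on how the generated task model exposes its status. Both `wait_for_terminal` and the `TERMINAL` set are hypothetical; treating `COMPLETED`, `SUCCEEDED`, `FAILED`, and `CANCELLED` as terminal is an assumption, not something the SDK documents here.

```python
import time
from typing import Callable

# Assumption: these four status names are final; the others may still change.
TERMINAL = frozenset({"COMPLETED", "SUCCEEDED", "FAILED", "CANCELLED"})


def wait_for_terminal(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal status name.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    Illustrative helper only; it is not provided by the SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status} after {timeout} seconds")
        sleep(interval)
```

In practice `get_status` would call `task.get(...)` and extract the status name from the returned object; the exact attribute depends on the generated model and is not shown here.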

**Error Handling:**

```python
from mlops.exceptions import (
    APIException,
    AuthenticationException,
    NotFoundException,
    RateLimitException,
    TimeoutException,
    InvalidArgumentException,
    NotEnoughSpaceException,
)
from mlops import Task

task = Task()

try:
    result = task.submit(
        name="test",
        cluster_name="slurm-cn",
        image="docker://alpine:3.23.0",
        entry_command="echo hello",
    )
except AuthenticationException as e:
    print(f"Authentication failed: {e}")
except NotFoundException as e:
    print(f"Resource not found: {e}")
except APIException as e:
    print(f"API error: {e}")
```

> [!TIP] Error Handling
> The SDK parses API responses into typed models and raises structured exceptions when a request fails.

## Features

- Type-safe API clients
- Automatic authentication
- Error handling
- Typed response parsing (generated models)
- Unexpected-status guard (optional)

## Resources

- [Python SDK Documentation](https://github.com/xcloud-service/xservice/tree/main/client/python-sdk)
- [API Reference](https://xcloud-service.com/docs/api)

