Metadata-Version: 2.1
Name: bytez
Version: 0.2.4
Summary: Python API client for Bytez service
Home-page: https://github.com/bytez-com/docs
Author: Bytez
License: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: requests (>=2.32.1)

# API Documentation

## Introduction

Welcome to the Bytez API documentation! This API provides access to various machine learning models for serverless operation. Below, you will find examples demonstrating how to interact with the API using our Python client library.

## Getting Your Key

To use this API, you need an API key. Obtain your key by joining the [Bytez Discord](https://discord.gg/Zrd5UbMEBA). If you prefer not to use Discord, email us at team@bytez.com.

## Boot Times and Billing

### Cold Boot Times

Expect the following boot times for models:

- Smallest models: ~12 minutes.
- Largest models: ~15 minutes.

We are working on reducing these boot times to under 5 minutes.

### Billing

Billing has a one-minute minimum: the first 60 seconds of use are always charged, and subsequent usage is rounded to the nearest minute. Charges are $0.0000166667 per GB-second on GPUs. By default, a model instance expires after 30 minutes.
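The billing rule above can be sketched as a small cost estimator. The 16 GB instance size in the example is only an illustrative assumption, not a real Bytez instance size:

```python
# Estimate serverless cost at $0.0000166667 per GB-second on GPUs.
RATE_PER_GB_SECOND = 0.0000166667

def estimate_cost(memory_gb, seconds):
    """Apply the one-minute minimum, then round usage to the nearest minute."""
    billed_seconds = max(60, round(seconds / 60) * 60)
    return memory_gb * billed_seconds * RATE_PER_GB_SECOND

# e.g. a hypothetical 16 GB instance running for 10 minutes:
print(f"${estimate_cost(16, 600):.4f}")  # ≈ $0.16
```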

## Python Client Library Usage Examples

### Authentication

Always include your API key when initializing the client:

```python
from bytez import Bytez

client = Bytez('YOUR_API_KEY')
```
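To avoid hard-coding credentials, the key can be read from an environment variable instead. The variable name `BYTEZ_API_KEY` below is just a convention for this sketch, not something the client requires:

```python
import os

# Read the API key from the environment, falling back to a placeholder.
api_key = os.environ.get("BYTEZ_API_KEY", "YOUR_API_KEY")

# Pass it to the client exactly as before:
# client = Bytez(api_key)
```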

### List Available Models

Retrieve a list of all available models.

```python
models = client.list_models()

print(models)
```

### List Serverless Instances

Get a list of your active serverless model instances.

```python
instances = client.list_instances()

print(instances)
```

### Make a Model Serverless

Submit a job to make a specific model serverless.

```python
model_id = 'openai-community/gpt2'

job_status = client.process(model_id)

print(job_status)
```

### Load a Model

Load a model using a HuggingFace model ID.

```python
model = client.model('openai-community/gpt2')
```

### Start a Model

Start a serverless instance of the model.

```python
results = model.start({'concurrency': 1, 'timeout': 300})

print(results)
```

### Check Model Status

Check the status of the model to see whether it is deploying, running, or stopped.

```python
status = model.status()

print(status)
```

### Run a Model

Execute a model with the provided input and optional inference parameters.

```python
output = model.run("Once upon a time there was a")

print(output)
```

### Run a Model with HuggingFace Params

Execute a model, passing HuggingFace generation parameters via `model_params`.

```python
output = model.run("Once upon a time there was a", model_params={"max_new_tokens":1,"min_new_tokens":1})

print(output)
```

### Stream the Response

Execute a model and stream the output as it is generated.

```python
output = model.run("Once upon a time there was a", stream=True)

for chunk in output:
  print(chunk)
```

### Shutdown a Model

Stop a model and shut down the serverless instance.

```python
model.stop()
```

## Feedback

We value your feedback to improve our documentation and services. If you have any suggestions, please join our [Discord](https://discord.gg/Zrd5UbMEBA) or contact us via email.


