Metadata-Version: 2.4
Name: mangodesk
Version: 0.5.2
Summary: tau-bench wrapper with cloud dashboard and trajectory tracking
Project-URL: Homepage, https://mangodesk.com
Project-URL: Documentation, https://docs.mangodesk.com
Project-URL: Repository, https://github.com/mangodesk-inc/mangodesk-tool-use
Project-URL: Issues, https://github.com/mangodesk-inc/mangodesk-tool-use/issues
Author-email: MangoDesk <team@mangodesk.com>
License: Proprietary
Keywords: ai,benchmarking,evaluation,llm,rl,tool-use
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: click>=8.0.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rich>=13.0.0
Description-Content-Type: text/markdown

# MangoDesk CLI

A wrapper around [tau-bench](https://github.com/sierra-research/tau-bench) that adds a cloud dashboard and trajectory tracking.

## Installation

```bash
# Using pipx (recommended)
pipx install mangodesk

# Using pip
pip install mangodesk
```

## Updating

```bash
# Using pipx
pipx upgrade mangodesk

# Using pip
pip install --upgrade mangodesk
```

## Quick Start

```bash
# 1. Initialize tau-bench (clones and installs the repository)
mangodesk init

# 2. Set your API keys
mangodesk set MANGO_API_KEY=sk-mango-xxx      # From https://mangodesk-api-ezq8.onrender.com
mangodesk set OPENAI_API_KEY=sk-xxx           # For OpenAI models
mangodesk set ANTHROPIC_API_KEY=sk-xxx        # For Anthropic models

# 3. Run evaluation
mangodesk eval --env retail --model gpt-4o --model-provider openai

# 4. View results in the dashboard
# The URL is printed after the evaluation completes
```

## Commands

### Initialize tau-bench

```bash
# Clone to ~/.mangodesk/tau-bench (default)
mangodesk init

# Clone to a custom location
mangodesk init /path/to/tau-bench
```

Clones the tau-bench repository into the target directory and installs it.
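
For provisioning scripts or CI, it can be useful to skip the clone when a checkout is already in place. The sketch below assumes that an existing directory is a usable tau-bench installation, which this README does not guarantee. `init_if_missing` and the `MANGODESK_BIN` override are hypothetical helpers for illustration, not part of the CLI.

```bash
# Hypothetical override so the command can be previewed (e.g. MANGODESK_BIN=echo).
MANGODESK_BIN="${MANGODESK_BIN:-mangodesk}"

# Run `mangodesk init <dir>` only when <dir> does not exist yet.
# Assumption: an existing directory is a usable tau-bench checkout.
init_if_missing() {
    if [ -d "$1" ]; then
        echo "tau-bench already present at $1, skipping init"
    else
        "$MANGODESK_BIN" init "$1"
    fi
}
```

Call it as `init_if_missing /path/to/tau-bench`.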

### Run evaluation

```bash
mangodesk eval --env <environment> --model <model> [options]
```

**Required:**
- `--env, -e` - Environment name (retail, airline, telecom)
- `--model, -m` - Model name (e.g., gpt-4o, claude-3-5-sonnet)

**Optional:**
- `--model-provider, -p` - Model provider (openai, anthropic, google, mistral). Auto-detected from the model name when omitted.
- `--agent-strategy` - Agent strategy: tool-calling (default), react, act
- `--user-model` - User simulator model (defaults to --model)
- `--user-model-provider` - User simulator provider (defaults to --model-provider)
- `--user-strategy` - User strategy: llm (default), react, verify, reflection
- `--max-concurrency` - Number of tasks to run in parallel (default: 1)
- `--task-ids, -t` - Comma-separated list of task IDs to run
- `--trials` - Number of trials per task for pass^k metrics (default: 1)
- `--no-upload` - Skip uploading results to MangoDesk cloud
- `--tau-bench-path` - Path to tau-bench installation (auto-detected)

**Examples:**

```bash
# Basic evaluation
mangodesk eval --env retail --model gpt-4o --model-provider openai

# With Anthropic Claude
mangodesk eval --env airline --model claude-3-5-sonnet --model-provider anthropic

# Full configuration
mangodesk eval --env retail \
    --model gpt-4o --model-provider openai \
    --user-model gpt-4o --user-strategy llm \
    --agent-strategy tool-calling \
    --max-concurrency 10

# Run specific tasks
mangodesk eval --env retail -m gpt-4o -p openai --task-ids 1,2,3

# Multiple trials for pass^k metrics
mangodesk eval --env retail -m gpt-4o -p openai --trials 3
```
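
To compare one model across every environment, the documented flags can be wrapped in a small loop. This is a minimal sketch: `eval_all_envs` and the `MANGODESK_BIN` override are hypothetical names introduced here, and the environment list is copied from this README.

```bash
# Hypothetical override so the commands can be previewed (e.g. MANGODESK_BIN=echo).
MANGODESK_BIN="${MANGODESK_BIN:-mangodesk}"

# Evaluate one model on each environment listed in this README.
# Usage: eval_all_envs <model> <provider> [trials]
eval_all_envs() {
    for env_name in retail airline telecom; do
        "$MANGODESK_BIN" eval --env "$env_name" \
            --model "$1" --model-provider "$2" \
            --trials "${3:-1}" || return 1   # stop at the first failing run
    done
}
```

Setting `MANGODESK_BIN=echo` before calling `eval_all_envs gpt-4o openai 3` prints the three commands instead of running them, which is a cheap way to check the arguments first.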

### Configuration

```bash
# Set a configuration value
mangodesk set KEY=value

# View current configuration
mangodesk config
```

**Configuration Keys:**
- `MANGO_API_KEY` - MangoDesk API key for cloud upload
- `OPENAI_API_KEY` - OpenAI API key
- `ANTHROPIC_API_KEY` - Anthropic API key
- `TAU_BENCH_PATH` - Path to tau-bench installation
- `API_URL` - MangoDesk API URL (for self-hosted deployments)
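
If the API keys are already exported in your shell, they can be copied into the MangoDesk configuration in one pass. The sketch below uses only the documented `mangodesk set KEY=value` form. `sync_keys_from_env` and the `MANGODESK_BIN` override are hypothetical helpers, not part of the CLI.

```bash
# Hypothetical override so the commands can be previewed (e.g. MANGODESK_BIN=echo).
MANGODESK_BIN="${MANGODESK_BIN:-mangodesk}"

# Copy API keys that are already set in the shell into MangoDesk's config.
sync_keys_from_env() {
    for key in MANGO_API_KEY OPENAI_API_KEY ANTHROPIC_API_KEY; do
        # Look up the variable named by $key; eval is safe here because the names are fixed.
        eval "value=\${$key:-}"
        if [ -n "$value" ]; then
            "$MANGODESK_BIN" set "$key=$value"
        else
            echo "skipping $key: not set in the environment" >&2
        fi
    done
}
```

Keys that are missing from the environment are reported on stderr and left untouched, so any values already stored are not overwritten with empty strings.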

## tau-bench Environments

MangoDesk supports all tau-bench environments:

- **retail** - E-commerce customer service
- **airline** - Flight booking and management
- **telecom** - Telecommunications support

See the [tau-bench repository](https://github.com/sierra-research/tau-bench) for environment details.

## License

Proprietary - Copyright © MangoDesk Inc. All rights reserved.
