Metadata-Version: 2.3
Name: pipecat-dashscope
Version: 0.2.0
Summary: DashScope integration for Pipecat
Requires-Dist: dashscope>=1.24.0
Requires-Dist: pipecat-ai>=0.0.80
Requires-Dist: websockets>=13.1,<16
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# pipecat-dashscope

[中文文档](README.zh-CN.md)

`pipecat-dashscope` provides native DashScope service integrations for Pipecat.
It talks to DashScope through the native SDK rather than through OpenAI-compatible endpoints.

## Why Native DashScope APIs

- Pipecat pipelines are latency-sensitive and depend on realtime streaming/event semantics.
- Chat Completions and Responses APIs are typically not sufficient for low-latency turn handling in voice agents.
- Native DashScope SDK integrations keep behavior aligned with DashScope protocol families (`Generation`, `MultiModalConversation`, `tts_v2`, and `qwen_tts_realtime`).
- Use `examples/realtime_api_check.py` to verify that your endpoint supports the Realtime API before running voice pipelines.

## Features

- Native DashScope `Generation` LLM integration (`DashScopeGenerationLLMService`)
- Native DashScope `MultiModalConversation` LLM integration (`DashScopeMultiModalLLMService`)
- Native DashScope ASR integration for segmented STT
- Native DashScope `tts_v2` TTS integration
- Native DashScope `qwen_tts_realtime` integration
- Native DashScope `MultiModalConversation` TTS integration
- Runtime-updatable Pipecat service settings
- Compatible with Pipecat `LLMContext` and pipeline processors
- Environment-variable based configuration for DashScope credentials

## Installation

```bash
uv add pipecat-dashscope
```

## Usage

The recommended starting point is the end-to-end voice bot in `examples/bot.py`, which wires the following pipeline:

`transport.input() -> DashScopeSTTService -> user_aggregator -> DashScope LLM -> DashScope TTS -> transport.output() -> assistant_aggregator`

Set your API key and run a preset:

```bash
export DASHSCOPE_API_KEY="your_api_key"
uv run --dev examples/bot.py --preset default
```

Available presets in `examples/bot.py`:

- `default`: STT `fun-asr-flash-8k-realtime`, LLM `generation/qwen3-max`, TTS `v2/cosyvoice-v3-flash`, voice `longanyang`
- `fast`: STT `fun-asr-flash-8k-realtime`, LLM `generation/qwen-plus`, TTS `v2/cosyvoice-v2`, voice `longxiaochun_v2`
- `quality`: STT `fun-asr-flash-8k-realtime`, LLM `generation/qwen3-max`, TTS `multimodal/qwen-tts`, voice `Cherry`
- `realtime`: STT `fun-asr-flash-8k-realtime`, LLM `multimodal/qwen3.6-flash-2026-04-16`, TTS `qwen-realtime/qwen-tts-realtime`, voice `Cherry`
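The preset table above can also be read as a plain mapping. The dict layout below is an illustration that mirrors the listed values, not the actual structure used inside `examples/bot.py`:

```python
# Preset profiles as listed above. Service families are "generation",
# "multimodal", "v2", and "qwen-realtime". Layout is illustrative only.
PRESETS = {
    "default": {
        "stt_model": "fun-asr-flash-8k-realtime",
        "llm_service": "generation", "llm_model": "qwen3-max",
        "tts_service": "v2", "tts_model": "cosyvoice-v3-flash",
        "tts_voice": "longanyang",
    },
    "fast": {
        "stt_model": "fun-asr-flash-8k-realtime",
        "llm_service": "generation", "llm_model": "qwen-plus",
        "tts_service": "v2", "tts_model": "cosyvoice-v2",
        "tts_voice": "longxiaochun_v2",
    },
    "quality": {
        "stt_model": "fun-asr-flash-8k-realtime",
        "llm_service": "generation", "llm_model": "qwen3-max",
        "tts_service": "multimodal", "tts_model": "qwen-tts",
        "tts_voice": "Cherry",
    },
    "realtime": {
        "stt_model": "fun-asr-flash-8k-realtime",
        "llm_service": "multimodal", "llm_model": "qwen3.6-flash-2026-04-16",
        "tts_service": "qwen-realtime", "tts_model": "qwen-tts-realtime",
        "tts_voice": "Cherry",
    },
}
```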

Override any preset setting with CLI options:

```bash
uv run --dev examples/bot.py \
  --preset realtime \
  --llm-service multimodal \
  --llm-model qwen3.6-flash-2026-04-16 \
  --tts-service qwen-realtime \
  --tts-model qwen-tts-realtime \
  --tts-voice Cherry
```

Supported override flags:

- `--stt-model`
- `--llm-model`
- `--llm-service` (`generation`, `multimodal`)
- `--tts-service` (`v2`, `qwen-realtime`, `multimodal`)
- `--tts-model`
- `--tts-voice`
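A minimal sketch of how this flag surface could be parsed with `argparse`. The actual parser in `examples/bot.py` may differ; the defaults and the idea of leaving unset flags as `None` (so preset values act as fallbacks) are illustrative:

```python
import argparse

# Mirrors the override flags described above; help text and defaults are illustrative.
parser = argparse.ArgumentParser(description="pipecat-dashscope example bot (sketch)")
parser.add_argument("--preset", default="default",
                    choices=["default", "fast", "quality", "realtime"])
parser.add_argument("--stt-model")
parser.add_argument("--llm-model")
parser.add_argument("--llm-service", choices=["generation", "multimodal"])
parser.add_argument("--tts-service", choices=["v2", "qwen-realtime", "multimodal"])
parser.add_argument("--tts-model")
parser.add_argument("--tts-voice")

args = parser.parse_args([
    "--preset", "realtime",
    "--llm-service", "multimodal",
    "--tts-voice", "Cherry",
])
# Flags not given on the command line stay None, so the chosen preset's
# values can be used as fallbacks when building the pipeline.
```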

## Configuration

- `DASHSCOPE_API_KEY`: required for every service unless an explicit `api_key=` is passed
- `DASHSCOPE_BASE_URL`: optional override for both `DashScopeGenerationLLMService` and `DashScopeMultiModalLLMService`

Default LLM API base URL:

```text
https://dashscope.aliyuncs.com/api/v1
```

Notes:

- `DashScopeGenerationLLMService` uses DashScope native async `AioGeneration`.
- `DashScopeMultiModalLLMService` uses DashScope native async `AioMultiModalConversation`.
- `DashScopeSTTService` is a segmented STT service and expects VAD in the Pipecat pipeline.
- `DashScopeTTSV2Service` uses `dashscope.audio.tts_v2.SpeechSynthesizer`.
- `DashScopeQwenRealtimeTTSService` uses `dashscope.audio.qwen_tts_realtime`.
- `DashScopeMultiModalTTSService` uses `dashscope.MultiModalConversation` with TTS-capable Qwen models.
- All DashScope TTS services require explicit `model` and `voice` values (no built-in runtime defaults).
- Keep these three TTS API families separate when extending the package; do not merge them into a single service unless DashScope unifies the underlying protocol.

## Example: `examples/bot.py`

`examples/bot.py` is an end-to-end Pipecat voice-agent demo that wires:

- `DashScopeSTTService` (speech to text)
- `DashScopeGenerationLLMService` or `DashScopeMultiModalLLMService`
- `DashScopeTTSV2Service`, `DashScopeQwenRealtimeTTSService`, or `DashScopeMultiModalTTSService`

The script provides preset pipeline profiles (`default`, `fast`, `quality`, `realtime`) and supports overriding STT/LLM/TTS model, service family, and voice via CLI options.

Run it from this package directory:

```bash
uv run --dev examples/bot.py --preset quality
```

The example always uses the SmallWebRTC transport and forwards other Pipecat runner options as needed.

Requirements:

- Set `DASHSCOPE_API_KEY` in your environment.
- Ensure `pipecat-ai` runner extras are installed (the package `dev` dependency group includes them).

## Testing

- Prefer unit tests around request shaping, settings translation, and audio payload decoding.
- Avoid live DashScope network tests in the default test path.
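A sketch of the kind of request-shaping unit test suggested above. The helper and its input format are hypothetical; the output format matches the role/content message dicts that DashScope `Generation` accepts:

```python
def shape_generation_messages(context_messages):
    """Hypothetical request-shaping helper: keep only the fields the
    DashScope Generation API expects (role and a string content)."""
    shaped = []
    for msg in context_messages:
        content = msg["content"]
        if isinstance(content, list):  # flatten structured content parts to text
            content = "".join(part.get("text", "") for part in content)
        shaped.append({"role": msg["role"], "content": content})
    return shaped

def test_shape_generation_messages():
    context = [
        {"role": "system", "content": "You are helpful.", "name": "sys"},
        {"role": "user", "content": [{"type": "text", "text": "Hi"}]},
    ]
    assert shape_generation_messages(context) == [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hi"},
    ]

test_shape_generation_messages()  # no network, no DashScope credentials needed
```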
