Simulator Report

Metrics
Total Conversations
-
Avg Turns per Conversation
-
Performance Metrics

How Scores Are Computed

Conversations
Columns: chat id, scenario id, goal completion score, final score, status

goal completion score: indicates whether the agent has completed the user's goal (ranges from 0 to 1).

final score: a weighted sum of turn_success_ratio and goal_completion_score (goal_completion_weight = 0.25, turn_success_ratio_weight = 0.75). Ranges from 0 to 1.

status: the status of the conversation, based on final_score:

done: final_score == 1.0 - perfect performance with no agent behavior failures and goal completed.

partial failure: 0.6 < final_score < 1.0 - some agent behavior failures or goal not completed, but performance is acceptable.

failed: final_score ≤ 0.6 - too many agent behavior failures or goal not completed, indicating poor performance.
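
The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's actual code: the function names compute_final_score and status_from_score are hypothetical, the input range check is an added assumption, and the tolerance used when comparing against 1.0 is a guess at how floating-point rounding might be handled. Only the weights (0.25 and 0.75) and the 0.6 threshold come from the report itself.

```python
import math

# Weights and threshold as stated in the report.
GOAL_COMPLETION_WEIGHT = 0.25
TURN_SUCCESS_RATIO_WEIGHT = 0.75
FAILURE_THRESHOLD = 0.6


def compute_final_score(goal_completion_score: float, turn_success_ratio: float) -> float:
    """Weighted sum of the two component scores (hypothetical helper)."""
    # Assumption: both inputs are expected to lie in [0, 1], as the report states.
    for name, value in (
        ("goal_completion_score", goal_completion_score),
        ("turn_success_ratio", turn_success_ratio),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return (
        GOAL_COMPLETION_WEIGHT * goal_completion_score
        + TURN_SUCCESS_RATIO_WEIGHT * turn_success_ratio
    )


def status_from_score(final_score: float) -> str:
    """Map a final score to one of the three statuses (hypothetical helper)."""
    # Assumption: a small tolerance stands in for the exact `== 1.0` check.
    if math.isclose(final_score, 1.0):
        return "done"
    if final_score > FAILURE_THRESHOLD:
        return "partial failure"
    return "failed"


if __name__ == "__main__":
    score = compute_final_score(goal_completion_score=1.0, turn_success_ratio=0.5)
    print(score, status_from_score(score))
```

Because turn_success_ratio carries three times the weight of goal_completion_score, a conversation that reaches the goal but fails on half of its turns scores 0.25 + 0.375 = 0.625, which lands just above the 0.6 threshold as a partial failure.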