EvalVault Analysis Report
Run ID: run-week4-current
Generated: 2025-12-29 20:12:49
Executive Summary
NLP Analysis
Text Statistics (Questions)
Total Words: 103
Total Sentences: 20
Avg Word Length: 3.7
Vocabulary Diversity: 80.6%
Question Type Distribution
| Type | Count | Percentage |
| --- | --- | --- |
| Factual | 16 | 80.0% |
| Comparative | 2 | 10.0% |
| Reasoning | 2 | 10.0% |
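The text statistics and the type distribution above can be reproduced along the following lines. This is a minimal sketch, not EvalVault's implementation: the helper names `text_stats` and `type_distribution` are hypothetical, tokens are assumed to be whitespace-separated, and vocabulary diversity is assumed to be the ratio of unique tokens to total tokens (83 unique tokens out of 103 would give the reported 80.6%).

```python
from collections import Counter


def text_stats(questions):
    """Whitespace-token statistics over a list of question strings."""
    words = [w for q in questions for w in q.split()]
    total = len(words)
    return {
        "total_words": total,
        "avg_word_length": sum(len(w) for w in words) / total if total else 0.0,
        # Assumed definition: unique tokens divided by all tokens.
        "vocabulary_diversity": len(set(words)) / total if total else 0.0,
    }


def type_distribution(labels):
    """Return (label, count, percentage) rows, most frequent first."""
    counts = Counter(labels)
    n = len(labels)
    return [(label, c, round(100 * c / n, 1)) for label, c in counts.most_common()]


# Labels matching the counts in the table above (16 / 2 / 2 out of 20 questions).
labels = ["Factual"] * 16 + ["Comparative"] * 2 + ["Reasoning"] * 2
rows = type_distribution(labels)
for label, count, pct in rows:
    print(f"| {label} | {count} | {pct}% |")
```

How questions are actually classified into Factual, Comparative, or Reasoning is not stated in this report, so the sketch takes the labels as given.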
Top Keywords
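얼마인가요 (how much is it), 무엇인가요 (what is it), 따라 (depending on), 보험료가 (insurance premium), 또는 (or), 종신보험의 (of whole life insurance), 이상 (or more), 있습니다 (there is), 가능합니다 (is possible), 받을 (to receive)
NLP Insights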
High vocabulary diversity in questions
Questions are short and concise
Dominant question type: factual (80%)
Top keywords: 얼마인가요 (how much is it), 무엇인가요 (what is it), 따라 (depending on)
Causal Analysis
Significant Factor Impacts
| Factor | Metric | Direction | Strength | Correlation |
| --- | --- | --- | --- | --- |
| context_length | faithfulness | ↑ positive | strong | 0.526 |
| context_length | context_precision | ↑ positive | strong | 0.534 |
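A correlation column like the one above can be computed with a plain Pearson coefficient. The sketch below is illustrative only: the report does not say which correlation measure EvalVault uses, the helper names `pearson_r` and `describe` are hypothetical, the 0.5 cutoff for "strong" is an assumption chosen to be consistent with the labels in the table, and the sample data is made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation; treat as none
    return cov / math.sqrt(var_x * var_y)


def describe(r, strong=0.5):
    """Map a coefficient to (direction, strength); the cutoff is an assumption."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    strength = "strong" if abs(r) >= strong else "weak"
    return direction, strength


# Made-up sample: context lengths (in tokens) and a per-question metric score.
context_lengths = [120, 340, 210, 560, 430]
faithfulness = [0.55, 0.80, 0.60, 0.85, 0.70]
r = pearson_r(context_lengths, faithfulness)
print(f"r={r:.3f}", describe(r))
```

A coefficient of this kind measures association only. With 20 questions in the run, it is worth checking the direction of the effect on a larger sample before acting on it.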
Strong Causal Relationships
context_length → faithfulness: Higher values increase scores (confidence: 0.82)
context_length → context_precision: Higher values increase scores (confidence: 0.83)
Root Cause Analysis
faithfulness: Primary causes - context_length
- context_length positively affects faithfulness (r=0.53)
context_precision: Primary causes - context_length
- context_length positively affects context_precision (r=0.53)
Recommended Interventions
🔴 faithfulness: Provide more comprehensive context information
- Expected: More detailed contexts may improve faithfulness
🔴 context_precision: Provide more comprehensive context information
- Expected: More detailed contexts may improve context_precision
Causal Insights
Found 2 significant factor-metric relationships out of 21 analyzed
Identified 2 strong causal relationships with confidence > 0.7
Most impactful factor: context_length on context_precision (r=0.53)
Most common root cause across metrics: context_length (appears in 2 metrics)
Factors with no significant impact: context_length, question_length, question_complexity
Recommendations
No critical issues found. Continue monitoring evaluation metrics.
---
*Report generated by EvalVault on 2025-12-29 20:12:49*