Metadata-Version: 2.4
Name: codebase-stats
Version: 0.0.2
Summary: Comprehensive codebase analysis library with coverage, metrics, and test duration analysis
Author-email: Your Name <your.email@example.com>
License: MIT
Keywords: coverage,metrics,analysis,testing,code-quality
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: radon>=6.0.0
Requires-Dist: pytest-cov>=4.0
Requires-Dist: pytest-json-report>=1.5.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-mock>=3.0; extra == "dev"
Requires-Dist: ruff>=0.3.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: bandit>=1.7.0; extra == "dev"
Requires-Dist: pip-audit>=2.6.0; extra == "dev"

# Codebase Stats

[![CI/CD](https://github.com/brunolnetto/codebase-stats/actions/workflows/ci.yml/badge.svg)](https://github.com/brunolnetto/codebase-stats/actions)
[![Coverage](https://codecov.io/gh/brunolnetto/codebase-stats/branch/master/graph/badge.svg)](https://codecov.io/gh/brunolnetto/codebase-stats)
[![Python 3.9+](https://img.shields.io/badge/python-3.9%2B-blue)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

**Code quality metrics analysis library for Python projects.** It analyzes coverage, complexity, maintainability, and code structure, and provides automated reporting and quality gates.

## Quick Start

### Installation

```bash
# Using uv (recommended)
uv pip install codebase-stats

# Or with pip
pip install codebase-stats
```

### Basic Usage

```python
from codebase_stats import CodebaseStatsReporter

# Generate comprehensive metrics report
reporter = CodebaseStatsReporter(
    coverage_file='coverage.json',
    report_file='report.json',
    radon_root='src',
    fs_root='src',
    tree_root='src'
)

# Save report to file
reporter.save_report('metrics_report.txt', include_coverage=True, include_complexity=True)
```

### Command Line Interface

```bash
# Full analysis with all metrics
python cli.py coverage.json --radon-root src --fs-root src

# Show specific sections
python cli.py coverage.json --show coverage complexity mi

# List low-coverage files
python cli.py coverage.json --show list --threshold 80 --top 20

# Help
python cli.py --help
```

## Features

### 📊 Coverage Analysis
- **Statement & Branch Coverage**: Full coverage metrics from pytest-cov
- **Coverage Distribution**: Histogram visualization with percentiles (Q1, Q2, Q3, p90, p95, p99)
- **Low-Coverage Detection**: Identify files below thresholds with automatic prioritization
- **Pragma Tracking**: Track `# pragma: no cover` usage for documentation

### 🔍 Code Complexity
- **Cyclomatic Complexity (CC)**: Radon integration for function complexity analysis
  - Grade A: 1-5 (ideal)
  - Grade B: 6-10 (acceptable)
  - Grade C or worse: 11+ (refactor recommended)
- **Maintainability Index (MI)**: Code readability/maintainability scoring (0-100)
- **Halstead Metrics**: Bug estimation and code volume analysis
- **Comment Ratios**: Documentation density analysis
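
As a rough illustration of where a Halstead bug estimate comes from: the classic formulation derives a volume `V = N * log2(n)` from operator and operand counts, then estimates delivered bugs as `V / 3000`. The sketch below is standalone. The function name and the sample counts are hypothetical and are not part of the codebase-stats API.

```python
import math


def halstead_bug_estimate(h1: int, h2: int, n1: int, n2: int) -> float:
    """Estimate delivered bugs from Halstead counts.

    h1/h2: distinct operators/operands; n1/n2: total operators/operands.
    """
    vocabulary = h1 + h2          # n
    length = n1 + n2              # N
    if vocabulary == 0:
        return 0.0                # empty file: nothing to estimate
    volume = length * math.log2(vocabulary)
    return volume / 3000


# Hypothetical counts for a small function: V = 24 * log2(8) = 72
bugs = halstead_bug_estimate(h1=4, h2=4, n1=12, n2=12)
print(f"{bugs:.3f}")  # 72 / 3000 = 0.024
```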

### 📈 Code Metrics
- **File Size Distribution**: Line count per file with outlier detection
- **Directory Structure Analysis**: Module organization and hierarchy
- **Test Duration Distribution**: Identify slow tests
- **Quality Gates**: Automated threshold validation

### 📋 Reporting
- **Histogram Visualization**: ASCII histograms with customizable bins and scaling
- **Blame Sections**: Highlight problematic files (Q3 + 1.5×IQR threshold)
- **Percentile Analysis**: Q1, median, Q3, p90, p95, p99
- **Structured Output**: Markdown, text, or programmatic JSON
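
The blame threshold above (Q3 + 1.5×IQR) is the upper Tukey fence. A minimal sketch of the calculation, using only the standard library, might look like this. The helper name and the sample data are hypothetical, and the quantile method used by codebase-stats itself may differ.

```python
import statistics


def blame_threshold(values: list[float]) -> float:
    """Return the upper Tukey fence, Q3 + 1.5 * IQR (needs at least 2 values)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


# Hypothetical per-file line counts
sizes = {"a.py": 100, "b.py": 120, "c.py": 130, "d.py": 150, "e.py": 900}
limit = blame_threshold(list(sizes.values()))
outliers = [path for path, lines in sizes.items() if lines > limit]
print(limit, outliers)
```

Here Q1 is 120 and Q3 is 150, so the fence sits at 195 and only `e.py` is flagged.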

## Documentation

- **[Architecture Documentation](docs/ARCHITECTURE.md)** - System design, data flows, and component details
- **[Development Roadmap](development/ROADMAP.md)** - Release planning and feature roadmap
- **[Architecture Decision Records](development/adr)** - Design decisions and rationales
- **[Governance & Quality Gates](GOVERNANCE.md)** - Development workflow and CI/CD policies

## API Reference

### Core Classes

#### `CodebaseStatsReporter`
Main interface for generating comprehensive metrics reports.

```python
from codebase_stats import CodebaseStatsReporter

reporter = CodebaseStatsReporter(
    coverage_file='coverage.json',  # str: path to coverage.json (required)
    report_file=None,               # str, optional: path to pytest report.json
    radon_root=None,                # str, optional: root for CC/MI/comment analysis
    fs_root=None,                   # str, optional: root for file size analysis
    tree_root=None,                 # str, optional: root for structure analysis
)

# Methods
reporter.save_report('metrics_report.txt', include_coverage=True, include_complexity=True)
stats = reporter.get_stats()      # Returns the raw metrics dictionary (dict)
report_text = str(reporter)       # Generate text report
```

### Data Structure

#### Coverage Stats Dictionary
```python
stats = {
    "coverages_sorted": [float],                    # All file coverage %
    "proj_pct": float,                              # Project-wide coverage %
    "proj_total": int,                              # Total lines
    "proj_covered": int,                            # Covered lines
    "file_stats": [
        {
            "pct": float,                           # File coverage %
            "path": str,                            # File path
            "missing_count": int,                   # Missing line count
            "missing_lines": [int],                 # Missing line numbers
            "cc_avg": float,                        # Avg cyclomatic complexity
            "mi": float,                            # Maintainability index
            "comment_ratio": float,                 # Comment/SLOC ratio
            "hal_bugs": float,                      # Halstead bug estimate
            "size_lines": int,                      # File line count
            ...
        }
    ]
}
```
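
To show how this dictionary can be consumed, the sketch below lists the files under a coverage threshold, worst first. It relies only on the `file_stats`, `path`, and `pct` keys documented above. The helper name and the hand-written sample are hypothetical.

```python
def low_coverage_files(stats: dict, threshold: float = 80.0, top: int = 20) -> list[tuple[str, float]]:
    """Return up to `top` (path, pct) pairs below `threshold`, lowest coverage first."""
    below = [(f["path"], f["pct"]) for f in stats["file_stats"] if f["pct"] < threshold]
    return sorted(below, key=lambda item: item[1])[:top]


# Hand-written sample that mimics the documented structure
sample = {
    "proj_pct": 75.0,
    "file_stats": [
        {"path": "pkg/a.py", "pct": 95.0, "missing_count": 1},
        {"path": "pkg/b.py", "pct": 40.0, "missing_count": 12},
        {"path": "pkg/c.py", "pct": 72.5, "missing_count": 6},
    ],
}

for path, pct in low_coverage_files(sample):
    print(f"{pct:5.1f}%  {path}")
```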

### Module APIs

#### `coverage.py` - Coverage Analysis
```python
from codebase_stats.coverage import load_coverage, precompute_coverage_stats

stats = load_coverage('coverage.json')
stats = precompute_coverage_stats(stats, radon_root='src')
```

#### `metrics.py` - Complexity Metrics
```python
from codebase_stats.metrics import get_cyclomatic_complexity, get_maintainability, get_comments_ratio

cc = get_cyclomatic_complexity('file.py')
mi = get_maintainability('file.py')
ratio = get_comments_ratio('file.py')
```

#### `radon.py` - Radon Integration
```python
from codebase_stats.radon import get_cc_list, get_mi_list, get_metrics

cc_data = get_cc_list('src')
mi_data = get_mi_list('src')
hal_data = get_metrics('src')
```

#### `reporter.py` - Report Generation
```python
from codebase_stats.reporter import CodebaseStatsReporter

reporter = CodebaseStatsReporter(...)
reporter.save_report('output.txt')  # Save formatted report
```

#### `utils.py` - Utilities
```python
from codebase_stats.utils import ascii_histogram, percentile, format_value

hist_str = ascii_histogram(data, bins=10, width=80)
p95 = percentile(data, 0.95)
formatted = format_value(value, decimals=2)
```

## Quality Gates

All code must meet these thresholds:

| Metric | Threshold | Rationale |
|--------|-----------|-----------|
| **Coverage** | 100% | All source code must be tested |
| **Cyclomatic Complexity** | ≤10 average | Radon grade B or better |
| **Maintainability Index** | ≥50 | Well above Radon's grade A floor (20) |
| **File Size** | ≤400 lines | Modules remain manageable |
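
One way to apply these thresholds to a stats dictionary is sketched below. It is not the project's own gate implementation: the function name is hypothetical, and checking MI and file size per file (rather than project-wide) is an assumption.

```python
def check_quality_gates(stats: dict) -> list[str]:
    """Return gate violations for a stats dict; an empty list means all gates pass."""
    failures = []
    if stats["proj_pct"] < 100.0:
        failures.append(f"coverage {stats['proj_pct']:.1f}% < 100%")

    files = stats["file_stats"]
    cc_values = [f["cc_avg"] for f in files if f.get("cc_avg") is not None]
    if cc_values and sum(cc_values) / len(cc_values) > 10:
        failures.append("average cyclomatic complexity > 10")

    # Assumption: MI and file size are checked file by file.
    for f in files:
        if f.get("mi") is not None and f["mi"] < 50:
            failures.append(f"{f['path']}: MI {f['mi']:.1f} < 50")
        if f.get("size_lines", 0) > 400:
            failures.append(f"{f['path']}: {f['size_lines']} lines > 400")
    return failures


# Hypothetical stats: coverage, MI, and size each fail once
sample = {
    "proj_pct": 98.0,
    "file_stats": [
        {"path": "pkg/a.py", "cc_avg": 4.0, "mi": 70.0, "size_lines": 120},
        {"path": "pkg/b.py", "cc_avg": 12.0, "mi": 45.0, "size_lines": 520},
    ],
}
for failure in check_quality_gates(sample):
    print(failure)
```

Returning a list of messages instead of raising on the first failure lets a CI job report every violated gate in one run.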

## Development Workflow

### Setup
```bash
# Clone repository
git clone https://github.com/brunolnetto/codebase-stats.git
cd codebase-stats

# Install with dev dependencies
uv venv
uv pip install -e ".[dev]"

# Activate environment
source .venv/bin/activate
```

### Testing
```bash
# Run all tests
pytest

# With coverage
pytest --cov=codebase_stats --cov-report=term-missing

# Specific test file
pytest tests/test_coverage.py -v
```

### Code Quality
```bash
# Linting
ruff check codebase_stats/ tests/

# Format check
ruff format --check codebase_stats/ tests/

# Type checking
mypy codebase_stats/

# All quality checks
make quality
```

### Commits & PRs

This project uses the **GitFlow** workflow with **conventional commits**:

```bash
# Feature branches
git checkout -b feat/feature-name
# Fix branches  
git checkout -b fix/issue-name
# Chore/documentation
git checkout -b chore/update-name

# Commit format: <type>(<scope>): <description>
git commit -m "feat(coverage): add pragma tracking"
git commit -m "fix(radon): handle empty files gracefully"
git commit -m "docs(readme): add API reference"
```
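
The commit format above can be checked mechanically, for example in a local hook. The sketch below is one possible check, not project tooling. The allowed types are an assumption limited to the ones shown above (`feat`, `fix`, `docs`, `chore`), and the helper name is hypothetical.

```python
import re

# Assumption: only the commit types that appear in the examples are allowed.
COMMIT_RE = re.compile(r"^(feat|fix|docs|chore)\(([a-z0-9-]+)\): (\S.*)$")


def is_conventional(message: str) -> bool:
    """Check the first line of a commit message against <type>(<scope>): <description>."""
    first_line = message.splitlines()[0] if message else ""
    return COMMIT_RE.match(first_line) is not None


print(is_conventional("feat(coverage): add pragma tracking"))  # True
print(is_conventional("added some stuff"))                     # False
```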

See [Development Policy](GOVERNANCE.md) for full workflow details.

## Architecture Highlights

**Data Flow:**
```
Raw Input (coverage.json, report.json)
    ↓
load_coverage() → Enrich with Radon metrics
    ↓
precompute_coverage_stats() → Compute distributions
    ↓
Display Functions → Histograms, tables, blame sections
    ↓
Reporter → Formatted text/markdown output
```

**Key Design Patterns:**
- **Single Responsibility**: Each module handles one analysis type
- **Composition**: Reporter combines multiple analysis modules
- **Lazy Evaluation**: Radon metrics computed on-demand
- **Immutable Data**: Stats dicts treated as read-only
- **Histogram Abstraction**: Consistent visualization across metrics

See [Architecture Documentation](docs/ARCHITECTURE.md) for detailed system design.

## Examples

### Generate Full Metrics Report
```bash
# After running tests
pytest --cov=src --cov-report=json

# Generate and save report
python cli.py coverage.json \
  --radon-root src \
  --fs-root src \
  --tree-root src \
  --report metrics_report.txt
```

### Analyze Coverage Gaps
```bash
python cli.py coverage.json --show coverage gaps
```

### Monitor Complexity Trends
```python
from codebase_stats import CodebaseStatsReporter

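def check_complexity_trend(limit=10):
    reporter = CodebaseStatsReporter('coverage.json', radon_root='src')
    stats = reporter.get_stats()

    # Skip files that carry no CC value, and avoid dividing by zero
    cc_values = [f['cc_avg'] for f in stats['file_stats'] if f.get('cc_avg') is not None]
    if not cc_values:
        print("No complexity data available")
        return
    avg_cc = sum(cc_values) / len(cc_values)

    if avg_cc > limit:
        print(f"⚠️  Average complexity above {limit}: {avg_cc:.2f}")
    else:
        print(f"✅ Complexity healthy: {avg_cc:.2f}")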
```

## Contributing

1. See [Development Roadmap](development/ROADMAP.md) for planned work
2. Submit PRs against `develop` branch (GitFlow)
3. All PRs require passing quality gates and 100% test coverage
4. Follow [Governance](GOVERNANCE.md) for workflow details

## License

MIT License - see LICENSE file for details

## Metrics Definitions

| Metric | Range | Interpretation |
|--------|-------|-----------------|
| **Coverage** | 0-100% | Percentage of code lines executed by tests |
| **CC** | 1-50+ | Function branching complexity; Radon grades: A 1-5, B 6-10, C 11-20, D 21-30, E 31-40, F 41+ |
| **MI** | 0-100 | Code readability/maintainability; A≥20, B≥10, C≥0 |
| **Comment Ratio** | 0-100% | Percentage of code that is comments/docstrings |
| **Halstead Bugs** | 0-N | Estimated number of bugs; lower is better |
| **File Size** | Lines | Module size; target ≤400 for maintainability |
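
For reference, the letter grades can be derived from raw scores with a small lookup. This sketch follows Radon's documented cyclomatic complexity bands (A 1-5, B 6-10, C 11-20, D 21-30, E 31-40, F 41+) and the MI bands from the table above. The function names are hypothetical and are not part of codebase-stats or Radon.

```python
def cc_grade(cc: float) -> str:
    """Map a cyclomatic complexity score to a letter grade (A best, F worst)."""
    for upper, grade in ((5, "A"), (10, "B"), (20, "C"), (30, "D"), (40, "E")):
        if cc <= upper:
            return grade
    return "F"


def mi_grade(mi: float) -> str:
    """Map a maintainability index (0-100) to a letter grade using the table's bands."""
    if mi >= 20:
        return "A"
    if mi >= 10:
        return "B"
    return "C"


print(cc_grade(7), mi_grade(55))  # B A
```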

---

**Status**: Alpha · **Latest Release**: 0.0.2 · **Python**: 3.9+
