Metadata-Version: 2.4
Name: diff-code-change-range
Version: 0.0.1
Summary: Extract affected code structures from git diff output for Java, Kotlin, and Python files
Author-email: OpenSpec <openspec@example.com>
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: unidiff>=0.7.0
Requires-Dist: tree-sitter>=0.20.0
Requires-Dist: tree-sitter-java>=0.20.0
Requires-Dist: tree-sitter-kotlin>=0.3.0
Requires-Dist: tree-sitter-python>=0.21.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"

# diff-code-change-range

Extract affected code structures from git diff output for Java, Kotlin, and Python files.

## Overview

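This tool parses `git diff --full-index -U999999` output, reconstructs the complete before and after source of each changed file, and identifies which code structures (classes, methods, functions, members) the changes touch. The `-U999999` option asks git for enough context lines that each hunk contains the whole file, which is why both versions can be recovered from the diff alone. The result is structured YAML listing the affected code structures for the before and after versions.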

Additionally, the tool can extract **reference relationships** between affected nodes (method calls, field accesses, type references, etc.) to support impact analysis and precise code review.
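
With a full-context diff, the before and after sources can in principle be rebuilt from the hunk lines alone. The following is a minimal sketch of that idea using plain strings. `split_hunk_body` is a hypothetical helper written for illustration, not part of this package, which lists `unidiff` as its diff-parsing dependency.

```python
def split_hunk_body(hunk_lines):
    """Rebuild before/after file contents from full-context hunk lines.

    Each line keeps its one-character diff prefix:
    ' ' (context), '-' (removed), '+' (added).
    """
    before, after = [], []
    for line in hunk_lines:
        prefix, text = line[:1], line[1:]
        if prefix == " " or line == "":
            # Context lines (including stripped empty ones) exist in both versions.
            before.append(text)
            after.append(text)
        elif prefix == "-":
            before.append(text)
        elif prefix == "+":
            after.append(text)
        # Other markers such as "\ No newline at end of file" are ignored.
    return "\n".join(before), "\n".join(after)


hunk = [
    " public class Calculator {",
    "     private int result;",
    "-    public void add(int a) {",
    "-        result += a;",
    "+    public void add(int a, int b) {",
    "+        result = a + b;",
    "     }",
    " }",
]
before_src, after_src = split_hunk_body(hunk)
```

Removed lines end up only in `before_src` and added lines only in `after_src`, so each version can then be parsed independently.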

## Installation

### From PyPI

```bash
pip install diff-code-change-range
```

### From Source

```bash
git clone <repository>
cd diff-code-change-range
pip install -e .
```

### Requirements

- Python 3.9 or higher
- Dependencies:
  - `unidiff>=0.7.0` - Parse unified diff format
  - `tree-sitter>=0.20.0` - Parse source code
  - `tree-sitter-java>=0.20.0` - Java grammar for tree-sitter
  - `tree-sitter-kotlin>=0.3.0` - Kotlin grammar for tree-sitter
  - `tree-sitter-python>=0.21.0` - Python grammar for tree-sitter
  - `pyyaml>=6.0` - YAML output

## Usage

### Basic Usage

Read diff from stdin:

```bash
git diff --full-index -U999999 | diff-code-change-range
```

Read diff from file:

```bash
diff-code-change-range path/to/diff.patch
```

### CLI Arguments

```
diff-code-change-range [-h] [-v] [diff_file]

Positional Arguments:
  diff_file         Path to diff file (default: read from stdin)

Optional Arguments:
  -h, --help        Show help message and exit
  -v, --version     Show version and exit
```

### Example Usage

```bash
# Generate diff with full context and pipe to tool
git diff --full-index -U999999 HEAD~1 | diff-code-change-range

# Save output to file
git diff --full-index -U999999 > changes.patch
diff-code-change-range changes.patch > analysis.yaml

# Use as Python module
python -m diff_code_change_range < changes.patch
```
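
The same pipeline can be driven from a script. This is a sketch under the assumption that `git` and the `diff-code-change-range` executable are both on `PATH`. The helper names `git_diff_command` and `analyze_revision` are made up for this example and are not exported by the package.

```python
import subprocess


def git_diff_command(revision=None):
    """Build the full-context git diff command used in the examples above."""
    command = ["git", "diff", "--full-index", "-U999999"]
    if revision is not None:
        command.append(revision)
    return command


def analyze_revision(revision=None, tool="diff-code-change-range"):
    """Pipe git diff output into the CLI and return the YAML text it prints."""
    diff = subprocess.run(
        git_diff_command(revision), capture_output=True, text=True, check=True
    )
    # The CLI reads the diff from stdin when no file argument is given.
    result = subprocess.run(
        [tool], input=diff.stdout, capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `analyze_revision("HEAD~1")` inside a repository should return the same YAML as the first shell example. The text can then be loaded with `yaml.safe_load` from `pyyaml`.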

### Python Example

Input Python file (`calculator.py`, newly added in the diff, so `before` is empty):
```python
class Calculator:
    def __init__(self):
        self.result = 0
    
    @classmethod
    def create(cls):
        return cls()
    
    async def compute(self):
        return self.result

def main():
    calc = Calculator()
```

Output structure:
```yaml
before: []
after:
  - name: calculator.py
    type: file
    line_range: [1, 13]
    children:
      - name: Calculator
        type: class
        line_range: [1, 10]
        children:
          - name: __init__
            type: method
            line_range: [2, 3]
            children:
              - name: result
                type: member
                line_range: [3, 3]
          - name: "@classmethod create"
            type: method
            line_range: [5, 7]
          - name: async compute
            type: method
            line_range: [9, 10]
      - name: main
        type: function
        line_range: [12, 13]
```

### Reference Extraction

The `reference` module can extract relationships between affected code nodes:

```python
from diff_code_change_range.reference import extract_references, AffectedScope, AffectedNode, NodeType

before_code = {
    "com/example/Service.kt": '''
class Service {
    fun process() {
        validate()
    }
    fun validate(): Boolean {
        return true
    }
}
'''
}

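after_code = {...}  # Placeholder: same path-to-source mapping for the after version

scope = AffectedScope(
    before=[...],  # Placeholder: AffectedNode tree for the before version
    after=[...],   # Placeholder: AffectedNode tree for the after version
)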

result = extract_references(before_code, after_code, scope)

# Access references
for ref in result.before_references:
    print(f"{ref.source} -> {ref.target} ({ref.type.value})")

# See what changed
for ref in result.added_references:
    print(f"Added: {ref.source} -> {ref.target}")
```

Supported reference types:
- `method_call` - Method or function invocation
- `field_access` - Field or property access
- `type_reference` - Type usage in declarations
- `instantiation` - Object creation
- `annotation` - Annotation usage
- `inheritance` - Class extends another class
- `implementation` - Class implements interface
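
One way to think about `added_references` is as a set difference between the after and before reference lists. The sketch below illustrates that with a hypothetical `Ref` tuple standing in for the library's reference objects. The `removed` list is an assumption added for symmetry and may not correspond to an attribute of the real result.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's reference objects.
Ref = namedtuple("Ref", ["source", "target", "type"])


def diff_references(before_refs, after_refs):
    """Return (added, removed) references, each sorted for stable output."""
    before_set, after_set = set(before_refs), set(after_refs)
    added = sorted(after_set - before_set)
    removed = sorted(before_set - after_set)
    return added, removed


before_refs = [Ref("Service.process", "Service.validate", "method_call")]
after_refs = [
    Ref("Service.process", "Service.validate", "method_call"),
    Ref("Service.process", "Logger", "type_reference"),
]
added, removed = diff_references(before_refs, after_refs)
```

Here only the new `type_reference` to `Logger` is reported as added, because the `method_call` exists in both versions.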

## Output Format

The tool outputs YAML with `before` and `after` root keys:

```yaml
before:
  - name: src/Calculator.java
    type: file
    line_range: [1, 15]
    children:
      - name: Calculator
        type: class
        line_range: [1, 14]
        children:
          - name: add
            type: method
            line_range: [5, 7]

after:
  - name: src/Calculator.java
    type: file
    line_range: [1, 15]
    children:
      - name: Calculator
        type: class
        line_range: [1, 14]
        children:
          - name: add
            type: method
            line_range: [5, 7]
```

### Node Types

- `file` - Source file
- `class` - Class declaration
- `interface` - Interface declaration
- `object` - Kotlin object declaration
- `enum` - Enum declaration
- `function` - Top-level function (Kotlin, Python)
- `method` - Class method
- `member` - Field/property, class variable, instance variable
- `variable` - Module-level variable (Python)

### Node Fields

- `name` - The name of the code element
- `type` - The type of node (see above)
- `line_range` - Array of [start_line, end_line] (1-based, inclusive)
- `children` - List of child nodes (for container types)
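
Since every node carries the same four fields, the output can be traversed with a short recursive generator. This sketch assumes the YAML has already been loaded into plain dicts and lists (for example with `yaml.safe_load`). `walk` is a helper invented for this example.

```python
def walk(nodes, parents=()):
    """Yield (path, type, (start, end)) for every node, depth first."""
    for node in nodes:
        path = parents + (node["name"],)
        start, end = node["line_range"]  # 1-based, inclusive
        yield " > ".join(path), node["type"], (start, end)
        # `children` is only present on container nodes.
        yield from walk(node.get("children", []), path)


after = [{
    "name": "src/Calculator.java", "type": "file", "line_range": [1, 15],
    "children": [{
        "name": "Calculator", "type": "class", "line_range": [1, 14],
        "children": [
            {"name": "add", "type": "method", "line_range": [5, 7]},
        ],
    }],
}]
rows = list(walk(after))
```

Each row pairs a readable path such as `src/Calculator.java > Calculator > add` with the node type and its line range.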

## Example

### Input Diff

```diff
diff --git a/src/Calculator.java b/src/Calculator.java
index abc123..def456 100644
--- a/src/Calculator.java
+++ b/src/Calculator.java
@@ -1,8 +1,8 @@
 public class Calculator {
     private int result;
     
-    public void add(int a) {
-        result += a;
+    public void add(int a, int b) {
+        result = a + b;
     }
     
     public int getResult() {
```

### Output

```yaml
before:
  - name: src/Calculator.java
    type: file
    line_range: [1, 8]
    children:
      - name: Calculator
        type: class
        line_range: [1, 8]
        children:
          - name: add
            type: method
            line_range: [4, 6]

after:
  - name: src/Calculator.java
    type: file
    line_range: [1, 8]
    children:
      - name: Calculator
        type: class
        line_range: [1, 8]
        children:
          - name: add
            type: method
            line_range: [4, 6]
```
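
In the example above only `add` is reported, because the changed lines fall inside its range. One plausible way to get that result is to keep the nodes whose `line_range` contains at least one changed line. The sketch below shows that overlap test. It is an illustration with hypothetical names and sample ranges, not necessarily how `affected_marker.py` is implemented.

```python
def overlaps(line_range, changed_lines):
    """True if any changed line falls inside the inclusive range."""
    start, end = line_range
    return any(start <= line <= end for line in changed_lines)


def prune_to_affected(nodes, changed_lines):
    """Keep only nodes whose range contains a changed line, recursively."""
    kept = []
    for node in nodes:
        if not overlaps(node["line_range"], changed_lines):
            continue
        pruned = {k: v for k, v in node.items() if k != "children"}
        children = prune_to_affected(node.get("children", []), changed_lines)
        if children:
            pruned["children"] = children
        kept.append(pruned)
    return kept


# Hypothetical full structure tree for the file in the example diff.
full_tree = [{
    "name": "src/Calculator.java", "type": "file", "line_range": [1, 8],
    "children": [{
        "name": "Calculator", "type": "class", "line_range": [1, 8],
        "children": [
            {"name": "result", "type": "member", "line_range": [2, 2]},
            {"name": "add", "type": "method", "line_range": [4, 6]},
            {"name": "getResult", "type": "method", "line_range": [8, 8]},
        ],
    }],
}]
affected = prune_to_affected(full_tree, changed_lines={4, 5})
```

The file and class survive because they enclose the changed lines, while `result` and `getResult` are dropped.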

## Development

### Setup Development Environment

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install in development mode
pip install -e ".[dev]"
```

### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=diff_code_change_range

# Run specific test file
pytest tests/test_diff_parser.py
```

### Project Structure

```
.
├── src/
│   └── diff_code_change_range/
│       ├── __init__.py           # Package exports
│       ├── __main__.py           # Module entry point
│       ├── cli.py                # CLI implementation
│       ├── diff_parser.py        # Diff parsing module
│       ├── structure_extractor.py # Tree-sitter code parsing
│       ├── affected_marker.py    # Affected node detection
│       └── yaml_reporter.py      # YAML output generation
├── tests/
│   ├── fixtures/                 # Test diff files
│   ├── test_diff_parser.py
│   ├── test_structure_extractor.py
│   ├── test_affected_marker.py
│   ├── test_yaml_reporter.py
│   └── test_e2e.py
├── pyproject.toml
├── requirements.txt
├── requirements-dev.txt
└── README.md
```

## Supported Languages

- **Java** (`.java` files) - Full support for classes, interfaces, enums, methods, fields
- **Kotlin** (`.kt` files) - Full support for classes, objects, functions, properties
- **Python** (`.py` files) - Full support for classes, functions, methods, decorators, async functions, module-level and instance variables

Other file types are automatically skipped.
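
The extension-based selection described above could be sketched as a simple lookup. The mapping and the `detect_language` helper are assumptions for illustration. The package's own dispatch may differ.

```python
from pathlib import PurePosixPath

# Extensions listed under "Supported Languages"; everything else is skipped.
LANGUAGE_BY_EXTENSION = {".java": "java", ".kt": "kotlin", ".py": "python"}


def detect_language(path):
    """Return the language for a diff path, or None if it is unsupported."""
    return LANGUAGE_BY_EXTENSION.get(PurePosixPath(path).suffix)


paths = ["src/Calculator.java", "app/Service.kt", "tool/cli.py", "README.md"]
supported = [p for p in paths if detect_language(p) is not None]
```

Git always writes diff paths with forward slashes, which is why `PurePosixPath` is used here regardless of the host platform.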

## Error Handling

The tool handles various error conditions gracefully:

- **Parse errors** - Files that fail to parse are skipped with a warning to stderr
- **Binary files** - Automatically detected and skipped
- **Unsupported file types** (anything other than `.java`, `.kt`, or `.py`) - Silently skipped
- **Empty diffs** - Produce empty output

Exit codes:
- `0` - Success
- `1` - Error (file not found, parse error, etc.)
- `130` - Interrupted (Ctrl+C)
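
A wrapper script might translate these exit codes and collect the warnings written to stderr. The following is a minimal sketch. `summarize_run` and the warning text in the usage line are hypothetical, and the exact wording of the tool's warnings is not specified here.

```python
# Exit codes documented above.
EXIT_MEANINGS = {0: "success", 1: "error", 130: "interrupted"}


def summarize_run(returncode, stderr_text):
    """Return (status, warnings) for a finished diff-code-change-range run."""
    status = EXIT_MEANINGS.get(returncode, f"unexpected exit code {returncode}")
    # Skipped files are reported on stderr, so keep every non-empty line.
    warnings = [line for line in stderr_text.splitlines() if line.strip()]
    return status, warnings


status, warnings = summarize_run(0, "warning: skipped Broken.java\n")
```

A run can therefore succeed with exit code `0` and still carry warnings about individual files that were skipped.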

## License

MIT License

## Contributing

Contributions are welcome! Please ensure tests pass before submitting pull requests.
