Metadata-Version: 2.4
Name: pytecode
Version: 0.0.1
Summary: Python library for parsing, manipulating, and emitting JVM class files
Author: Trenton Smith
License-Expression: MIT
Project-URL: Homepage, https://github.com/smithtrenton/pytecode
Project-URL: Documentation, https://smithtrenton.github.io/pytecode/
Project-URL: Repository, https://github.com/smithtrenton/pytecode
Project-URL: Issues, https://github.com/smithtrenton/pytecode/issues
Project-URL: Changelog, https://github.com/smithtrenton/pytecode/releases
Keywords: bytecode,classfile,java,jvm
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.14
Classifier: Operating System :: OS Independent
Requires-Python: >=3.14
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: basedpyright<2.0.0,>=1.38.3; extra == "dev"
Requires-Dist: pdoc<17.0.0,>=16.0.0; extra == "dev"
Requires-Dist: pytest<10.0.0,>=9.0.2; extra == "dev"
Requires-Dist: ruff<0.16.0,>=0.15.7; extra == "dev"
Dynamic: license-file

# pytecode

`pytecode` is a Python 3.14+ library for parsing, inspecting, editing, validating, and emitting JVM class files and JAR archives.

It is built for Python tooling that needs direct access to Java bytecode: classfile readers and writers, archive rewriters, transformation pipelines, control-flow analysis, descriptor utilities, hierarchy-aware frame computation, and verification-oriented workflows.

## Why pytecode?

- Parse `.class` files into typed Python dataclasses.
- Edit classes, fields, methods, and bytecode through a mutable symbolic model.
- Rewrite JAR files while preserving non-class resources and ZIP metadata.
- Recompute `max_stack`, `max_locals`, and `StackMapTable` when requested.
- Validate parsed classfiles and edited models before emission.
- Work with descriptors, signatures, labels, symbolic operands, constant pools, and debug-info policies.

## Installation

Install from PyPI:

```bash
pip install pytecode
```

Or with `uv`:

```bash
uv add pytecode
```

`pytecode` requires Python `3.14+`.

## Quick start

### Parse and roundtrip a class file

```python
from pathlib import Path

from pytecode import ClassReader, ClassWriter

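# Parse the raw classfile structure from disk.
reader = ClassReader.from_file("HelloWorld.class")
classfile = reader.class_info

# Inspect a couple of parsed fields.
print(classfile.major_version)
print(classfile.methods_count)

# Emit the parsed structure back to bytes and write a copy.
Path("HelloWorld-copy.class").write_bytes(ClassWriter.write(classfile))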
```
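
For orientation, `major_version` comes from the fixed eight-byte header that starts every class file: a four-byte magic number `0xCAFEBABE`, then two-byte minor and major versions, all big-endian. The sketch below reads that header with the standard library only. It does not use pytecode, and the helper name `read_class_version` is hypothetical.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE


def read_class_version(data: bytes) -> tuple[int, int]:
    """Return (major, minor) from the first 8 bytes of a class file."""
    if len(data) < 8:
        raise ValueError("class file header is truncated")
    # u4 magic, u2 minor_version, u2 major_version, all big-endian.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError(f"bad magic number: {magic:#010x}")
    return major, minor


# Java 8 class files use major version 52 (0x34).
header = bytes.fromhex("cafebabe00000034")
print(read_class_version(header))  # prints (52, 0)
```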

### Lift to the editable model

```python
from pathlib import Path

from pytecode import ClassModel

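# Lift the raw bytes into the mutable, symbolic model.
model = ClassModel.from_bytes(Path("HelloWorld.class").read_bytes())
print(model.name)

# Lower the model back to classfile bytes and write them out.
updated_bytes = model.to_bytes()
Path("HelloWorld-updated.class").write_bytes(updated_bytes)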
```

Use `recompute_frames=True` when an edit changes control flow or stack/local layout.

## JAR rewriting example

`JarFile.rewrite()` can apply in-place transforms to matching classes and methods:

```python
from pytecode import JarFile
from pytecode.constants import MethodAccessFlag
from pytecode.model import ClassModel, MethodModel
from pytecode.transforms import (
    class_named,
    method_is_public,
    method_is_static,
    method_name_matches,
    on_methods,
    pipeline,
)


def make_final(method: MethodModel, _owner: ClassModel) -> None:
    method.access_flags |= MethodAccessFlag.FINAL


JarFile("input.jar").rewrite(
    "output.jar",
    transform=pipeline(
        on_methods(
            make_final,
            where=method_name_matches(r"main") & method_is_public() & method_is_static(),
            owner=class_named("HelloWorld"),
        )
    ),
)
```

Transforms must mutate models in place and return `None`. For transforms that change code shape (control flow or stack/local layout), pass `recompute_frames=True`. For an ASM-like lift path that omits debug metadata, pass `skip_debug=True`.
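
To illustrate what rewriting a JAR while keeping non-class resources involves, here is a minimal standard-library sketch. It does not use pytecode and is not its implementation. `rewrite_archive` and its `transform` callback are hypothetical names, and reusing each `ZipInfo` only carries over per-entry metadata such as timestamps and compression method.

```python
import io
import zipfile
from collections.abc import Callable


def rewrite_archive(src: bytes, transform: Callable[[bytes], bytes]) -> bytes:
    """Copy a ZIP/JAR, passing only `.class` entries through `transform`."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src)) as zin, zipfile.ZipFile(out, "w") as zout:
        for info in zin.infolist():
            data = zin.read(info)
            if info.filename.endswith(".class"):
                data = transform(data)
            # Reusing the ZipInfo keeps the entry name, timestamp, and
            # compression method; sizes and CRC are recomputed on write.
            zout.writestr(info, data)
    return out.getvalue()
```

Resources such as `META-INF/MANIFEST.MF` pass through byte-for-byte because only names ending in `.class` reach the callback.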

## Public surface

Top-level exports:

- `pytecode.ClassReader` and `pytecode.ClassWriter` for raw classfile parsing and emission.
- `pytecode.JarFile` for archive reads, mutation, and safe rewrite-to-disk.
- `pytecode.ClassModel` for mutable editing with symbolic references.

Supported submodules:

- `pytecode.transforms` for composable class, field, method, and code transforms.
- `pytecode.labels` for label-aware bytecode editing helpers.
- `pytecode.operands` for symbolic operand wrappers.
- `pytecode.analysis` for CFG construction, frame simulation, and recomputation helpers.
- `pytecode.verify` for structural validation and diagnostics.
- `pytecode.hierarchy` for type and override resolution helpers.
- `pytecode.descriptors` for JVM descriptors and generic signatures.
- `pytecode.constant_pool_builder` for deterministic constant-pool construction.
- `pytecode.modified_utf8` for JVM Modified UTF-8 encoding and decoding.
- `pytecode.debug_info` for explicit debug-info preservation and stripping policies.

## Documentation

- Development docs overview: [docs/OVERVIEW.md](https://github.com/smithtrenton/pytecode/blob/master/docs/OVERVIEW.md)
- Hosted API reference: <https://smithtrenton.github.io/pytecode/>

## Development

Create a local environment with development tools:

```powershell
uv sync --extra dev
```

Common checks:

```powershell
uv run ruff check .
uv run ruff format --check .
uv run basedpyright
uv run pytest -q
uv run python tools\generate_api_docs.py --check
```

Generate local API reference HTML with:

```powershell
uv run python tools\generate_api_docs.py
```

Build source and wheel distributions locally:

```powershell
uv build
```

The `oracle`-marked CFG tests lazily cache ASM 9.7.1 test jars under `.pytest_cache\pytecode-oracle` and also honor manually seeded jars in `tests\resources\oracle\lib`. If `java`, `javac`, or the ASM jars are unavailable, that suite skips without failing the rest of the test run.

## Release automation

PyPI releases are published from GitHub Actions by pushing an immutable `v<version>` tag that matches `project.version` in `pyproject.toml`. The release workflow reruns validation on the tagged commit, builds both `sdist` and `wheel` with `uv build`, and publishes from the protected `pypi` environment via PyPI Trusted Publishing.

Release procedure:

```powershell
# 1) bump project.version in pyproject.toml
uv run ruff check .
uv run ruff format --check .
uv run basedpyright
uv run pytest -q
uv run python tools\generate_api_docs.py --check

git commit -am "Bump version to X.Y.Z"
git push origin master
git tag vX.Y.Z
git push origin vX.Y.Z
```

The release workflow rejects tags that do not match `project.version`. Treat release tags as immutable: if a tag or published artifact is wrong, bump to a new version and publish a new tag instead of force-pushing the old one. If the workflow fails before the publish step because of environment approval or a transient PyPI issue, rerun the workflow for the same tag instead of moving the tag.

## Repository utilities

`run.py` is a manual smoke-test helper. It parses a JAR file and writes two output trees next to it:

- `<jar parent>\output\<jar stem>\parsed\` holds pretty-printed parsed class structures.
- `<jar parent>\output\<jar stem>\rewritten\` holds class-model-derived rewritten `.class` files plus copied resources.

Example:

```powershell
uv run python .\run.py .\path\to\input.jar
```

The script prints read, parse, lift, write, and rewrite timings plus class and resource counts to stdout.
