Metadata-Version: 2.1
Name: torchac
Version: 0.8.13
Summary: Fast Arithmetic Coding for PyTorch
Home-page: https://github.com/fab-jul/torchac
Author: fab-jul
Author-email: fabianjul@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# torchac: Fast Arithmetic Coding for PyTorch

## About

This is a simplified version of the arithmetic coder we used in the 
neural compression paper "Practical Full Resolution Learned Lossless Image 
Compression", which
lives in the [L3C-Pytorch repo](https://github.com/fab-jul/L3C-PyTorch).
In particular, we removed the L3C-specific parts, which relied on CUDA
compilations and were tricky to get going.

The implementation is based on [this blog post](https://marknelson.us/posts/2014/10/19/data-compression-with-arithmetic-coding.html),
meaning that we implement _arithmetic coding_.
While it could be further optimized, it is already much faster than an equivalent pure-Python implementation (because of all the
 bit-shifts etc.). In L3C, encoding an entire `512 x 512` image takes 0.202s (see Appendix A in the paper).

### What torchac is

- A simple library to encode a stream of symbols into a bitstream given
  the cumulative distribution of the symbols.
- The number of possible symbols must be finite.
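
To make the idea concrete, here is a toy sketch of arithmetic coding with exact fractions. It is only an illustration: the helper names are made up for this example, and it narrows an interval instead of emitting bits, whereas torchac works with 16 bit integer arithmetic in C++.

```python
from fractions import Fraction

def encode_interval(symbols, cdf):
    """Narrow [low, high) once per symbol; any number inside the final
    interval identifies the whole symbol stream."""
    low, high = Fraction(0), Fraction(1)
    for s in symbols:
        width = high - low
        low, high = low + width * cdf[s], low + width * cdf[s + 1]
    return low, high

def decode_interval(value, cdf, num_symbols):
    """Replay the narrowing and read off which sub-interval contains value."""
    symbols = []
    low, high = Fraction(0), Fraction(1)
    for _ in range(num_symbols):
        width = high - low
        target = (value - low) / width
        # Largest s with cdf[s] <= target, i.e. cdf[s] <= target < cdf[s + 1].
        s = max(i for i in range(len(cdf) - 1) if cdf[i] <= target)
        symbols.append(s)
        low, high = low + width * cdf[s], low + width * cdf[s + 1]
    return symbols

# L = 3 symbols with P(0) = 1/2, P(1) = 1/4, P(2) = 1/4.
cdf = [Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
message = [0, 2, 1, 0]
low, high = encode_interval(message, cdf)
decoded = decode_interval(low, cdf, len(message))
```

The width of the final interval equals the product of the symbol probabilities, which is why likely symbols cost fewer bits.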

### What torchac is not

- We do not provide classes to learn or represent probability/cumulative
  distributions. These have to be provided by you.


## HowTo

### Set up conda environment

This library has been tested with
- PyTorch 1.7
- Python 3.8

And that's all you need. Other versions may also work.
If you don't have an environment set up, you can create one with `conda`:

```bash
# We use Python 3.8, other versions may also work.
conda create --name <YOUR_ENV_NAME> python==3.8

conda activate <YOUR_ENV_NAME>

# Installing pytorch
# Find the conda command for your system at https://pytorch.org
```

#### Test installation

To (optionally) test your installation, you need `pytest`:

```bash
# If you don't have pytest
pip install pytest

# Run tests
python -m pytest test.py -s
```

Output should end in something like:
```bash
===== 5 passed, 2 warnings in 0.95s =========
```

### Example

The `examples/` folder contains [an example for training an auto-encoder on MNIST](https://github.com/fab-jul/torchac/tree/master/examples/mnist_autoencoder).

## FAQ

#### 1. Output is not equal to the input

Either the normalization went wrong, or you encoded a symbol that is larger than `Lp - 2` (see "How we represent probability distributions" below).
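
A quick sanity check before encoding can catch both causes. The sketch below is a hypothetical helper (not part of torchac) and assumes a single 1-D CDF with `Lp` values that is shared by all symbols:

```python
def find_input_problems(cdf, symbols):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    Lp = len(cdf)
    if abs(cdf[-1] - 1.0) > 1e-6:
        problems.append("last CDF value is not 1")
    if any(b < a for a, b in zip(cdf, cdf[1:])):
        problems.append("CDF is decreasing somewhere")
    # Valid symbols are 0, ..., Lp - 2.
    bad = [s for s in symbols if not 0 <= s <= Lp - 2]
    if bad:
        problems.append(f"symbols outside 0..{Lp - 2}: {bad}")
    return problems

cdf = [0.0, 0.25, 0.75, 1.0]  # Lp = 4, so valid symbols are 0, 1, 2
ok = find_input_problems(cdf, [0, 2, 1])
broken = find_input_problems(cdf, [0, 3])
```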

## Important Implementation Details

### How we represent probability distributions

The probabilities are specified as [CDFs](https://en.wikipedia.org/wiki/Cumulative_distribution_function).
For each possible symbol,
we need 2 CDF values. This means that if there are `L` possible symbols
`{0, ..., L-1}`, the CDF must specify `L+1` values.

**Example**:
```
Let's say we have L = 3 possible symbols. We need a CDF with 4 values
to specify the symbol distribution:

symbol:        0     1     2
cdf:       C_0   C_1   C_2   C_3

This corresponds to the 3 probabilities

P(0) = C_1 - C_0
P(1) = C_2 - C_1
P(2) = C_3 - C_2

NOTE: The arithmetic coder assumes that C_3 == 1. 
```
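
If you start from probabilities, one way to get such a CDF is a cumulative sum. This is a minimal sketch in plain Python; `pmf_to_cdf` is a name made up for this example, and it assumes `C_0 == 0`:

```python
from itertools import accumulate

def pmf_to_cdf(pmf):
    """Turn L probabilities into the L + 1 CDF values [C_0, ..., C_L]."""
    cdf = [0.0] + list(accumulate(pmf))
    cdf[-1] = 1.0  # Pin the last value, since the coder assumes it is 1.
    return cdf

pmf = [0.5, 0.25, 0.25]  # P(0), P(1), P(2)
cdf = pmf_to_cdf(pmf)    # C_0, C_1, C_2, C_3
# Differences of neighbouring CDF values give back the probabilities.
recovered = [cdf[i + 1] - cdf[i] for i in range(len(pmf))]
```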

Important:

- If you have `L` possible symbols, you need to pass a CDF that
  specifies `L + 1` values. Since this is a common number, we call it 
  `Lp = L + 1` throughout the code (the "p" stands for prime, i.e., `L'`).
- The last value of the CDF should be `1`. Note that the arithmetic coder
  in `torchac.cpp` will just assume it's `1` regardless of what is passed, so a CDF
  that does not end in `1` means you will estimate bitrates wrongly. More details below.
- Note that even though the CDF specifies `Lp` values, symbols are only allowed
  to be in `{0, ..., Lp-2}`. In the above example, `Lp == 4`, but the 
  max symbol is `Lp-2 == 2`. Bigger values will yield **wrong outputs**.
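
To see why the last CDF value matters for bitrate estimation, here is a small sketch that estimates the ideal code length from a CDF. `estimated_bits` is a hypothetical helper and assumes one 1-D CDF shared by all symbols:

```python
import math

def estimated_bits(cdf, symbols):
    """Ideal code length in bits: sum of -log2 P(s), with P(s) = C_{s+1} - C_s."""
    return sum(-math.log2(cdf[s + 1] - cdf[s]) for s in symbols)

good_cdf = [0.0, 0.5, 0.75, 1.0]
bad_cdf = [0.0, 0.5, 0.75, 0.875]  # Last value is not 1.
symbols = [2, 2, 0]

bits_good = estimated_bits(good_cdf, symbols)
bits_bad = estimated_bits(bad_cdf, symbols)
```

With `bad_cdf`, the estimate uses `P(2) = 0.125`, while a coder that assumes the last value is `1` effectively works with `P(2) = 0.25`, so the two no longer agree.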

### Expected input shapes

We allow any shapes for the inputs, but the spatial dimensions of the
input CDF and the input symbols must match. In particular, we expect:

- CDF must have shape `(N1, ..., Nm, Lp)`, where `N1, ..., Nm` are the
`m` spatial dimensions, and `Lp` is as described above.
- Symbols must have shape `(N1, ..., Nm)`, i.e., same spatial dimensions
as the CDF.

For example, in a typical CNN, you might have a CDF of shape 
`(batch, channels, height, width, Lp)`.
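
The shape rule can be written down as a small check. This sketch works on plain shape tuples (e.g. what `tensor.shape` gives you), and `check_shapes` is a made-up name for illustration:

```python
def check_shapes(cdf_shape, sym_shape):
    """Return Lp if the shapes are compatible, else raise ValueError."""
    *spatial, Lp = cdf_shape
    if tuple(spatial) != tuple(sym_shape):
        raise ValueError(
            f"spatial dims differ: CDF {tuple(spatial)} vs symbols {tuple(sym_shape)}")
    if Lp < 2:
        raise ValueError("CDF needs at least 2 values (L >= 1 symbols)")
    return Lp

# CDF of shape (batch, channels, height, width, Lp) with matching symbols.
Lp = check_shapes((8, 3, 32, 32, 257), (8, 3, 32, 32))

# A width mismatch between the CDF and the symbols is rejected.
try:
    check_shapes((8, 3, 32, 32, 257), (8, 3, 32, 16))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```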


### Normalized vs. Unnormalized / Floating Point vs. Integer CDFs

The library differentiates between "normalized" and "unnormalized" CDFs,
and between "floating point" and "integer" CDFs. What do these mean?

- A proper CDF is strictly monotonically increasing, and we call this a
"normalized" CDF. 
- However, since we work with finite precision (16 bits to
be precise in this implementation), it may be that you have a CDF that
is strictly monotonically increasing in `float32` space, but not when
it is converted to 16 bit precision. An "unnormalized" CDF is what we call
a CDF that has the same value for at least two subsequent elements.
- "floating point" CDFs are CDFs that are specified as `float32` and need
to be converted to 16 bit precision.
- "integer" CDFs are CDFs specified as `int16`, BUT they are then interpreted
as `uint16` on the C++ side. See "int16 vs uint16" below.

Examples:

```python
float_unnormalized_cdf = [0.1, 0.2, 0.2, 0.3, ..., 1.]
float_normalized_cdf = [0.1, 0.2, 0.20001, 0.3, ..., 1.]
integer_unnormalized_cdf = [10, 20, 20, 30, ..., 0]  # See below for why last is 0.
integer_normalized_cdf = [10, 20, 21, 30, ..., 0]    # See below for why last is 0.
```
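
The following sketch shows how a CDF that is strictly increasing as floats can stop being so at 16 bit precision. The plain scale-and-round quantization here is an assumption for illustration, not necessarily the exact conversion used by the library:

```python
PRECISION = 16  # bits

def quantize(cdf_float):
    """Scale a float CDF to integers in [0, 2**16]."""
    return [round(c * 2 ** PRECISION) for c in cdf_float]

def is_strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))

cdf_float = [0.0, 0.2, 0.200001, 0.3, 1.0]
cdf_q = quantize(cdf_float)

float_ok = is_strictly_increasing(cdf_float)  # strictly increasing as floats
quantized_ok = is_strictly_increasing(cdf_q)  # two entries collapse to 13107
```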

There are two APIs:

- `encode_float_cdf` and `decode_float_cdf` are to be used for floating point 
CDFs. These functions have a flag `needs_normalization` that specifies
whether the input still has to be normalized. You can set
`needs_normalization=False` if you have CDFs that you know are normalized, e.g., 
Gaussian distributions with a large enough sigma. This then speeds up
encoding and decoding of large tensors somewhat, and makes bitrate 
estimation from the CDF more precise.
- `encode_int16_normalized_cdf` and `decode_int16_normalized_cdf` are to be 
used for integer CDFs **that are already normalized**.
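
To give an idea of what normalization can look like, here is a sketch of one possible scheme: scale to slightly less than `2**16` and add the index, so every symbol keeps at least one integer step. This is an assumption for illustration, and `normalize_to_int` is not a torchac function:

```python
PRECISION = 16  # bits

def normalize_to_int(cdf_float):
    """Quantize a non-decreasing float CDF into a strictly increasing integer CDF."""
    Lp = len(cdf_float)
    scale = 2 ** PRECISION - (Lp - 1)  # Reserve one integer step per symbol.
    return [round(c * scale) + i for i, c in enumerate(cdf_float)]

cdf_float = [0.0, 0.2, 0.2, 0.3, 1.0]  # "unnormalized": a repeated value
cdf_int = normalize_to_int(cdf_float)
strictly_increasing = all(a < b for a, b in zip(cdf_int, cdf_int[1:]))
```

Shifting the values like this slightly changes the probabilities, which is one way to see why skipping normalization makes bitrate estimates from the CDF more precise.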

### int16 vs uint16 - it gets confusing!

One big source of confusion can be that PyTorch does not support `uint16`.
Yet, that's exactly what we need. So what we do is we just represent
integer CDFs with `int16` on the Python side, and interpret/cast them to `uint16`
on the C++ side. This means that if you were to look at the int16 CDFs
you would see confusing things:

```python 
# Python
cdf_float = [0., 1/3, 2/3, 1.]  # A uniform distribution for L=3 symbols.
cdf_int = [0, 21845, -21845, 0]

# C++ (the same 16 bits, read as uint16)
# uint16_t cdf_int[] = {0, 21845, 43691, 0};
```

Note:
1. In the Python `cdf_int`, values of `2**15` and above show up as negative numbers.
2. The final value is actually 0. This is then handled in `torchac.cpp`, which
just assumes `cdf[..., -1] == 2**16`, which cannot be represented as a `uint16`.

Fun stuff!
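
If you want to inspect an `int16` CDF by hand, you can reinterpret the bits yourself. This sketch uses only the standard library `struct` module; the helper names are made up, and the plain scale-and-round quantization is an assumption for illustration:

```python
import struct

def uint16_to_int16(value):
    """Reinterpret the 16 bits of an unsigned value as a signed one."""
    return struct.unpack("<h", struct.pack("<H", value))[0]

def int16_to_uint16(value):
    """Reinterpret the 16 bits of a signed value as an unsigned one."""
    return struct.unpack("<H", struct.pack("<h", value))[0]

cdf_float = [0.0, 1 / 3, 2 / 3, 1.0]
# Quantize to 16 bits; 2**16 itself does not fit and wraps around to 0.
cdf_uint16 = [round(c * 2 ** 16) % 2 ** 16 for c in cdf_float]
cdf_int16 = [uint16_to_int16(v) for v in cdf_uint16]    # What Python shows.
round_trip = [int16_to_uint16(v) for v in cdf_int16]    # What C++ sees.
```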


