Metadata-Version: 2.1
Name: pelutils
Version: 0.5.2
Summary: Utility functions that are often useful
Home-page: https://github.com/peleiden/pelutils
Author: Søren Winkel Holm, Asger Laurits Schultz
Author-email: swholm@protonmail.com
License: BSD-3-Clause
Download-URL: https://pypi.org/project/pelutils/
Keywords: utility,logger,parser,profiling
Platform: UNKNOWN
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: gitpython
Requires-Dist: rich
Provides-Extra: ds
Requires-Dist: torch ; extra == 'ds'
Requires-Dist: matplotlib ; extra == 'ds'
Requires-Dist: scipy ; extra == 'ds'

# pelutils

Various utilities useful for Python projects. Features include

- Feature-rich logger using `Rich` for colourful printing
- Parsing for combining config files and command-line arguments - especially useful for parametric methods
- Time taking and profiling
- Easy-to-use data storage class for saving and loading data
- Table formatting
- Miscellaneous standalone utility functions - see `pelutils/__init__.py`
- Data-science submodule with extra utilities for statistics, plotting, and machine learning using `PyTorch`
- `unique` function similar to `np.unique` but in linear time (currently Linux x86_64 only)
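
The linear-time idea behind `unique` can be sketched in pure Python (this is an illustrative sketch, not pelutils' optimized implementation): a single pass with a hash set preserves first-seen order without sorting.

```python
def unique_linear(values):
    """Return unique elements in first-seen order in O(n) expected time."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:  # O(1) expected hash lookup, so no sort is needed
            seen.add(v)
            out.append(v)
    return out

print(unique_linear([3, 1, 3, 2, 1]))  # [3, 1, 2]
```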

## Parsing

Parses both command-line and config-file arguments and combines them into a powerful, easy-to-use workflow.
Useful for parametric methods such as machine learning.

A file `main.py` could contain:
```py
from pelutils import Parser

options = {
    "location": { "default": "local_train", "help": "Save location", "type": str },
    "learning-rate": { "default": 1.5e-3, "help": "Controls size of parameter update", "type": float },
    "gamma": { "default": 1, "help": "Use of generator network in updating", "type": float },
    "initialize-zeros": { "help": "Whether to initialize all parameters to 0", "action": "store_true" },
}
parser = Parser(options)
experiments = parser.parse()
```

This could then be run with
`python main.py data/my-big-experiment --learning-rate 1e-5`
or with
`python main.py data/my-big-experiment --config cfg.ini`
where `cfg.ini` could contain

```ini
[DEFAULT]
gamma = 0.95
[RUN1]
learning-rate = 1e-4
initialize-zeros
[RUN2]
learning-rate = 1e-5
gamma = 0.9
```
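
If pelutils' parser follows `configparser` conventions (as the `[DEFAULT]` section suggests), values under `[DEFAULT]` act as fallbacks for every run section unless overridden. A stdlib sketch of how such a file resolves; flag-style keys like `initialize-zeros` need `allow_no_value=True` in plain `configparser`:

```python
import configparser

cfg = configparser.ConfigParser(allow_no_value=True)
cfg.read_string("""
[DEFAULT]
gamma = 0.95
[RUN1]
learning-rate = 1e-4
initialize-zeros
[RUN2]
learning-rate = 1e-5
gamma = 0.9
""")

# RUN1 inherits gamma from DEFAULT; RUN2 overrides it
print(cfg["RUN1"]["gamma"])             # 0.95
print(cfg["RUN2"]["gamma"])             # 0.9
print(cfg["RUN1"]["initialize-zeros"])  # None (flag-style key)
```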

## Logging

Easy-to-use logger that covers common needs.

```py
import multiprocessing as mp

from pelutils import log, collect_logs

# Configure logger for the script
log.configure("path/to/save/log.log", "Title of log")

# Start logging
for i in range(70):  # Nice
    log("Execution %i" % i)

# Sections
log.section("New section in the logfile")

# Verbose logging for less important things
log.verbose("Will be logged")
with log.unverbose:
    log.verbose("Will not be logged")

# Error handling
# This explicitly logs a ValueError and then raises it
log.throw(ValueError("Your value is bad, and you should feel bad"))
# The zero-division error is logged
with log.log_errors:
    0 / 0

# User input
inp = log.input("WHAT... is your favourite colour? ")

# Log all logs from a function at the same time
# This is especially useful with multiple processes, so logging from different workers does not get mixed up
def fun(x):
    log("Hello there")
    log("General Kenobi!")
with mp.Pool() as p:
    p.map(collect_logs(fun), args)  # args: an iterable of arguments for fun
```
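
The idea behind `collect_logs` can be illustrated without pelutils: buffer every message produced during one call and emit the batch atomically, so output from parallel workers cannot interleave. A hedged stdlib sketch (the decorator name and shape are hypothetical):

```python
import functools

def collected(fn):
    """Run fn, buffering the messages it logs via the passed callable,
    then return them as one batch to be emitted atomically."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        buffer = []
        fn(buffer.append, *args, **kwargs)  # fn logs by calling its first argument
        return "\n".join(buffer)            # the whole batch, in order
    return wrapper

@collected
def fun(log):
    log("Hello there")
    log("General Kenobi!")

print(fun())  # both lines arrive together as one block
```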

## Time Taking and Profiling

Simple timer inspired by MATLAB's `tic`/`toc`, with profiling tooling included.

```py
from pelutils import TickTock

tt = TickTock()
tt.tick()
...  # some task
seconds_used = tt.tock()

for i in range(100):
    tt.profile("Repeated code")
    ...  # some task
    tt.profile("Subtask")
    ...  # some subtask
    tt.end_profile()
    tt.end_profile()
print(tt)  # Prints a table view of profiled code sections
```
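
The tick/tock pattern itself is just a thin wrapper around a monotonic clock. A minimal stand-alone sketch (not pelutils' `TickTock`, which additionally maintains the profile tree shown above):

```python
import time

class Timer:
    """Minimal tic/toc timer backed by a monotonic high-resolution clock."""
    def tick(self):
        self._start = time.perf_counter()
    def tock(self):
        return time.perf_counter() - self._start

t = Timer()
t.tick()
time.sleep(0.05)  # stand-in for some task
print(f"{t.tock():.3f} s elapsed")
```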

## Data Storage

A data class that saves/loads its fields from disk.
Anything that can be saved to a `json` file will be.
Other data types will be saved to relevant file formats.
Currently, `numpy` arrays are the only supported data type that is not saved to the `json` file.

```py
from dataclasses import dataclass
import numpy as np
from pelutils import DataStorage

@dataclass
class Person(DataStorage):
    name: str
    age: int
    numbers: np.ndarray
    subfolder = "older"
    json_name = "yoda.json"

yoda = Person(name="Yoda", age=900, numbers=np.array([69, 420]))
yoda.save("old")
# Saved data at old/older/yoda.json
# {
#     "name": "Yoda",
#     "age": 900
# }
# There will also be a file named numbers.npy
yoda = Person.load("old")
```
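
The split between JSON-serializable fields and binary ones can be sketched with the stdlib alone (a hypothetical helper, not pelutils' implementation; `pickle` stands in here for the format-specific saving such as `.npy`): try `json` first, and fall back to a sidecar file per field that fails.

```python
import json, os, pickle, tempfile
from dataclasses import dataclass, fields

def save_split(obj, folder, json_name="data.json"):
    """Save JSON-able fields to one json file; pickle the rest as sidecar files."""
    os.makedirs(folder, exist_ok=True)
    jsonable, sidecars = {}, {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        try:
            json.dumps(value)          # probe for JSON serializability
            jsonable[f.name] = value
        except TypeError:
            sidecars[f.name] = value   # needs its own file
    with open(os.path.join(folder, json_name), "w") as fh:
        json.dump(jsonable, fh, indent=4)
    for name, value in sidecars.items():
        with open(os.path.join(folder, name + ".pkl"), "wb") as fh:
            pickle.dump(value, fh)

@dataclass
class Person:
    name: str
    age: int
    numbers: set  # not JSON-serializable, so it gets a sidecar file

with tempfile.TemporaryDirectory() as d:
    save_split(Person("Yoda", 900, {69, 420}), d)
    print(sorted(os.listdir(d)))  # ['data.json', 'numbers.pkl']
```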

# pelutils.ds

This submodule contains various utility functions for data science and machine learning. To ensure the extra requirements are installed, install with
```
pip install pelutils[ds]
```
Note that in some terminals, you will instead have to write
```
pip install pelutils\[ds\]
```

## PyTorch

All PyTorch functions work whether or not CUDA is available.

```py
from pelutils.ds import reset_cuda, no_grad, BatchFeedForward

# Clear CUDA cache and synchronize
reset_cuda()

# Inference only: No gradients should be tracked in the following function
# Same as putting entire function body inside with torch.no_grad()
@no_grad
def infer():
    ...  # code that includes a forward pass

# Feed forward in batches to prevent using too much memory
# Every time a memory allocation error is encountered, the number of batches is doubled
# Same as using y = net(x), but without risk of running out of memory
bff = BatchFeedForward(net, len(x))
y = bff(x)
# Change to another network
bff.update_net(net2)
```
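
The batching strategy behind `BatchFeedForward` can be illustrated generically: try to process the input in `n` batches and double `n` whenever a memory error occurs. A simplified sketch with a plain Python function standing in for the network (hypothetical and torch-free, only showing the retry logic):

```python
def batched_apply(fn, x):
    """Apply fn to x in batches, doubling the batch count on MemoryError."""
    n_batches = 1
    while True:
        try:
            out = []
            size = -(-len(x) // n_batches)  # ceil division
            for i in range(0, len(x), size):
                out.extend(fn(x[i:i + size]))
            return out
        except MemoryError:
            n_batches *= 2  # retry with smaller batches
            if n_batches > len(x):
                raise  # even single-element batches fail

# A stand-in "network" that cannot handle more than 3 elements at once
def tiny_net(batch):
    if len(batch) > 3:
        raise MemoryError
    return [v * 2 for v in batch]

print(batched_apply(tiny_net, list(range(10))))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```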

## Statistics

Includes various commonly used statistical functions.

```py
import numpy as np
import scipy.stats

# Get one-sided z value for an exponential(lambda=2) distribution at a 1 % significance level
zval = z(alpha=0.01, two_sided=False, distribution=scipy.stats.expon(scale=1/2))

# Get correlation, confidence interval, and p value for two vectors
a, b = np.random.randn(100), np.random.randn(100)
r, lower_r, upper_r, p = corr_ci(a, b, alpha=0.01)
```

## Matplotlib

Contains predefined rc params, colours, and figure sizes.

```py
import matplotlib.pyplot as plt

# Set wide figure size
plt.figure(figsize=figsize_wide)

# Use larger font for larger figures - works well with predefined figure sizes
update_rc_params(rc_params)

# 15 different, unique colours
c = iter(colours)
for i in range(15):
    plt.plot(x[i], y[i], color=next(c))
```



# History

## 0.5.2

- Allowed disabling printing by default in logger

## 0.5.1

- Fixed accidental rich formatting in logger
- Fixed logger crashing when not configured

## 0.5.0 - Breaking changes

- Added np.unique-style unique function to `ds` that runs in linear time but does not sort
- Replaced verbose/non-verbose logging with logging levels similar to built-in `logging` module
- Added `with_print` option to `log.__call__`
- Undid change from 0.3.4 such that `None` is now logged again
- Added `format` module. Currently supports tables
- Updated stringification of profiles to include percentage of parent profile
- Added `throws` function that checks whether a function throws an exception of a specific type
- Use `Rich` for printing to console when logging

## 0.4.1

- Added append mode to logger to append to old log files instead of overwriting

## 0.4.0

- Added `ds` submodule for data science and machine learning utilities

  This includes `PyTorch` utility functions, statistics, and `matplotlib` default values

## 0.3.4

- Logger now raises errors normally instead of using `throw` method

## 0.3.3

- `get_repo` now accepts a custom path search for repo as opposed to always using working dir

## 0.3.2

- `log.input` now also accepts iterables as input

  For such inputs, it will return a generator of user inputs

## 0.3.1 - Breaking changes

- Added functionality to logger for logging repository commit
- Removed function `get_commit`
- Added function `get_repo` which returns repository path and commit

  It attempts to find a repository by searching from working directory and upwards
- Updates to examples in `README` and other minor documentation changes
- `set_seeds` no longer returns seed, as this is already given as input to the function

## 0.3.0 - Breaking changes

- Only works for Python 3.7+

- If logger has not been configured, it now does no logging instead of crashing

  This prevents dependencies that use the logger from crashing the program if it is not configured
- `log.throw` now also logs the actual error rather than just the stack trace
- `log` now has public property `is_verbose`
- Fixed `with log.log_errors` always throwing errors
- Added code samples to `README`
- `Parser` no longer automatically determines if experiments should be placed in subfolders

  Instead, this is given explicitly as an argument to `__init__`

  It also supports boolean flags in the config file

## 0.2.13

- Re-add `clean` method to logger

## 0.2.12 - Breaking changes

- The logger is now solely a global variable

  Different loggers are handled internally in the global `_Logger` instance

## 0.2.11

- Add `catch` property to logger to allow automatically logging errors inside `with` blocks
- All code is now indented using spaces

## 0.2.10

- Allow finer verbosity control in logger
- Allow multiple log commands to be collected and logged at the same time
- Add decorator for aforementioned feature
- Change thousand_seps from TickTock method to stand-alone function in `__init__`
- Verbose logging now has same signature as normal logging

## 0.2.8

- Add code to execute code with specific environment variables

## 0.2.7

- Fix error where the full stack trace was not printed by `log.throw`
- set_seeds now checks if torch is available

  This means torch seeds are still set without needing it as a dependency

## 0.2.6 - Breaking changes

- Make Unverbose class private and update documentation
- Update formatting when using .input

## 0.2.5

- Add input method to logger

## 0.2.4

- Better logging of errors

## 0.2.1 - Breaking changes

- Removed torch as dependency

## 0.2.0 - Breaking changes

- Logger is now a global variable

  Logging should happen by importing the log variable and calling .configure to set it up

  To reset the logger, `.clean` can be called
- It is still possible to just import `Logger` and use it in the traditional way, though `.configure` should be called first
- Changed timestamp function to give a cleaner output
- `get_commit` now returns `None` if `gitpython` is not installed

## 0.1.2

- Update documentation for logger and ticktock
- Fix bug where separator was not an argument to `Logger.__call__`

## 0.1.0

- Include DataStorage
- Logger can throw errors and handle separators
- TickTock includes time handling and units
- Minor parser path changes

## 0.0.1

- Logger, Parser, TickTock added from previous projects


