5. Transformations

Data Transformation using `transforms`

As noted in section 2. Network Architectures and Training, BaseNetwork objects expect the data returned by the dataset's __getitem__ to already be normalised, if normalisation is required.
To normalise the data, netloader.transforms provides several child classes of BaseTransform that apply a transformation, invert it, and propagate uncertainties through both the forward and inverse passes.

Currently, the supported transformations are:

Index: Slices the input along a given dimension assuming the input meets the required shape
Log: Logarithmic transform
MinClamp: Clamps the minimum value to be the smallest positive value
MultiTransform: Applies multiple transformations
Normalise: Normalises the data to zero mean and unit variance, or between 0 and 1
NumpyTensor: Converts Numpy arrays to PyTorch tensors
Reshape: Reshapes the data

Example code:

import numpy as np
from netloader import transforms

from src.data import CustomDataset

# Create dataset
dataset = CustomDataset()

# Convert NumPy arrays to tensors, then apply a base 10 logarithm
transform = transforms.MultiTransform(transforms.NumpyTensor(), transforms.Log())

# Transform dataset
transformed_data, transformed_uncertainties = transform(
    dataset.data,
    uncertainty=dataset.uncertainties,
)

# Untransform data
untransformed_data, untransformed_uncertainties = transform(
    transformed_data,
    back=True,
    uncertainty=transformed_uncertainties,
)

# The round trip is only exact up to floating-point error, so compare with a tolerance
assert np.allclose(untransformed_data, dataset.data)
assert np.allclose(untransformed_uncertainties, dataset.uncertainties)

`BaseTransform`

BaseTransform is the parent class of all transforms, so every transform shares the methods below.

Methods:

forward: Forward pass of the transformation
- x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
- return: ArrayLike, transformed array or tensor of shape (N,...)
backward: Backwards pass to invert the transformation
- x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
- return: ArrayLike, untransformed array or tensor of shape (N,...)
forward_grad: Forward pass of the transformation and uncertainty propagation
- x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
- uncertainty: ArrayLike, uncertainty of the input array or tensor of shape (N,...)
- return: tuple[ArrayLike, ArrayLike], transformed array or tensor of shape (N,...) and transformed uncertainty of shape (N,...)
backward_grad: Backwards pass to invert the transformation and uncertainty propagation
- x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
- uncertainty: ArrayLike, uncertainty of the input array or tensor of shape (N,...)
- return: tuple[ArrayLike, ArrayLike], untransformed array or tensor of shape (N,...) and untransformed uncertainty of shape (N,...)

These methods do not need to be called explicitly: the magic method __call__ selects the appropriate one depending on whether back is True and whether uncertainty is not None.

Magic Methods:

__call__: Calling the transform applies the forward or backward transformation, with uncertainty propagation if uncertainties are provided
- x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
- back: bool = False, if the inverse transformation should be applied
- uncertainty: ArrayLike | None = None, corresponding uncertainties for the input data for uncertainty propagation of shape (N,...)
- return: ArrayLike | tuple[ArrayLike, ArrayLike], transformed array or tensor of shape (N,...) and propagated uncertainties of shape (N,...) if provided
__repr__: Representation of the transformation
- return: str, representation string
__getstate__: Returns a dictionary containing the state of the transformation for pickling
- return: dict[str, Any], dictionary containing the state of the transformation
__setstate__: Sets the state of the transformation for pickling
- state: dict[str, Any], dictionary containing the state of the transformation
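
The dispatch performed by __call__ can be illustrated with a stand-in class. This is a minimal sketch and not netloader's implementation: SketchTransform and Double are hypothetical names, and only the rule for choosing between the four methods is taken from the description above.

```python
class SketchTransform:
    """Hypothetical stand-in for BaseTransform showing only the __call__ dispatch."""

    def __call__(self, x, back=False, uncertainty=None):
        # Without uncertainties, only the data is transformed
        if uncertainty is None:
            return self.backward(x) if back else self.forward(x)

        # With uncertainties, the data and uncertainties are transformed together
        if back:
            return self.backward_grad(x, uncertainty)
        return self.forward_grad(x, uncertainty)


class Double(SketchTransform):
    """Toy transform f(x) = 2x, so uncertainties also scale by 2."""

    def forward(self, x):
        return 2 * x

    def backward(self, x):
        return x / 2

    def forward_grad(self, x, uncertainty):
        return self.forward(x), 2 * uncertainty

    def backward_grad(self, x, uncertainty):
        return self.backward(x), uncertainty / 2


double = Double()
print(double(3.0))                              # calls forward
print(double(6.0, back=True))                   # calls backward
print(double(3.0, uncertainty=0.5))             # calls forward_grad
print(double(6.0, back=True, uncertainty=1.0))  # calls backward_grad
```

A single value is returned when uncertainty is None, and a (data, uncertainty) tuple otherwise, matching the return type listed for __call__.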

`Index`

Slices the input along a given dimension.

Initialisation Arguments:

dim: int = -1, dimension to slice over
in_shape: tuple[int, ...] | None = None, expected shape of the input ignoring the batch size; the slice is only applied if the input has this shape, which prevents repeated slicing; any dimension given as -1 is ignored when comparing shapes
slice_: slice = slice(None), slicing object
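
The role of in_shape may be easier to see in code. The following NumPy sketch is an assumption about the behaviour described above rather than the library's code: index_sketch is a hypothetical helper, and only the forward slicing is shown.

```python
import numpy as np


def index_sketch(x, dim=-1, in_shape=None, slice_=slice(None)):
    """Hypothetical helper: slice x along dim only if x matches in_shape."""
    if in_shape is not None:
        shape = x.shape[1:]  # ignore the batch dimension
        matches = len(shape) == len(in_shape) and all(
            target in (-1, size) for target, size in zip(in_shape, shape)
        )
        if not matches:
            return x  # shape differs from the target, so leave the input unchanged

    index = [slice(None)] * x.ndim
    index[dim] = slice_
    return x[tuple(index)]


x = np.arange(24).reshape(2, 3, 4)
once = index_sketch(x, dim=-1, in_shape=(3, 4), slice_=slice(0, 2))
twice = index_sketch(once, dim=-1, in_shape=(3, 4), slice_=slice(0, 2))
print(once.shape, twice.shape)
```

The second call leaves the data untouched because its shape no longer matches in_shape, which is how repeated slicing is avoided in this sketch.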

`Log`

Applies the logarithmic transform with a given base.

Initialisation Arguments:

base: float = 10, base of the logarithm
idxs: list[int] | None = None, indices of the last dimension to apply the logarithm to
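
As an illustration of idxs, the sketch below applies the logarithm to selected entries of the last dimension only. It assumes that the remaining entries pass through unchanged and that idxs=None means every entry is transformed; log_sketch is a hypothetical helper, not part of netloader.

```python
import numpy as np


def log_sketch(x, base=10, idxs=None):
    """Hypothetical helper: logarithm of x, optionally on some indices of the last dimension."""
    x = np.array(x, dtype=float)  # copy so the input is not modified
    if idxs is None:
        return np.log(x) / np.log(base)

    # Only the selected entries of the last dimension are transformed
    x[..., idxs] = np.log(x[..., idxs]) / np.log(base)
    return x


data = np.array([[10.0, 5.0], [100.0, 7.0]])
print(log_sketch(data, idxs=[0]))
```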

`MinClamp`

Clamps the data so that its minimum is the smallest positive value, which is useful before the Log transform as it prevents zero or negative values.

Initialisation Arguments:

dim: int | None = None, dimension to take the minimum value over
idxs: list[int] | None = None, indices of the last dimension to apply the min clamp to
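
One way to read the clamping rule is sketched below in NumPy. This is an interpretation of the description, not the library's implementation: min_clamp_sketch is a hypothetical helper, and the idxs argument is left out for brevity.

```python
import numpy as np


def min_clamp_sketch(x, dim=None):
    """Hypothetical helper: raise all values to at least the smallest positive value."""
    # Mask out zero and negative values so they cannot be chosen as the minimum
    positive = np.where(x > 0, x, np.inf)
    min_positive = positive.min(axis=dim, keepdims=True)
    return np.maximum(x, min_positive)


data = np.array([[-1.0, 0.0, 0.5, 2.0], [3.0, 0.0, 4.0, 8.0]])
print(min_clamp_sketch(data))         # one minimum for the whole array
print(min_clamp_sketch(data, dim=1))  # one minimum per row
```

After clamping, every value is strictly positive, so a logarithm can be applied without producing infinities or NaNs.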

`MultiTransform`

Applies multiple transformations.
Can be indexed or sliced to return the indexed transform or a child MultiTransform of the sliced transforms.

Attributes:

transforms: list[BaseTransform], list of transformations

Initialisation Arguments:

*args: BaseTransform, transformations

Additional Methods:

append: Appends a transform to the list of transforms
- transform: BaseTransform, transform to append to the list of transforms
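
The composition, indexing, and append behaviour can be sketched with a small stand-in. Chain is a hypothetical class, and representing each transform as a (forward, backward) pair of functions is an assumption made to keep the example short; the point it shows is that inverting a sequence of transforms requires undoing them in reverse order.

```python
import math


class Chain:
    """Hypothetical stand-in for MultiTransform built from (forward, backward) pairs."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def __call__(self, x, back=False):
        # Forward applies the steps in order; the inverse undoes them in reverse order
        steps = reversed(self.steps) if back else self.steps
        for forward, backward in steps:
            x = backward(x) if back else forward(x)
        return x

    def __getitem__(self, idx):
        # A slice returns a child chain, an integer returns the single step
        if isinstance(idx, slice):
            return Chain(*self.steps[idx])
        return self.steps[idx]

    def append(self, step):
        self.steps.append(step)


shift = (lambda x: x + 1, lambda x: x - 1)
log10 = (math.log10, lambda x: 10 ** x)

chain = Chain(shift)
chain.append(log10)
y = chain(99)
print(y, chain(y, back=True), chain[:1](99))
```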

`Normalise`

Normalises the data to zero mean and unit variance, or between 0 and 1.

Attributes:

offset: ndarray, offset to subtract from the data
scale: ndarray, scale to divide the data by

Initialisation Arguments:

mean: bool = True, if True, the data is normalised to zero mean and unit variance; otherwise, it is normalised to between 0 and 1
dim: int | tuple[int, ...] | None = None, dimensions to normalise over, if None, all dimensions will be normalised over
offset: ndarray | None = None, offset to subtract from the data if data argument is None
scale: ndarray | None = None, scale to divide the data if data argument is None
data: ArrayLike | None = None, data to normalise with shape (N,...), where N is the number of elements

If argument data is provided, arguments offset and scale will be ignored.
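
The two normalisation modes, and how offset and scale relate to them, can be sketched in NumPy. This is a sketch under assumptions: fit_sketch is a hypothetical helper, the population standard deviation is used for the scale, and the library's exact choices may differ.

```python
import numpy as np


def fit_sketch(data, mean=True, dim=None):
    """Hypothetical helper: return (offset, scale) so that (data - offset) / scale is normalised."""
    if mean:
        # Zero mean and unit variance
        offset = data.mean(axis=dim, keepdims=True)
        scale = data.std(axis=dim, keepdims=True)
    else:
        # Range of 0 to 1
        offset = data.min(axis=dim, keepdims=True)
        scale = data.max(axis=dim, keepdims=True) - offset
    return offset, scale


data = np.array([[1.0, 10.0], [3.0, 30.0]])
uncertainty = np.array([[0.2, 2.0], [0.2, 2.0]])

offset, scale = fit_sketch(data, mean=True, dim=0)
normalised = (data - offset) / scale
normalised_uncertainty = uncertainty / scale  # a constant offset does not change the uncertainty
restored = normalised * scale + offset        # inverse transformation

offset01, scale01 = fit_sketch(data, mean=False, dim=0)
unit_range = (data - offset01) / scale01
print(normalised, unit_range, sep='\n')
```

Because the transformation is linear, the uncertainties are only divided by scale in the forward pass and multiplied by it in the inverse pass.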

`NumpyTensor`

The forward pass converts Numpy arrays to PyTorch tensors, while the backwards pass converts tensors to arrays.

Attributes:

dtype: dtype = float32, data type of the tensor

Initialisation Arguments:

dtype: dtype = float32, data type of the tensor

`Reshape`

Reshapes the data.

Initialisation Arguments:

in_shape: list[int] | None = None, original shape of the data
out_shape: list[int] | None = None, output shape of the data
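
A short NumPy sketch of how the two shapes could be used is given below. It assumes that in_shape and out_shape exclude the batch dimension, which is not stated above, and reshape_sketch is a hypothetical helper rather than the library's code.

```python
import numpy as np


def reshape_sketch(x, in_shape, out_shape, back=False):
    """Hypothetical helper: reshape every element of the batch, keeping the batch dimension."""
    # The forward pass targets out_shape, the inverse pass restores in_shape
    shape = in_shape if back else out_shape
    return x.reshape(len(x), *shape)


batch = np.arange(24).reshape(4, 6)
images = reshape_sketch(batch, in_shape=[6], out_shape=[2, 3])
flat = reshape_sketch(images, in_shape=[6], out_shape=[2, 3], back=True)
print(images.shape, flat.shape)
```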

Creating Custom Transforms

If the transforms above do not fulfil all the transformation requirements, then you can extend the BaseTransform class.

Creating `Log` from `BaseTransform`

First, the new class must inherit from BaseTransform and define the initialisation method.
The __init__ method is defined as:

from netloader.transforms import BaseTransform


class Log(BaseTransform):
    def __init__(self, base: float = 10):
        super().__init__()
        self._base: float = base

Then the forward and backward methods can be defined for tensors and arrays, where the inverse of $f(x)=\log_b{x}$ is $x=b^{f(x)}$, where $b$ is the base:

from typing import TypeVar
from types import ModuleType

import torch
import numpy as np
from torch import Tensor
from numpy import ndarray

ArrayLike = TypeVar('ArrayLike', ndarray, Tensor)


def forward(self, x: ArrayLike) -> ArrayLike:
    module: ModuleType = torch if isinstance(x, Tensor) else np
    return module.log(x) / np.log(self._base)

def backward(self, x: ArrayLike) -> ArrayLike:
    return self._base ** x

For uncertainty propagation, forward_grad and backward_grad can be defined, where the forward uncertainty propagation is $\sigma_f\approx\left|\frac{\sigma_x}{x\ln{b}}\right|$ and the inverse is $\sigma_x\approx\left|\sigma_f x\ln{b}\right|$, with $x=b^{f(x)}$:

def forward_grad(
        self,
        x: ArrayLike,
        uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
    module: ModuleType = torch if isinstance(x, Tensor) else np
    return self(x), module.abs(uncertainty / (x * np.log(self._base)))

def backward_grad(
        self,
        x: ArrayLike,
        uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
    module: ModuleType = torch if isinstance(x, Tensor) else np
    x = self(x, back=True)
    return x, module.abs(uncertainty * x * np.log(self._base))
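
As a quick numerical check of the two propagation formulas, the scalar sketch below uses only the standard library. The function names log_forward_grad and log_backward_grad are hypothetical and mirror the methods above for a single value instead of an array or tensor.

```python
import math


def log_forward_grad(x, uncertainty, base=10.0):
    """Forward pass of log_b(x) with first-order uncertainty propagation."""
    return math.log(x, base), abs(uncertainty / (x * math.log(base)))


def log_backward_grad(y, uncertainty, base=10.0):
    """Inverse pass b ** y with first-order uncertainty propagation."""
    x = base ** y
    return x, abs(uncertainty * x * math.log(base))


y, sigma_y = log_forward_grad(100.0, 5.0)
x, sigma_x = log_backward_grad(y, sigma_y)
print(y, sigma_y)
print(x, sigma_x)
```

Up to floating-point error, the round trip recovers both the original value of 100 and its uncertainty of 5, since the two formulas are inverses of each other.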

Finally, for safe saving of the transform state, __getstate__ and __setstate__ can be defined so that only tensors, primitive types, and dictionaries are saved, such as the base of the logarithm as a float:

def __getstate__(self):
    return {'base': self._base}

def __setstate__(self, state):
    self._base = state['base']
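
The state round trip can be sketched without netloader. LogState below is a hypothetical stand-in that keeps only the state handling; the two hooks are called directly, in the same order that unpickling an object would use them.

```python
class LogState:
    """Hypothetical stand-in holding only the state-handling part of the Log transform."""

    def __init__(self, base=10.0):
        self._base = base

    def __getstate__(self):
        # Only plain, serialisable values go into the state
        return {'base': self._base}

    def __setstate__(self, state):
        self._base = state['base']


original = LogState(base=2.0)
state = original.__getstate__()

# Rebuild without calling __init__, as unpickling does, then restore the state
restored = LogState.__new__(LogState)
restored.__setstate__(state)
print(state, restored._base)
```

Since __init__ is skipped when the object is rebuilt, __setstate__ has to restore every attribute the transform needs.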

Therefore, the full class is:

from typing import TypeVar
from types import ModuleType

import torch
import numpy as np
from torch import Tensor
from numpy import ndarray
from netloader.transforms import BaseTransform

ArrayLike = TypeVar('ArrayLike', ndarray, Tensor)


class Log(BaseTransform):
    def __init__(self, base: float = 10):
        super().__init__()
        self._base: float = base

    def __getstate__(self):
        return {'base': self._base}

    def __setstate__(self, state):
        self._base = state['base']

    def forward(self, x: ArrayLike) -> ArrayLike:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return module.log(x) / np.log(self._base)

    def backward(self, x: ArrayLike) -> ArrayLike:
        return self._base ** x

    def forward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return self(x), module.abs(uncertainty / (x * np.log(self._base)))

    def backward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        x = self(x, back=True)
        return x, module.abs(uncertainty * x * np.log(self._base))
