5. Transformations
As described in section 2. Network Architectures and Training, BaseNetwork objects expect any data that should be
normalised to already be normalised when it is returned from the dataset's __getitem__.
To normalise the data, netloader.transforms provides several child classes of BaseTransform that can apply a
transformation, invert it, and propagate uncertainties through both the forward and inverse passes.
Currently, the supported transformations are:
- Index: Slices the input along a given dimension, assuming the input meets the required shape
- Log: Logarithmic transform
- MinClamp: Clamps the minimum value to be the smallest positive value
- MultiTransform: Applies multiple transformations
- Normalise: Normalises the data to zero mean and unit variance, or between 0 and 1
- NumpyTensor: Converts Numpy arrays to PyTorch tensors
- Reshape: Reshapes the data
Example code:
import numpy as np

from netloader import transforms
from src.data import CustomDataset

# Create dataset
dataset = CustomDataset()

# Convert Numpy arrays to tensors, then apply a base-10 logarithm
transform = transforms.MultiTransform(transforms.NumpyTensor(), transforms.Log())

# Transform the data and propagate the uncertainties
transformed_data, transformed_uncertainties = transform(
    dataset.data,
    uncertainty=dataset.uncertainties,
)

# Invert the transformation
untransformed_data, untransformed_uncertainties = transform(
    transformed_data,
    back=True,
    uncertainty=transformed_uncertainties,
)

# Compare with a tolerance, as the round trip is only exact up to floating-point error
assert np.allclose(untransformed_data, dataset.data, rtol=1e-4)
assert np.allclose(untransformed_uncertainties, dataset.uncertainties, rtol=1e-4)

BaseTransform is the parent class of all transforms, so its methods define the interface that every transform
builds on.
- forward: Forward pass of the transformation
  - x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
  - return: ArrayLike, transformed array or tensor of shape (N,...)
- backward: Backwards pass to invert the transformation
  - x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
  - return: ArrayLike, untransformed array or tensor of shape (N,...)
- forward_grad: Forward pass of the transformation and uncertainty propagation
  - x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
  - uncertainty: ArrayLike, uncertainty of the input array or tensor of shape (N,...)
  - return: tuple[ArrayLike, ArrayLike], transformed array or tensor of shape (N,...) and transformed uncertainty of shape (N,...)
- backward_grad: Backwards pass to invert the transformation and uncertainty propagation
  - x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
  - uncertainty: ArrayLike, uncertainty of the input array or tensor of shape (N,...)
  - return: tuple[ArrayLike, ArrayLike], untransformed array or tensor of shape (N,...) and untransformed uncertainty of shape (N,...)
These methods do not need to be called explicitly: the magic method __call__ calls the corresponding method
depending on whether back is True and whether uncertainty is not None.
- __call__: Calling function returns the forward, backwards or uncertainty propagation of the transformation
  - x: ArrayLike, input array or tensor of shape (N,...), where N is the number of elements
  - back: bool = False, if the inverse transformation should be applied
  - uncertainty: ArrayLike | None = None, corresponding uncertainties for the input data for uncertainty propagation of shape (N,...)
  - return: ArrayLike | tuple[ArrayLike, ArrayLike], transformed array or tensor of shape (N,...) and propagated uncertainties of shape (N,...) if provided
- __repr__: Representation of the transformation
  - return: str, representation string
- __getstate__: Returns a dictionary containing the state of the transformation for pickling
  - return: dict[str, Any], dictionary containing the state of the transformation
- __setstate__: Sets the state of the transformation for pickling
  - state: dict[str, Any], dictionary containing the state of the transformation
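The dispatch that __call__ performs can be sketched as follows. This is a minimal illustration under stated assumptions, not netloader's implementation: SketchTransform is a hypothetical stand-in for BaseTransform that works on plain floats, and the order of the checks is an assumption.

```python
import math


class SketchTransform:
    """Hypothetical stand-in for BaseTransform; not netloader's implementation."""

    def forward(self, x):
        return math.log10(x)

    def backward(self, x):
        return 10 ** x

    def forward_grad(self, x, uncertainty):
        return self.forward(x), abs(uncertainty / (x * math.log(10)))

    def backward_grad(self, x, uncertainty):
        x = self.backward(x)
        return x, abs(uncertainty * x * math.log(10))

    def __call__(self, x, back=False, uncertainty=None):
        # Choose one of the four methods from the two optional arguments
        if back and uncertainty is not None:
            return self.backward_grad(x, uncertainty)
        if back:
            return self.backward(x)
        if uncertainty is not None:
            return self.forward_grad(x, uncertainty)
        return self.forward(x)


transform = SketchTransform()
y = transform(100.0)                                   # calls forward
y, y_err = transform(100.0, uncertainty=1.0)           # calls forward_grad
x = transform(y, back=True)                            # calls backward
x, x_err = transform(y, back=True, uncertainty=y_err)  # calls backward_grad
```

Note that the return type changes with the arguments: a single value without uncertainty, and a tuple of value and uncertainty with it.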
Index: Slices the input along a given dimension.
- dim: int = -1, dimension to slice over
- in_shape: tuple[int, ...] | None = None, target shape ignoring the batch size; the slice is only applied if the input has this shape, which prevents repeated slicing. If any dimension has a size of -1, the size of that dimension is ignored
- slice_: slice = slice(None), slicing object
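To illustrate how the in_shape guard prevents repeated slicing, here is a minimal NumPy sketch. The function index_forward is hypothetical and only mirrors the parameter descriptions above; the actual behaviour of Index may differ.

```python
import numpy as np


def index_forward(x, dim=-1, in_shape=None, slice_=slice(None)):
    """Hypothetical helper: slice x along dim only if its shape matches in_shape."""
    if in_shape is not None:
        # Compare every non-batch dimension, skipping those marked with -1
        matches = len(in_shape) == x.ndim - 1 and all(
            target in (-1, size) for target, size in zip(in_shape, x.shape[1:])
        )
        if not matches:
            return x  # Shape differs, so assume the slice was already applied

    # Build an index that is slice(None) everywhere except along dim
    idx = [slice(None)] * x.ndim
    idx[dim] = slice_
    return x[tuple(idx)]


x = np.arange(24).reshape(2, 3, 4)
sliced = index_forward(x, dim=-1, in_shape=(3, 4), slice_=slice(0, 2))
again = index_forward(sliced, dim=-1, in_shape=(3, 4), slice_=slice(0, 2))
```

The second call leaves the array unchanged because its shape no longer matches in_shape.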
Log: Applies the logarithmic transform with a given base.
- base: float = 10, base of the logarithm
- idxs: list[int] = None, indices to slice the last dimension to perform the log on
MinClamp: Clamps the minimum value to be the smallest positive value. This is useful before the Log transform, as it
prevents zero or negative values.
- dim: int = None, dimension to take the minimum value over
- idxs: list[int] = None, indices to slice the last dimension to perform the min clamp on
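The clamping step can be sketched with NumPy as below. The function min_clamp is a hypothetical helper written for illustration, assuming that non-positive values are raised to the smallest positive value found over dim; it is not the library's code and it omits the idxs option.

```python
import numpy as np


def min_clamp(x, dim=None):
    """Hypothetical helper: raise non-positive values to the smallest positive value."""
    # Ignore non-positive entries when searching for the minimum
    positive = np.where(x > 0, x, np.inf)
    min_positive = np.min(positive, axis=dim, keepdims=dim is not None)
    return np.maximum(x, min_positive)


x = np.array([[0.0, 2.0, 5.0], [-1.0, 0.5, 3.0]])
clamped_all = min_clamp(x)          # one minimum (0.5) for the whole array
clamped_rows = min_clamp(x, dim=1)  # one minimum per row (2.0 and 0.5)
log_x = np.log10(clamped_all)       # finite everywhere after clamping
```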
MultiTransform: Applies multiple transformations.
It can be indexed to return a single transform, or sliced to return a child MultiTransform of the sliced transforms.
Attributes:
- transforms: list[BaseTransform], list of transformations
Parameters:
- *args: BaseTransform, transformations
Methods:
- append: Appends a transform to the list of transforms
  - transform: BaseTransform, transform to append to the list of transforms
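The composition, indexing, and append behaviour described above can be sketched with a small stand-in. SketchMulti is hypothetical and operates on pairs of plain functions rather than BaseTransform objects; applying the transforms in reverse order on the backward pass is an assumption, made because a chain can only be inverted by undoing the last step first.

```python
import math


class SketchMulti:
    """Hypothetical stand-in for MultiTransform acting on plain callables."""

    def __init__(self, *args):
        # Each entry is a (forward, backward) pair of functions
        self.transforms = list(args)

    def __call__(self, x, back=False):
        if back:
            # Undo the most recently applied transform first
            for _, backward in reversed(self.transforms):
                x = backward(x)
        else:
            for forward, _ in self.transforms:
                x = forward(x)
        return x

    def __getitem__(self, idx):
        # A slice returns a child SketchMulti, an integer returns a single pair
        if isinstance(idx, slice):
            return SketchMulti(*self.transforms[idx])
        return self.transforms[idx]

    def append(self, transform):
        self.transforms.append(transform)


shift = (lambda x: x + 1.0, lambda x: x - 1.0)
log10 = (math.log10, lambda x: 10 ** x)

multi = SketchMulti(shift)
multi.append(log10)
y = multi(99.0)          # log10(99 + 1)
x = multi(y, back=True)  # 10 ** y - 1
first_only = multi[:1]   # child containing only the shift
```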
Normalise: Normalises the data to zero mean and unit variance, or between 0 and 1.
Attributes:
- offset: ndarray, offset to subtract from the data
- scale: ndarray, scale to divide the data by
Parameters:
- mean: bool = True, if the data should be normalised to zero mean and unit variance, otherwise between 0 and 1
- dim: int | tuple[int, ...] | None = None, dimensions to normalise over; if None, all dimensions will be normalised over
- offset: ndarray | None = None, offset to subtract from the data if the data argument is None
- scale: ndarray | None = None, scale to divide the data by if the data argument is None
- data: ArrayLike | None = None, data to normalise with shape (N,...), where N is the number of elements
If the data argument is provided, the offset and scale arguments are ignored.
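As a rough illustration of how an offset and scale could be derived from data, the NumPy sketch below covers both normalisation modes. The function fit_normalise is hypothetical and only follows the parameter descriptions above; how Normalise computes its values internally is an assumption.

```python
import numpy as np


def fit_normalise(data, mean=True, dim=None):
    """Hypothetical helper: derive an offset and scale from the data."""
    if mean:
        # Zero mean and unit variance
        offset = np.mean(data, axis=dim, keepdims=True)
        scale = np.std(data, axis=dim, keepdims=True)
    else:
        # Map the minimum to 0 and the maximum to 1
        offset = np.min(data, axis=dim, keepdims=True)
        scale = np.max(data, axis=dim, keepdims=True) - offset
    return offset, scale


data = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
offset, scale = fit_normalise(data, mean=False, dim=0)
normalised = (data - offset) / scale           # forward pass
restored = normalised * scale + offset         # backward pass
uncertainty = np.full_like(data, 0.5) / scale  # uncertainties only rescale
```

Because the transform is linear, the offset has no effect on the uncertainties; only the scale does.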
NumpyTensor: The forward pass converts Numpy arrays to PyTorch tensors, while the backwards pass converts tensors to
arrays.
Attributes:
- dtype: dtype = float32, data type of the tensor
Parameters:
- dtype: dtype = float32, data type of the tensor
Reshape: Reshapes the data.
- in_shape: list[int] | None = None, original shape of the data
- out_shape: list[int] | None = None, output shape of the data
If the transforms above do not fulfil all the transformation requirements, you can extend the BaseTransform class.
First, the new class must inherit from BaseTransform and define the initialisation method.
The __init__ method is defined as:
from netloader.transforms import BaseTransform

class Log(BaseTransform):
    def __init__(self, base: float = 10):
        super().__init__()
        self._base: float = base

Then the forward and backward methods can be defined for both tensors and arrays, where the inverse of
$y = \log_b(x)$ is $x = b^y$:
from typing import TypeVar
from types import ModuleType

import torch
import numpy as np
from torch import Tensor
from numpy import ndarray

ArrayLike = TypeVar('ArrayLike', ndarray, Tensor)

    def forward(self, x: ArrayLike) -> ArrayLike:
        # Use the library that matches the input type
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return module.log(x) / np.log(self._base)

    def backward(self, x: ArrayLike) -> ArrayLike:
        return self._base ** x

For uncertainty propagation, forward_grad and backward_grad can be defined, where the uncertainty propagation is
$\sigma_y = \left|\sigma_x / (x \ln b)\right|$ for the forward pass and $\sigma_x = \left|\sigma_y \, x \ln b\right|$
for the backward pass:
    def forward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return self(x), module.abs(uncertainty / (x * np.log(self._base)))

    def backward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        # Invert the data first, as the uncertainty depends on the untransformed value
        x = self(x, back=True)
        return x, module.abs(uncertainty * x * np.log(self._base))

Finally, for safe saving of the transform state, __getstate__ and __setstate__ can be defined so that only tensors,
primitive types, and dictionaries are saved, such as the numeric base of the logarithm:
    def __getstate__(self):
        return {'base': self._base}

    def __setstate__(self, state):
        self._base = state['base']

Therefore, the full class is:
from typing import TypeVar
from types import ModuleType

import torch
import numpy as np
from torch import Tensor
from numpy import ndarray
from netloader.transforms import BaseTransform

ArrayLike = TypeVar('ArrayLike', ndarray, Tensor)


class Log(BaseTransform):
    def __init__(self, base: float = 10):
        super().__init__()
        self._base: float = base

    def __getstate__(self):
        return {'base': self._base}

    def __setstate__(self, state):
        self._base = state['base']

    def forward(self, x: ArrayLike) -> ArrayLike:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return module.log(x) / np.log(self._base)

    def backward(self, x: ArrayLike) -> ArrayLike:
        return self._base ** x

    def forward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        return self(x), module.abs(uncertainty / (x * np.log(self._base)))

    def backward_grad(
            self,
            x: ArrayLike,
            uncertainty: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
        module: ModuleType = torch if isinstance(x, Tensor) else np
        x = self(x, back=True)
        return x, module.abs(uncertainty * x * np.log(self._base))
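
The uncertainty terms returned by forward_grad and backward_grad in the class above are consistent with first-order (linear) error propagation. The derivation below is a sketch; the symbols $b$ for the base, $x$ and $y$ for the untransformed and transformed values, and $\sigma_x$ and $\sigma_y$ for their uncertainties are notation chosen here for illustration.

```latex
\begin{align}
y &= \log_b(x) = \frac{\ln x}{\ln b}, &
\sigma_y &= \left|\frac{\mathrm{d}y}{\mathrm{d}x}\right| \sigma_x
          = \left|\frac{\sigma_x}{x \ln b}\right|, \\
x &= b^{y}, &
\sigma_x &= \left|\frac{\mathrm{d}x}{\mathrm{d}y}\right| \sigma_y
          = \left|\sigma_y \, x \ln b\right|.
\end{align}
```

For example, with $b = 10$, $x = 100$, and $\sigma_x = 1$, the forward pass gives $y = 2$ and $\sigma_y = 1 / (100 \ln 10) \approx 0.0043$, and the backward pass recovers $\sigma_x = 1$.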