At the moment, the hyper-parameters (number of epochs, data augmentation, batch size, and even the network size) are not re-optimized as new labeled patches are added to the training set.
We chose conservative hyper-parameters, tuned on the initial training set size, so they may no longer be optimal as the training set grows.
We could consider re-optimizing the hyper-parameters after each active learning iteration.
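Re-optimizing after each iteration could be sketched as follows. This is a minimal toy illustration, not the method used here: `evaluate` is a hypothetical stand-in for "train with these hyper-parameters and return a validation score", and the grid, set sizes, and acquisition step are all made up for the example.

```python
import random

def evaluate(hparams, train_set, val_set):
    # Hypothetical stand-in: train a model with hparams on train_set,
    # return validation accuracy. Here a deterministic toy score so the
    # sketch runs end-to-end without a real model.
    random.seed(len(train_set) + hparams["epochs"] * hparams["batch_size"])
    return random.random()

def tune(train_set, val_set, grid):
    # Re-optimize hyper-parameters on the CURRENT labeled set (grid search).
    best_score, best_hp = -1.0, None
    for hp in grid:
        score = evaluate(hp, train_set, val_set)
        if score > best_score:
            best_score, best_hp = score, hp
    return best_hp

# Active-learning loop: after each acquisition round, re-tune before retraining.
grid = [{"epochs": e, "batch_size": b} for e in (10, 30) for b in (16, 64)]
labeled, unlabeled = list(range(100)), list(range(100, 1000))
val_set = list(range(1000, 1100))
for round_idx in range(3):
    hp = tune(labeled, val_set, grid)  # hyper-params are re-chosen each round
    # ... train the final model with hp, query new patches to label, then:
    newly_labeled, unlabeled = unlabeled[:50], unlabeled[50:]
    labeled += newly_labeled
```

Logging the hyper-parameters chosen at each round would also make it easier to see whether the optimum actually drifts as the labeled set grows.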
Question: how can we fairly compare against the baseline if the hyper-parameters change between iterations?