We want to compare against two baselines for the loss ell() + mu*R(): a feedforward mu controller (exponential warm-up) and a fixed mu. See the YAML file below:
https://github.com/marrlab/DomainLab/blob/fbopt/examples/benchmark/benchmark_fbopt_mnist_jigen.yaml
It would be interesting to see in TensorBoard how each of them behaves with respect to the ell loss and the R loss.
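
For reference, here is a minimal sketch of the two baseline schedules for mu. It is only an illustration, not the DomainLab implementation: the function names, the `mu_init` starting value, and the geometric form of the warm-up are all assumptions made here.

```python
def mu_fixed(mu_target, step, total_steps):
    """Fixed baseline: mu stays at its target value for the whole run."""
    return mu_target


def mu_exp_warmup(mu_target, step, total_steps, mu_init=1e-3):
    """Feedforward baseline (assumed form): mu grows geometrically
    from mu_init at step 0 to mu_target at total_steps, independent
    of the observed losses."""
    if total_steps <= 0:
        return mu_target
    # Clamp training progress to [0, 1] so mu stays at mu_target afterwards.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return mu_init * (mu_target / mu_init) ** frac


def penalized_loss(ell, reg, mu):
    """Combined objective ell() + mu * R() for given scalar loss values."""
    return ell + mu * reg


if __name__ == "__main__":
    # Print both schedules side by side at a few steps.
    for step in (0, 25, 50, 75, 100):
        print(step, mu_fixed(1.0, step, 100), mu_exp_warmup(1.0, step, 100))
```

Logging ell, R, and mu as separate scalars at each step (rather than only the combined loss) would make the curves of the different controllers directly comparable in TensorBoard.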