It is thus impossible to use transpose(copy_to_cpu(m)). This is not a big issue, but it can sometimes be surprising.