During my testing I haven't been able to get MUMPS to use more than a single core. Following the instructions in the MUMPS documentation, I set the environment variable OMP_NUM_THREADS=12 before starting Julia. On the same system, exporting OMP_NUM_THREADS in this way does get Pardiso to use more cores. What else could I try to get MUMPS to run the factorization and solve steps in parallel?
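
For reference, here is a minimal sketch of the setup described above. The exact commands are an assumption (a POSIX shell, with julia on the PATH), not a transcript of my session. It sets the variable, checks that child processes inherit it, and, if Julia is available, prints what Julia itself sees.

```shell
# Set the OpenMP thread count for this shell session (12 is the value I used).
export OMP_NUM_THREADS=12

# Confirm that the variable is exported to child processes.
sh -c 'echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"'

# Assumption: julia is on the PATH. If so, check that Julia sees the same value.
if command -v julia >/dev/null 2>&1; then
    julia -e 'println(get(ENV, "OMP_NUM_THREADS", "unset"))'
fi
```

This only confirms that the variable reaches the Julia process. It does not show whether the MUMPS library being loaded was built with OpenMP support.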