The comparison script compiles arraycopy() without optimization (-O0), both with and without the VSX load/store instructions. It then runs perf record against both versions, and perf diff shows the difference in execution time between them:
$ ./compare.sh
-DVSX : compile with VSX load/store instructions.
-DMEMCPY: compile with memcpy() only, no arraycopy() incorporated.
-DCHECK : check that the destination data is equal to the source data. Do not use this when profiling, since the extra comparison distorts the measurements.
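
As an illustration only, the sketch below shows one way the three flags could select code paths at compile time. The original arraycopy() source is not shown here, so the signature, the wrapper arraycopy_checked(), and the scalar fallback are all assumptions. The VSX path uses the GCC built-ins vec_vsx_ld() and vec_vsx_st() from altivec.h and is additionally guarded by __VSX__, so the file still builds on machines without VSX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(VSX) && defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical signature: the real arraycopy() may take other arguments. */
static void arraycopy(unsigned char *dst, const unsigned char *src, size_t n)
{
#if defined(MEMCPY)
    /* -DMEMCPY: delegate everything to the C library. */
    memcpy(dst, src, n);
#else
    size_t i = 0;
#if defined(VSX) && defined(__VSX__)
    /* -DVSX: copy 16 bytes per iteration with VSX load/store. */
    for (; i + 16 <= n; i += 16) {
        vector unsigned char v = vec_vsx_ld(0, src + i);
        vec_vsx_st(v, 0, dst + i);
    }
#endif
    /* Scalar loop: copies the tail, or the whole array without VSX. */
    for (; i < n; i++)
        dst[i] = src[i];
#endif
}

/* Hypothetical wrapper: verifies the copy only when built with -DCHECK. */
static void arraycopy_checked(unsigned char *dst, const unsigned char *src,
                              size_t n)
{
    arraycopy(dst, src, n);
#if defined(CHECK)
    /* Extra pass over both buffers; leave this out when profiling. */
    assert(memcmp(dst, src, n) == 0);
#endif
}
```

A build along the lines of `gcc -O0 -mvsx -DVSX -DCHECK file.c` (file name hypothetical) would enable the VSX path together with the verification step, while omitting all three flags leaves only the scalar loop.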