Commit 3350d21

Fix shared memory Table size and accesses reporting in GUPS benchmark (#60)
* Fix shared memory Table size and accesses reporting in GUPS benchmark

  Based on feedback from rkarim2 in issue #56:
  - For shared memory tests, Table size now shows the actual total shared memory used
  - Display shared memory allocation details (bytes per block × number of blocks)
  - Use correct accesses_per_elem_sh for shared memory tests
  - Report size in MB for shared memory vs GB for global memory

  This fixes the misleading Table size output that showed irrelevant global memory sizes (e.g., 4.2 GB) when running shared memory tests.

* Remove comment per reviewer feedback

  Removed the comment on line 497 as requested by rkarim2 in PR review.
1 parent 72d62ec commit 3350d21

1 file changed, 23 additions and 8 deletions

posts/gups/gups.cu

Lines changed: 23 additions & 8 deletions
@@ -484,14 +484,29 @@ int main(int argc, char* argv[])
     }
     size_t total_num_thread = thread * grid;
 
-    printf(
-        "Table size = %zu (%lf GB.)\nTotal number of threads %zu\nEach thread "
-        "access %d locations.\nNumber of iterations = %d\n",
-        working_set,
-        working_set * sizeof(benchtype) / 1e9,
-        total_num_thread,
-        accesses_per_elem,
-        repeats);
+    if (!shared_mem) {
+        printf(
+            "Table size = %zu (%lf GB.)\nTotal number of threads %zu\nEach thread "
+            "access %d locations.\nNumber of iterations = %d\n",
+            working_set,
+            working_set * sizeof(benchtype) / 1e9,
+            total_num_thread,
+            accesses_per_elem,
+            repeats);
+    } else {
+        size_t total_shmem = grid * n_shmem * sizeof(benchtype);
+        printf(
+            "Table size = %zu (%lf MB.) [shared memory: %zu bytes per block x %zu blocks]\n"
+            "Total number of threads %zu\nEach thread "
+            "access %d locations.\nNumber of iterations = %d\n",
+            total_shmem / sizeof(benchtype),
+            total_shmem / 1e6,
+            n_shmem * sizeof(benchtype),
+            grid,
+            total_num_thread,
+            accesses_per_elem_sh,
+            repeats);
+    }
 
     benchtype* d_t;
     if (!shared_mem)
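
The shared-memory branch reports the table size as bytes per block times the number of blocks. The following host-only sketch works through that arithmetic on its own. It is an illustration, not the code in gups.cu: the helper names `total_shmem_bytes` and `shmem_table_line` are hypothetical, `benchtype` is assumed here to be a 64-bit unsigned integer (its real definition is not shown in this diff), and `grid` and `n_shmem` are assumed to be `size_t` so that they match the `%zu` conversions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: stand-in for the benchmark's element type, which this diff
// does not define.
using benchtype = std::uint64_t;

// Hypothetical helper (not part of gups.cu): total shared memory used across
// the grid, in bytes. Mirrors `grid * n_shmem * sizeof(benchtype)` in the diff.
std::size_t total_shmem_bytes(std::size_t grid, std::size_t n_shmem) {
    return grid * n_shmem * sizeof(benchtype);
}

// Hypothetical helper: formats the first line of the shared-memory report.
// Every argument paired with %zu is a size_t, so no casts are needed here;
// if the real variables were int, they would have to be cast first.
std::string shmem_table_line(std::size_t grid, std::size_t n_shmem) {
    const std::size_t bytes = total_shmem_bytes(grid, n_shmem);
    char buf[256];
    const int n = std::snprintf(
        buf, sizeof(buf),
        "Table size = %zu (%lf MB.) "
        "[shared memory: %zu bytes per block x %zu blocks]",
        bytes / sizeof(benchtype),          // table size in elements
        static_cast<double>(bytes) / 1e6,   // table size in MB (decimal)
        n_shmem * sizeof(benchtype),        // bytes of shared memory per block
        grid);                              // number of blocks
    // Guard against an encoding error or a truncated line.
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    return std::string(buf);
}
```

Under these assumptions, `shmem_table_line(4, 1024)` describes 4 blocks of 1024 eight-byte elements each: 8192 bytes per block and 32768 bytes (0.032768 MB) in total, which is why the shared-memory branch reports MB where the global-memory branch reports GB.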
