Hey,
I found your repo via your blog post here: https://ernestyalumni.wordpress.com/2017/09/28/bringing-cuda-into-the-year-2011-c11-smart-pointers-with-cuda-cub-nccl-streams-and-cuda-unified-memory-management-with-cub-and-cublas/
I see you're using code like this:
// Allocate problem device arrays
auto deleter=[&](float* ptr){ cudaFree(ptr); };
std::shared_ptr<float> d_sh_in(new float[Lx], deleter);
cudaMalloc((void **) &d_sh_in, Lx * sizeof(float));
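This code is very dangerous. &d_sh_in is the address of the std::shared_ptr object itself, not of a float*, so you can't cast it to void** and pass it to cudaMalloc. cudaMalloc then writes the device address over the shared_ptr's internal fields, which is undefined behavior. It may appear to work, but on common implementations the deleter still runs on the original pointer from new float[Lx], so the device memory is never freed (this comes down to internal implementation details of std::shared_ptr).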
You need to use something like:
// Allocate problem device arrays
float* dPtr = nullptr;
cudaMalloc(&dPtr, Lx * sizeof(float));
// The deleter needs no captures; it only forwards the pointer to cudaFree
std::shared_ptr<float> d_sh_in(dPtr, [](float* ptr){ cudaFree(ptr); });
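To make the ownership hand-off easier to try out, here is a minimal sketch of the same pattern that runs without a GPU. fake_malloc, fake_free, live_allocations and make_array are hypothetical names invented for this illustration; the first two only mimic the out-parameter style of cudaMalloc and cudaFree and are not part of CUDA. In real code you would call cudaMalloc and cudaFree in their place and check the returned cudaError_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-ins for cudaMalloc/cudaFree (assumption: not CUDA API).
// Like cudaMalloc, fake_malloc writes the new address through an out-parameter
// and returns 0 on success.
static int live_allocations = 0;

int fake_malloc(void** out, std::size_t bytes) {
    *out = std::malloc(bytes);
    if (*out == nullptr) {
        return 1;  // allocation failed
    }
    ++live_allocations;
    return 0;
}

int fake_free(void* ptr) {
    if (ptr != nullptr) {
        std::free(ptr);
        --live_allocations;
    }
    return 0;
}

// Allocate into a raw pointer first, then give that pointer to shared_ptr
// together with a deleter that calls the matching free function.
std::shared_ptr<float> make_array(std::size_t count) {
    assert(count > 0);  // precondition chosen for this sketch
    void* raw = nullptr;
    if (fake_malloc(&raw, count * sizeof(float)) != 0) {
        return nullptr;  // empty shared_ptr signals failure
    }
    // shared_ptr now owns `raw`; the deleter runs when the last copy goes away.
    return std::shared_ptr<float>(static_cast<float*>(raw),
                                  [](float* p) { fake_free(p); });
}
```

The point of the design is that the allocation function never sees the smart pointer: it only fills in a plain pointer, and the shared_ptr is constructed afterwards from that value, so the control block and the deleter agree on which address to release.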