Hi,
I've noticed the following bug. When I create a shared clone of a SoftMaxTree node of size X, I end up with 1.5 * X memory usage instead of X.
Minimal working example for BillionWord hierarchy from dp package:
require 'torch'; require 'nn'; require 'nnx';
require 'cutorch'; require 'cunn'
require 'dp'
function report_gpu_usage(comment)
local props = cutorch.getDeviceProperties(cutorch.getDevice())
local usage = props["totalGlobalMem"] - props["freeGlobalMem"]
print(string.format("GPU usage (%s): %.1fMB", comment, usage / 1024 ^ 2))
end
dataset = dp.BillionWords()
report_gpu_usage("initial")
sm = nn.SoftMaxTree(100, dataset:hierarchy() , dataset:rootId()):cuda()
report_gpu_usage("created softmax")
sm_sh_clone = sm:sharedClone()
report_gpu_usage("shared clone")
sm_clone = sm:clone()
report_gpu_usage("full clone")
collectgarbage()
report_gpu_usage("collectgarbage")Output:
GPU usage (initial): 80.4MB
GPU usage (created softmax): 772.7MB
GPU usage (shared clone): 1112.1MB
GPU usage (full clone): 1804.4MB
GPU usage (collectgarbage): 1804.4MB
As you can see, the shared clone requires an extra 1112 - 772 = 340MB. That's roughly half the size of the original module (772 - 80 = 692MB), even though a shared clone should allocate almost nothing beyond the original.
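To make the accounting explicit, here is the same arithmetic on the reported figures (plain Python, just the deltas between the printed GPU usage numbers):

```python
# Reported GPU usage at each step, in MB
initial, created, shared, full = 80.4, 772.7, 1112.1, 1804.4

module_size  = created - initial  # ~692.3 MB: the SoftMaxTree itself
shared_extra = shared - created   # ~339.4 MB: overhead of sharedClone (should be ~0)
full_extra   = full - shared      # ~692.3 MB: expected for a full, unshared clone

print(module_size, shared_extra, full_extra)
print(shared_extra / module_size)  # ~0.49, i.e. the shared clone costs ~half a module
```

So a module plus its "shared" clone occupies about 1.5x the memory of the module alone, matching the 1.5 * X claim above.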