Conversation

@DNA386 (Contributor) commented on Sep 16, 2025

Detach the tensors returned in the training loops so that the computation graph can be freed properly after each batch.
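
For context, a minimal PyTorch sketch of the pattern this change addresses; the loop and names below are illustrative, not the project's actual training code.

```python
import torch

def train_epoch(model, loader, optimizer, loss_fn):
    batch_losses = []
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        # Detach before storing: an attached `loss` tensor keeps the
        # autograd graph of its batch alive, so collecting such tensors
        # across an epoch retains every batch's graph in memory.
        batch_losses.append(loss.detach())
    return torch.stack(batch_losses).mean()
```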

@y-richie-y (Collaborator) commented:
As far as I know, .item() already converts the loss to a Python scalar, detaching the value from the computation graph, so .detach() is redundant. Once loss goes out of scope, it can be garbage collected. Can you provide more information or explain why this resolves the memory leak?
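
For reference, a generic PyTorch illustration of that point (not the project's code): .item() on a zero-dimensional tensor returns a plain Python number that holds no reference to the autograd graph.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
loss = (x * 3).sum()        # 0-dim tensor, still attached to the graph
print(loss.requires_grad)   # True
print(type(loss.item()))    # <class 'float'> -- a plain Python float,
                            # carrying no reference to the graph
```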
