Line 31 in 83a5f17:
function ReinforceCategorical:updateGradInput(input, gradOutput)
Hi,
I am trying to understand the logic in the REINFORCE implementation. I am new to this, so please bear with my basic questions.
Why is gradOutput ignored? Would it be wrong to multiply gradOutput by the rewards?
Also, what is happening in self.gradInput:copy(self.output)? Isn't the output a probability distribution?
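To make the second question concrete, here is my current understanding of the gradient, which may well be wrong. It assumes that input holds the class probabilities p, that self.output is a one-hot encoding of the sampled class x rather than the probabilities themselves, and that R is the reward. The symbols p, x, f and R are my own notation, not taken from the code.

```latex
% Assumed notation: p = class probabilities (input), x = sampled class,
% f = categorical probability mass function, R = reward.
\[
f(x; p) = p_x
\qquad\Longrightarrow\qquad
\frac{\partial \ln f(x; p)}{\partial p_i} =
\begin{cases}
  1 / p_i & \text{if } i = x \\
  0       & \text{otherwise}
\end{cases}
\]

% Score-function (REINFORCE) estimate from a single sample x.
\[
\frac{\partial\, \mathbb{E}[R]}{\partial p_i}
  = \mathbb{E}\!\left[ R \, \frac{\partial \ln f(x; p)}{\partial p_i} \right]
  \approx R \, \frac{\mathbb{1}[i = x]}{p_i}
\]
```

If that reading is right, copying a one-hot self.output would supply the indicator term, and the estimate would depend only on the reward and the sample, which might be why gradOutput is not needed.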
Thanks,
Parag