gradOutput Ignored

https://github.com/torch/rnn/blob/83a5f17e9f4fa469a3e910d0ec75833239d7ffdf/ReinforceCategorical.lua#L31
Hi,
I am trying to understand the logic in the REINFORCE implementation. I am new to this, so please bear with my basic questions.
Why is gradOutput being ignored? Would it be wrong to multiply gradOutput by the reward?
Also, what is happening in self.gradInput:copy(self.output)? The output is a probability distribution, right?

Thanks,
Parag
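
For readers following the question, here is a minimal sketch of the score-function (REINFORCE) gradient for a categorical sample, written in plain Python rather than Torch. It rests on assumptions that the post itself does not confirm: that the module's input is the probability vector, that its output is a one-hot sample drawn from it, and that the reward stands in for the gradient that would normally arrive from downstream (sampling is not differentiable, so a gradOutput would carry no usable signal). The function name `reinforce_categorical_grad` and the `eps` parameter are hypothetical and used only for illustration.

```python
def reinforce_categorical_grad(probs, sampled_index, reward, eps=1e-8):
    """Sketch of a REINFORCE gradient w.r.t. categorical probabilities.

    Assumes probs is the module input, sampled_index is the class that was
    sampled (the position of the 1 in the one-hot output), and reward is a
    scalar. This is an illustration, not the library's code.
    """
    grad = []
    for i, p in enumerate(probs):
        # One-hot encoding of the sample: 1 at the sampled class, else 0.
        one_hot = 1.0 if i == sampled_index else 0.0
        # d ln f(x; p) / d p_i = x_i / p_i (eps guards against division by 0).
        dlogp = one_hot / (p + eps)
        # Scale by the reward and negate so that gradient descent on the
        # input increases the probability of rewarded samples.
        grad.append(-reward * dlogp)
    return grad


probs = [0.2, 0.5, 0.3]
grad = reinforce_categorical_grad(probs, sampled_index=1, reward=2.0)
print(grad)
```

Under these assumptions, copying the one-hot output into the gradient buffer and dividing by the input probabilities would give x_i / p_i, which is nonzero only at the sampled class.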

gradOutput Ignored #43