1 parent d464858 commit 5fff874
intermediate_source/reinforcement_q_learning.py
@@ -344,7 +344,7 @@ def select_action(state):
     steps_done += 1
     if sample > eps_threshold:
         with torch.no_grad():
-            # t.max(1) will return largest value for column of each row.
+            # t.max(1) will return largest column value of each row.
             # second column on max result is index of where max element was
             # found, so we pick action with the larger expected reward.
             return policy_net(state).max(1)[1].view(1, 1)
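
To illustrate what the reworded comment describes, here is a minimal sketch of how `max(1)` behaves on a small tensor. The tensor contents and the names `q_values` and `single_state_q` are made up for this example and are not part of the tutorial; only `max(1)[1].view(1, 1)` comes from the diff.

```python
import torch

# Hypothetical Q-values: 2 states (rows) x 3 actions (columns).
q_values = torch.tensor([[0.1, 0.7, 0.2],
                         [0.9, 0.3, 0.5]])

# max(1) reduces over dimension 1 (the columns), so it yields one
# result per row as a (values, indices) pair.
result = q_values.max(1)
print(result.values)   # largest Q-value in each row
print(result.indices)  # column (action) index of that value in each row

# In select_action the batch holds a single state, so element [1] of the
# pair is a one-element index tensor; view(1, 1) reshapes it to (1, 1).
single_state_q = q_values[:1]
action = single_state_q.max(1)[1].view(1, 1)
print(action.shape)
```

Indexing the pair with `[1]` is what the comment calls the "second column": it picks the indices rather than the values, which is why the returned tensor identifies an action and not a Q-value.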