Update reinforcement_q_learning: small grammar fix (#423)

KarthikNayak · soumith · commit 5fff87419e15 · 2019-01-26T19:10:08.000-05:00
The current explanation is a little confusing; I had to read the paragraph twice to understand it. This is a _very_ small change, but it should improve readability.
diff --git a/intermediate_source/reinforcement_q_learning.py b/intermediate_source/reinforcement_q_learning.py
@@ -344,7 +344,7 @@ def select_action(state):
     steps_done += 1
     if sample > eps_threshold:
         with torch.no_grad():
-            # t.max(1) will return largest value for column of each row.
+            # t.max(1) will return largest column value of each row.
             # second column on max result is index of where max element was
             # found, so we pick action with the larger expected reward.
             return policy_net(state).max(1)[1].view(1, 1)
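
For reviewers who want to check the reworded comment, here is a minimal sketch of what `max(1)` returns on a 2-D tensor. The Q-values below are made up for illustration and are not taken from the tutorial. Note that the result is a (values, indices) pair, so `[1]` selects the indices, which is what the comment's "second column on max result" appears to refer to.

```python
import torch

# Hypothetical Q-values: a batch of 2 states with 3 actions each.
q_values = torch.tensor([[0.1, 0.7, 0.2],
                         [0.9, 0.3, 0.5]])

# max(1) reduces over dim 1 (the columns), so it yields one result per row:
# the largest value in that row and the column index where it was found.
values, indices = q_values.max(1)

# For a single state (batch of one), picking the index and reshaping it
# to (1, 1) mirrors the expression used in select_action.
single_state_q = torch.tensor([[0.1, 0.7, 0.2]])
action = single_state_q.max(1)[1].view(1, 1)
```

Here `indices` holds the greedy action for each row (1 for the first row, 0 for the second), and `action` is a 1x1 tensor containing 1.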