* We build the tree bottom up
* Tag the root nodes (the words of the sentence)
* From there, use a neural network and the embeddings
  of the words to find combinations that form constituents. Whenever you
  form a new constituent, use some sort of technique to get an embedding
  of the constituent. In this case, our network architecture will depend
  completely on the input sentence. In the sentence "The green cat
  scratched the wall", at some point in the model, we will want to combine
  the span :math:`(i,j,r) = (1, 3, \text{NP})` (that is, an NP constituent
  spans word 1 to word 3, in this case "The green cat").

However, another sentence might be "Somewhere, the big fat cat scratched
the wall". In this sentence, we will want to form the constituent
:math:`(2, 5, \text{NP})` (the span "the big fat cat") at some point.
# - Write the recurrence for the viterbi variable at step i for tag k.
# - Modify the above recurrence to compute the forward variables instead.
# - Modify again the above recurrence to compute the forward variables in
#   log-space (hint: log-sum-exp); one way to write these out is sketched
#   just below.
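#
# As a sketch in notation of our own choosing (not the tutorial's), write
# :math:`\delta_i(k)` for the viterbi variable and :math:`\alpha_i(k)` for
# the log-space forward variable at step :math:`i` for tag :math:`k`, with
# transition score :math:`T_{k',k}` and emission score :math:`E_i(k)`. Then
#
# .. math:: \delta_i(k) = E_i(k) + \max_{k'} \left[ \delta_{i-1}(k') + T_{k',k} \right]
#
# and the forward variable simply replaces the max with a sum, computed in
# log-space as
#
# .. math:: \alpha_i(k) = E_i(k) + \log \sum_{k'} \exp\left( \alpha_{i-1}(k') + T_{k',k} \right)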
#
# If you can do those three things, you should be able to understand the
# code below. Recall that the CRF computes a conditional probability. Let
# :math:`y` be a tag sequence and :math:`x` an input sequence of words.
# Then we compute
#
# .. math:: P(y|x) = \frac{\exp(\text{Score}(x, y))}{\sum_{y'} \exp(\text{Score}(x, y'))}
torch.manual_seed(1)


#####################################################################
# Helper functions to make the code more readable.


def to_scalar(var):
    # returns a python float
    return var.view(-1).data.tolist()[0]
# Compute log sum exp in a numerically stable way for the forward algorithm
def log_sum_exp(vec):
    max_score = vec[0, argmax(vec)]
    max_score_broadcast = max_score.view(1, -1).expand(1, vec.size()[1])
    return max_score + \
        torch.log(torch.sum(torch.exp(vec - max_score_broadcast)))
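
# A quick sanity check (ours, not part of the tutorial; the variable names
# are illustrative): on well-scaled inputs, where no overflow occurs, the
# stable version agrees with the direct computation.

_check = torch.randn(1, 5)
_direct = torch.log(torch.sum(torch.exp(_check)))
assert abs(to_scalar(_direct) - to_scalar(log_sum_exp(_check))) < 1e-4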


#####################################################################
# Create model

class BiLSTM_CRF(nn.Module):

    def forward(self, sentence):  # don't confuse this with _forward_alg above.
        # Get the emission scores from the BiLSTM.
        lstm_feats = self._get_lstm_features(sentence)
        # Find the best path, given the features.
        score, tag_seq = self._viterbi_decode(lstm_feats)
        return score, tag_seq


#####################################################################
# Run training


START_TAG = "<START>"
STOP_TAG = "<STOP>"
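
# A minimal usage sketch (ours, not the tutorial's; the constructor
# signature, the hyperparameter names, and helpers like
# ``prepare_sequence`` are assumptions based on the full file)::
#
#     model = BiLSTM_CRF(len(word_to_ix), tag_to_ix, EMBEDDING_DIM, HIDDEN_DIM)
#     with torch.no_grad():
#         score, tag_seq = model(prepare_sequence(sentence, word_to_ix))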