First implement parser_transitions.py, then parser_model.py, and finally run run.py to put everything together.
1. parser_transitions.py
1.1 The PartialParse class
```python
class PartialParse(object):
    def __init__(self, sentence):
        """Initializes this partial parse.

        @param sentence (list of str): The sentence to be parsed as a list of words.
                                       Your code should not modify the sentence.
        """
        # The sentence being parsed is kept for bookkeeping purposes. Do not alter it in your code.
        self.sentence = sentence

        ### YOUR CODE HERE (3 Lines)
        ### Your code should initialize the following fields:
        ###     self.stack: The current stack represented as a list with the top of the stack as the
        ###                 last element of the list.
        ###     self.buffer: The current buffer represented as a list with the first item on the
        ###                  buffer as the first item of the list
        ###     self.dependencies: The list of dependencies produced so far. Represented as a list of
        ###             tuples where each tuple is of the form (head, dependent).
        ###             Order for this list doesn't matter.
        ###
        ### Note: The root token should be represented with the string "ROOT"
        ###
        self.stack = ['ROOT']
        self.buffer = self.sentence.copy()
        self.dependencies = []
        ### END YOUR CODE

    def parse_step(self, transition):
        """Performs a single parse step by applying the given transition to this partial parse

        @param transition (str): A string that equals "S", "LA", or "RA" representing the shift,
                                 left-arc, and right-arc transitions. You can assume the provided
                                 transition is a legal transition.
        """
        ### YOUR CODE HERE (~7-10 Lines)
        ### TODO:
        ###     Implement a single parsing step, i.e. the logic for the following as
        ###     described in the pdf handout:
        ###         1. Shift
        ###         2. Left Arc
        ###         3. Right Arc
        if transition == 'S':
            self.stack.append(self.buffer.pop(0))
        elif transition == 'LA':
            dependent = self.stack.pop(-2)
            self.dependencies.append((self.stack[-1], dependent))
        elif transition == 'RA':
            dependent = self.stack.pop()
            self.dependencies.append((self.stack[-1], dependent))
        ### END YOUR CODE

    def parse(self, transitions):
        """Applies the provided transitions to this PartialParse

        @param transitions (list of str): The list of transitions in the order they should be applied

        @return dependencies (list of string tuples): The list of dependencies produced when
                                                      parsing the sentence. Represented as a list of
                                                      tuples where each tuple is of the form (head, dependent).
        """
        for transition in transitions:
            self.parse_step(transition)
        return self.dependencies
```
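A quick sanity trace helps internalize the three transitions before moving on. The snippet below is illustrative driver code (not part of the assignment files) that prints the stack, buffer, and dependencies after each step on a toy sentence:

```python
# Illustrative only: trace the parser state transition by transition.
pp = PartialParse(["the", "cat", "sat"])
for t in ["S", "S", "LA", "S", "LA", "RA"]:
    pp.parse_step(t)
    print(t, pp.stack, pp.buffer, pp.dependencies)
# Final state: stack == ['ROOT'], buffer == [], and
# dependencies == [('cat', 'the'), ('sat', 'cat'), ('ROOT', 'sat')]
```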
1.2 The minibatch_parse function
```python
def minibatch_parse(sentences, model, batch_size):
    """Parses a list of sentences in minibatches using a model.

    @param sentences (list of list of str): A list of sentences to be parsed
                                            (each sentence is a list of words and each word is of type string)
    @param model (ParserModel): The model that makes parsing decisions. It is assumed to have a function
                                model.predict(partial_parses) that takes in a list of PartialParses as input and
                                returns a list of transitions predicted for each parse. That is, after calling
                                    transitions = model.predict(partial_parses)
                                transitions[i] will be the next transition to apply to partial_parses[i].
    @param batch_size (int): The number of PartialParses to include in each minibatch

    @return dependencies (list of dependency lists): A list where each element is the dependencies
                                                     list for a parsed sentence. Ordering should be the
                                                     same as in sentences (i.e., dependencies[i] should
                                                     contain the parse for sentences[i]).
    """
    dependencies = []

    ### YOUR CODE HERE (~8-10 Lines)
    ### TODO:
    ###     Implement the minibatch parse algorithm as described in the pdf handout
    ###
    ###     Note: A shallow copy (as denoted in the PDF) can be made with the "=" sign in python,
    ###           e.g. unfinished_parses = partial_parses[:].
    ###           Here `unfinished_parses` is a shallow copy of `partial_parses`.
    ###           In Python, a shallow copied list like `unfinished_parses` does not contain new instances
    ###           of the object stored in `partial_parses`. Rather both lists refer to the same objects.
    ###           In our case, `partial_parses` contains a list of partial parses. `unfinished_parses`
    ###           contains references to the same objects. Thus, you should NOT use the `del` operator
    ###           to remove objects from the `unfinished_parses` list. This will free the underlying memory that
    ###           is being accessed by `partial_parses` and may cause your code to crash.
    partial_parses = [PartialParse(s) for s in sentences]  # initialize a PartialParse for each sentence
    unfinished_parses = partial_parses.copy()

    while unfinished_parses:
        minibatch = unfinished_parses[:batch_size]
        transitions = model.predict(minibatch)
        for i, parse in enumerate(minibatch):  # apply one transition to each parse in the minibatch
            parse.parse_step(transitions[i])
            if len(parse.stack) == 1 and not parse.buffer:
                unfinished_parses.remove(parse)  # filter out parses that are complete

    dependencies = [parse.dependencies for parse in partial_parses]  # collect all dependency lists
    ### END YOUR CODE

    return dependencies
```
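The shallow-copy note above is worth seeing in isolation. Here is a minimal standalone sketch (not part of the assignment) showing that `unfinished_parses` and `partial_parses` share the same `PartialParse` objects, so mutations are visible through both lists while removals only affect the copy:

```python
# Minimal sketch of the shallow-copy behavior minibatch_parse relies on.
a = [PartialParse(["hi"]), PartialParse(["there"])]
b = a.copy()            # new list, but it references the SAME PartialParse objects
b[0].parse_step("S")    # mutates the shared object ...
print(a[0].stack)       # ... so the change is visible via `a` too: ['ROOT', 'hi']
b.remove(b[0])          # removing from `b` does not touch `a`
print(len(a), len(b))   # 2 1
```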
1.3 Testing

```python
import sys  # needed by the __main__ dispatch at the bottom of this file


def test_step(name, transition, stack, buf, deps,
              ex_stack, ex_buf, ex_deps):
    """Tests that a single parse step returns the expected output"""
    pp = PartialParse([])
    pp.stack, pp.buffer, pp.dependencies = stack, buf, deps

    pp.parse_step(transition)
    stack, buf, deps = (tuple(pp.stack), tuple(pp.buffer), tuple(sorted(pp.dependencies)))
    assert stack == ex_stack, \
        "{:} test resulted in stack {:}, expected {:}".format(name, stack, ex_stack)
    assert buf == ex_buf, \
        "{:} test resulted in buffer {:}, expected {:}".format(name, buf, ex_buf)
    assert deps == ex_deps, \
        "{:} test resulted in dependency list {:}, expected {:}".format(name, deps, ex_deps)
    print("{:} test passed!".format(name))


def test_parse_step():
    """Simple tests for the PartialParse.parse_step function
    Warning: these are not exhaustive
    """
    test_step("SHIFT", "S", ["ROOT", "the"], ["cat", "sat"], [],
              ("ROOT", "the", "cat"), ("sat",), ())
    test_step("LEFT-ARC", "LA", ["ROOT", "the", "cat"], ["sat"], [],
              ("ROOT", "cat",), ("sat",), (("cat", "the"),))
    test_step("RIGHT-ARC", "RA", ["ROOT", "run", "fast"], [], [],
              ("ROOT", "run",), (), (("run", "fast"),))


def test_parse():
    """Simple tests for the PartialParse.parse function
    Warning: these are not exhaustive
    """
    sentence = ["parse", "this", "sentence"]
    dependencies = PartialParse(sentence).parse(["S", "S", "S", "LA", "RA", "RA"])
    dependencies = tuple(sorted(dependencies))
    expected = (('ROOT', 'parse'), ('parse', 'sentence'), ('sentence', 'this'))
    assert dependencies == expected, \
        "parse test resulted in dependencies {:}, expected {:}".format(dependencies, expected)
    assert tuple(sentence) == ("parse", "this", "sentence"), \
        "parse test failed: the input sentence should not be modified"
    print("parse test passed!")


class DummyModel(object):
    """Dummy model for testing the minibatch_parse function
    """
    def __init__(self, mode="unidirectional"):
        self.mode = mode

    def predict(self, partial_parses):
        if self.mode == "unidirectional":
            return self.unidirectional_predict(partial_parses)
        elif self.mode == "interleave":
            return self.interleave_predict(partial_parses)
        else:
            raise NotImplementedError()

    def unidirectional_predict(self, partial_parses):
        """First shifts everything onto the stack and then does exclusively right arcs if the first word of
        the sentence is "right", left arcs otherwise.
        """
        # Use ==, not `is`, for string comparison.
        return [("RA" if pp.stack[1] == "right" else "LA") if len(pp.buffer) == 0 else "S"
                for pp in partial_parses]

    def interleave_predict(self, partial_parses):
        """First shifts everything onto the stack and then interleaves "right" and "left".
        """
        return [("RA" if len(pp.stack) % 2 == 0 else "LA") if len(pp.buffer) == 0 else "S"
                for pp in partial_parses]


def test_dependencies(name, deps, ex_deps):
    """Tests the provided dependencies match the expected dependencies"""
    deps = tuple(sorted(deps))
    assert deps == ex_deps, \
        "{:} test resulted in dependency list {:}, expected {:}".format(name, deps, ex_deps)


def test_minibatch_parse():
    """Simple tests for the minibatch_parse function
    Warning: these are not exhaustive
    """

    # Unidirectional arcs test
    sentences = [["right", "arcs", "only"],
                 ["right", "arcs", "only", "again"],
                 ["left", "arcs", "only"],
                 ["left", "arcs", "only", "again"]]
    deps = minibatch_parse(sentences, DummyModel(), 2)
    test_dependencies("minibatch_parse", deps[0],
                      (('ROOT', 'right'), ('arcs', 'only'), ('right', 'arcs')))
    test_dependencies("minibatch_parse", deps[1],
                      (('ROOT', 'right'), ('arcs', 'only'), ('only', 'again'), ('right', 'arcs')))
    test_dependencies("minibatch_parse", deps[2],
                      (('only', 'ROOT'), ('only', 'arcs'), ('only', 'left')))
    test_dependencies("minibatch_parse", deps[3],
                      (('again', 'ROOT'), ('again', 'arcs'), ('again', 'left'), ('again', 'only')))

    # Out-of-bound test
    sentences = [["right"]]
    deps = minibatch_parse(sentences, DummyModel(), 2)
    test_dependencies("minibatch_parse", deps[0], (('ROOT', 'right'),))

    # Mixed arcs test
    sentences = [["this", "is", "interleaving", "dependency", "test"]]
    deps = minibatch_parse(sentences, DummyModel(mode="interleave"), 1)
    test_dependencies("minibatch_parse", deps[0],
                      (('ROOT', 'is'), ('dependency', 'interleaving'),
                       ('dependency', 'test'), ('is', 'dependency'), ('is', 'this')))
    print("minibatch_parse test passed!")


if __name__ == '__main__':
    args = sys.argv
    if len(args) != 2:
        raise Exception("You did not provide a valid keyword. Either provide 'part_c' or 'part_d', when executing this script")
    elif args[1] == "part_c":
        test_parse_step()
        test_parse()
    elif args[1] == "part_d":
        test_minibatch_parse()
    else:
        raise Exception("You did not provide a valid keyword. Either provide 'part_c' or 'part_d', when executing this script")
```
Tip: to pass command-line arguments when running in Spyder, open Run -> Per-file run configuration, check "Command line options", and type the arguments into the adjacent field. From a terminal, the equivalent is e.g. `python parser_transitions.py part_c`.
With argument part_c:
```
SHIFT test passed!
LEFT-ARC test passed!
RIGHT-ARC test passed!
parse test passed!
```
With argument part_d: `minibatch_parse test passed!`
2. parser_model.py
In essence this builds a small feedforward network: an embedding layer, a hidden layer with ReLU activation, and an output layer whose logits are fed to a softmax via the cross-entropy loss.
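For reference, these layers implement the handout's equations, using the handout's symbols $\mathbf{W}$, $\mathbf{b}_1$, $\mathbf{U}$, $\mathbf{b}_2$ for the trainable weights:

$$
\mathbf{h} = \mathrm{ReLU}(\mathbf{x}\mathbf{W} + \mathbf{b}_1), \qquad
\hat{\mathbf{y}} = \mathrm{softmax}(\mathbf{h}\mathbf{U} + \mathbf{b}_2), \qquad
J(\theta) = \mathrm{CE}(\mathbf{y}, \hat{\mathbf{y}}) = -\sum_{i=1}^{3} y_i \log \hat{y}_i
$$

where $\mathbf{x}$ is the concatenation of the 36 feature embeddings and the sum runs over the three transition classes.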
2.1 The ParserModel class
```python
class ParserModel(nn.Module):
    """ Feedforward neural network with an embedding layer and a single hidden layer.
    The ParserModel will predict which transition should be applied to a
    given partial parse configuration.

    PyTorch Notes:
        - Note that "ParserModel" is a subclass of the "nn.Module" class. In PyTorch all neural networks
            are a subclass of this "nn.Module".
        - The "__init__" method is where you define all the layers and parameters
            (embedding layers, linear layers, dropout layers, etc.).
        - "__init__" gets automatically called when you create a new instance of your class, e.g.
            when you write "m = ParserModel()".
        - Other methods of ParserModel can access variables that have "self." prefix. Thus,
            you should add the "self." prefix to layers, values, etc. that you want to utilize
            in other ParserModel methods.
        - For further documentation on "nn.Module" please see https://pytorch.org/docs/stable/nn.html.
    """
    def __init__(self, embeddings, n_features=36,
                 hidden_size=200, n_classes=3, dropout_prob=0.5):
        """ Initialize the parser model.

        @param embeddings (ndarray): word embeddings (num_words, embedding_size)
        @param n_features (int): number of input features
        @param hidden_size (int): number of hidden units
        @param n_classes (int): number of output classes
        @param dropout_prob (float): dropout probability
        """
        super(ParserModel, self).__init__()
        self.n_features = n_features      # number of input features per configuration
        self.n_classes = n_classes        # number of output classes
        self.dropout_prob = dropout_prob
        self.embed_size = embeddings.shape[1]
        self.hidden_size = hidden_size    # number of hidden units
        self.embeddings = nn.Parameter(torch.tensor(embeddings))

        ### YOUR CODE HERE (~10 Lines)
        ### TODO:
        ###     1) Declare `self.embed_to_hidden_weight` and `self.embed_to_hidden_bias` as `nn.Parameter`.
        ###        Initialize weight with the `nn.init.xavier_uniform_` function and bias with `nn.init.uniform_`
        ###        with default parameters.
        ###     2) Construct `self.dropout` layer.
        ###     3) Declare `self.hidden_to_logits_weight` and `self.hidden_to_logits_bias` as `nn.Parameter`.
        ###        Initialize weight with the `nn.init.xavier_uniform_` function and bias with `nn.init.uniform_`
        ###        with default parameters.
        ###
        ### Note: Trainable variables are declared as `nn.Parameter` which is a commonly used API
        ###       to include a tensor into a computational graph to support updating w.r.t its gradient.
        ###       Here, we use Xavier Uniform Initialization for our Weight initialization.
        ###       It has been shown empirically, that this provides better initial weights
        ###       for training networks than random uniform initialization.
        ###       For more details checkout this great blogpost:
        ###       http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
        ###
        ### Please see the following docs for support:
        ###     nn.Parameter: https://pytorch.org/docs/stable/nn.html#parameters
        ###     Initialization: https://pytorch.org/docs/stable/nn.init.html
        ###     Dropout: https://pytorch.org/docs/stable/nn.html#dropout-layers

        # Using nn.Linear instead of raw nn.Parameter tensors; its .weight and .bias
        # are nn.Parameter instances, so the effect is the same.
        self.embed_to_hidden = nn.Linear(self.n_features * self.embed_size, self.hidden_size)
        nn.init.xavier_uniform_(self.embed_to_hidden.weight)
        nn.init.uniform_(self.embed_to_hidden.bias)

        self.dropout = nn.Dropout(self.dropout_prob)

        self.hidden_to_logits = nn.Linear(self.hidden_size, self.n_classes)
        nn.init.xavier_uniform_(self.hidden_to_logits.weight)
        nn.init.uniform_(self.hidden_to_logits.bias)

        ### END YOUR CODE

    def embedding_lookup(self, w):  # select the rows of the embedding matrix indexed by w
        """ Utilize `w` to select embeddings from embedding matrix `self.embeddings`
            @param w (Tensor): input tensor of word indices (batch_size, n_features)

            @return x (Tensor): tensor of embeddings for words represented in w
                                (batch_size, n_features * embed_size)
        """
        ### YOUR CODE HERE (~1-3 Lines)
        ### TODO:
        ###     1) For each index `i` in `w`, select `i`th vector from self.embeddings
        ###     2) Reshape the tensor using `view` function if necessary
        ###
        ### Note: All embedding vectors are stacked and stored as a matrix. The model receives
        ###       a list of indices representing a sequence of words, then it calls this lookup
        ###       function to map indices to sequence of embeddings.
        ###
        ###       This problem aims to test your understanding of embedding lookup,
        ###       so DO NOT use any high level API like nn.Embedding
        ###       (we are asking you to implement that!). Pay attention to tensor shapes
        ###       and reshape if necessary. Make sure you know each tensor's shape before you run the code!
        ###
        ### Pytorch has some useful APIs for you, and you can use either one
        ### in this problem (except nn.Embedding). These docs might be helpful:
        ###     Index select: https://pytorch.org/docs/stable/torch.html#torch.index_select
        ###     Gather: https://pytorch.org/docs/stable/torch.html#torch.gather
        ###     View: https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view

        x = self.embeddings[w]      # (batch_size, n_features, embed_size)
        x = x.view(x.shape[0], -1)  # (batch_size, n_features * embed_size)
        ### END YOUR CODE

        return x

    def forward(self, w):
        """ Run the model forward.

            Note that we will not apply the softmax function here because it is included in the loss function nn.CrossEntropyLoss

            PyTorch Notes:
                - Every nn.Module object (PyTorch model) has a `forward` function.
                - When you apply your nn.Module to an input tensor `w` this function is applied to the tensor.
                    For example, if you created an instance of your ParserModel and applied it to some `w` as follows,
                    the `forward` function would be called on `w` and the result would be stored in the `output` variable:
                        model = ParserModel()
                        output = model(w)  # this calls the forward function
                - For more details checkout: https://pytorch.org/docs/stable/nn.html#torch.nn.Module.forward

        @param w (Tensor): input tensor of tokens (batch_size, n_features)

        @return logits (Tensor): tensor of predictions (output after applying the layers of the network)
                                 without applying softmax (batch_size, n_classes)
        """
        ### YOUR CODE HERE (~3-5 lines)
        ### TODO:
        ###     Complete the forward computation as described in write-up. In addition, include a dropout layer
        ###     as declared in `__init__` after the ReLU function.
        ###
        ### Note: We do not apply the softmax to the logits here, because
        ### the loss function (torch.nn.CrossEntropyLoss) applies it more efficiently.
        ###
        ### Please see the following docs for support:
        ###     Matrix product: https://pytorch.org/docs/stable/torch.html#torch.matmul
        ###     ReLU: https://pytorch.org/docs/stable/nn.html?highlight=relu#torch.nn.functional.relu
        x = self.embedding_lookup(w)  # (batch_size, n_features * embed_size)
        h = self.embed_to_hidden(x)   # (batch_size, hidden_size)
        h = F.relu(h)
        h = self.dropout(h)
        logits = self.hidden_to_logits(h)
        ### END YOUR CODE

        return logits
```
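As a side note, the fancy indexing `self.embeddings[w]` in `embedding_lookup` is equivalent to the `torch.index_select` route suggested in the starter comments. A standalone sketch with made-up sizes to confirm the two agree:

```python
import torch

# Made-up sizes: vocabulary of 10 words, embed_size 4, batch of 2 with 3 features each.
E = torch.arange(40, dtype=torch.float32).view(10, 4)   # embedding matrix (10, 4)
w = torch.tensor([[1, 0, 5], [2, 2, 9]])                # word indices (2, 3)

x1 = E[w].view(w.shape[0], -1)                                   # fancy indexing, as in embedding_lookup
x2 = torch.index_select(E, 0, w.view(-1)).view(w.shape[0], -1)   # index_select route
print(x1.shape, torch.equal(x1, x2))                    # torch.Size([2, 12]) True
```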
2.2 Testing
```python
if __name__ == "__main__":

    parser = argparse.ArgumentParser(description='Simple sanity check for parser_model.py')
    parser.add_argument('-e', '--embedding', action='store_true', help='sanity check for embedding_lookup function')
    parser.add_argument('-f', '--forward', action='store_true', help='sanity check for forward function')
    args = parser.parse_args()

    embeddings = np.zeros((100, 30), dtype=np.float32)
    model = ParserModel(embeddings)

    def check_embedding():
        inds = torch.randint(0, 100, (4, 36), dtype=torch.long)
        selected = model.embedding_lookup(inds)
        assert np.all(selected.data.numpy() == 0), "The result of embedding lookup: " \
                                                   + repr(selected) + " contains non-zero elements."

    def check_forward():
        inputs = torch.randint(0, 100, (4, 36), dtype=torch.long)
        out = model(inputs)
        expected_out_shape = (4, 3)
        assert out.shape == expected_out_shape, "The result shape of forward is: " + repr(out.shape) + \
                                                " which doesn't match expected " + repr(expected_out_shape)

    if args.embedding:
        check_embedding()
        print("Embedding_lookup sanity check passes!")

    if args.forward:
        check_forward()
        print("Forward sanity check passes!")
```
With argument -e: `Embedding_lookup sanity check passes!`
With argument -f: `Forward sanity check passes!`
3. run.py
3.1 The train function
Defines the Adam optimizer and the cross-entropy loss function.
```python
def train(parser, train_data, dev_data, output_path, batch_size=1024, n_epochs=10, lr=0.0005):
    """ Train the neural dependency parser.

    @param parser (Parser): Neural Dependency Parser
    @param train_data ():
    @param dev_data ():
    @param output_path (str): Path to which model weights and results are written.
    @param batch_size (int): Number of examples in a single batch
    @param n_epochs (int): Number of training epochs
    @param lr (float): Learning rate
    """
    best_dev_UAS = 0

    ### YOUR CODE HERE (~2-7 lines)
    ### TODO:
    ###      1) Construct Adam Optimizer in variable `optimizer`
    ###      2) Construct the Cross Entropy Loss Function in variable `loss_func` with `mean`
    ###         reduction (default)
    ###
    ### Hint: Use `parser.model.parameters()` to pass optimizer
    ###       necessary parameters to tune.
    ### Please see the following docs for support:
    ###     Adam Optimizer: https://pytorch.org/docs/stable/optim.html
    ###     Cross Entropy Loss: https://pytorch.org/docs/stable/nn.html#crossentropyloss
    optimizer = optim.Adam(parser.model.parameters(), lr=lr)  # use the `lr` argument, not a hard-coded value
    loss_func = nn.CrossEntropyLoss()

    ### END YOUR CODE

    for epoch in range(n_epochs):
        print("Epoch {:} out of {:}".format(epoch + 1, n_epochs))
        dev_UAS = train_for_epoch(parser, train_data, dev_data, optimizer, loss_func, batch_size)
        if dev_UAS > best_dev_UAS:
            best_dev_UAS = dev_UAS
            print("New best dev UAS! Saving model.")
            torch.save(parser.model.state_dict(), output_path)
        print("")
```
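One detail worth highlighting: `nn.CrossEntropyLoss` consumes raw logits and integer class indices, applying log-softmax internally, which is why `forward` returns unnormalized logits. A minimal standalone sketch with made-up numbers:

```python
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()             # reduction='mean' by default
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])      # (batch_size=2, n_classes=3), raw logits
targets = torch.tensor([0, 2])                # class indices, NOT one-hot vectors
print(loss_func(logits, targets))             # equivalent to torch.nn.functional.cross_entropy(logits, targets)
```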
3.2 The train_for_epoch function
```python
def train_for_epoch(parser, train_data, dev_data, optimizer, loss_func, batch_size):
    """ Train the neural dependency parser for a single epoch.

    Note: In PyTorch we can signify train versus test and automatically have
    the Dropout Layer applied and removed, accordingly, by specifying
    whether we are training, `model.train()`, or evaluating, `model.eval()`

    @param parser (Parser): Neural Dependency Parser
    @param train_data ():
    @param dev_data ():
    @param optimizer (nn.Optimizer): Adam Optimizer
    @param loss_func (nn.CrossEntropyLoss): Cross Entropy Loss Function
    @param batch_size (int): batch size

    @return dev_UAS (float): Unlabeled Attachment Score (UAS) for dev data
    """
    parser.model.train()  # Places model in "train" mode, i.e. apply dropout layer
    n_minibatches = math.ceil(len(train_data) / batch_size)
    loss_meter = AverageMeter()

    with tqdm(total=(n_minibatches)) as prog:
        for i, (train_x, train_y) in enumerate(minibatches(train_data, batch_size)):
            optimizer.zero_grad()  # remove any baggage in the optimizer
            loss = 0.  # store loss for this batch here
            train_x = torch.from_numpy(train_x).long()
            train_y = torch.from_numpy(train_y.nonzero()[1]).long()  # one-hot -> class indices

            ### YOUR CODE HERE (~5-10 lines)
            ### TODO:
            ###      1) Run train_x forward through model to produce `logits`
            ###      2) Use the `loss_func` parameter to apply the PyTorch CrossEntropyLoss function.
            ###         This will take `logits` and `train_y` as inputs. It will output the CrossEntropyLoss
            ###         between softmax(`logits`) and `train_y`. Remember that softmax(`logits`)
            ###         are the predictions (y^ from the PDF).
            ###      3) Backprop losses
            ###      4) Take step with the optimizer
            ### Please see the following docs for support:
            ###     Optimizer Step: https://pytorch.org/docs/stable/optim.html#optimizer-step
            logits = parser.model(train_x)
            loss = loss_func(logits, train_y)
            loss.backward()
            optimizer.step()

            ### END YOUR CODE
            prog.update(1)
            loss_meter.update(loss.item())

    print("Average Train Loss: {}".format(loss_meter.avg))

    print("Evaluating on dev set",)
    parser.model.eval()  # Places model in "eval" mode, i.e. don't apply dropout layer
    dev_UAS, _ = parser.parse(dev_data)
    print("- dev UAS: {:.2f}".format(dev_UAS * 100.0))
    return dev_UAS
```
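The line `train_y.nonzero()[1]` above performs the one-hot to class-index conversion that `nn.CrossEntropyLoss` requires: `nonzero()` returns a tuple of (row indices, column indices), and the column index of each one-hot row is its class id. A tiny standalone demonstration with made-up labels:

```python
import numpy as np

train_y = np.array([[0, 1, 0],    # class 1
                    [1, 0, 0],    # class 0
                    [0, 0, 1]])   # class 2
print(train_y.nonzero()[1])       # [1 0 2]
```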
3.3 Testing
```python
if __name__ == "__main__":
    debug = args.debug

    assert (torch.__version__.split(".") >= ["1", "0", "0"]), "Please install torch version >= 1.0.0"

    print(80 * "=")
    print("INITIALIZING")
    print(80 * "=")
    parser, embeddings, train_data, dev_data, test_data = load_and_preprocess_data(debug)

    start = time.time()
    model = ParserModel(embeddings)
    parser.model = model
    print("took {:.2f} seconds\n".format(time.time() - start))

    print(80 * "=")
    print("TRAINING")
    print(80 * "=")
    output_dir = "results/{:%Y%m%d_%H%M%S}/".format(datetime.now())
    output_path = output_dir + "model.weights"

    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    train(parser, train_data, dev_data, output_path, batch_size=1024, n_epochs=10, lr=0.0005)

    if not debug:
        print(80 * "=")
        print("TESTING")
        print(80 * "=")
        print("Restoring the best model weights found on the dev set")
        parser.model.load_state_dict(torch.load(output_path))
        print("Final evaluation on test set",)
        parser.model.eval()
        UAS, dependencies = parser.parse(test_data)
        print("- test UAS: {:.2f}".format(UAS * 100.0))
        print("Done!")
```
```
================================================================================
INITIALIZING
================================================================================
Loading data...
took 3.21 seconds
Building parser...
took 1.41 seconds
Loading pretrained embeddings...
took 4.72 seconds
Vectorizing data...
took 1.84 seconds
Preprocessing training data...
took 56.26 seconds
0%| | 0/1848 [00:00<?, ?it/s]took 0.01 seconds
================================================================================
TRAINING
================================================================================
Epoch 1 out of 10
100%|██████████| 1848/1848 [03:54<00:00, 7.89it/s]
Average Train Loss: 0.1767250092627553
Evaluating on dev set
1445850it [00:00, 29785971.49it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 84.95
New best dev UAS! Saving model.
Epoch 2 out of 10
100%|██████████| 1848/1848 [03:47<00:00, 8.12it/s]
Average Train Loss: 0.1093398475922741
Evaluating on dev set
1445850it [00:00, 16996881.76it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 86.25
New best dev UAS! Saving model.
Epoch 3 out of 10
100%|██████████| 1848/1848 [04:21<00:00, 7.08it/s]
Average Train Loss: 0.09525894280926232
Evaluating on dev set
1445850it [00:00, 19918067.29it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 87.61
New best dev UAS! Saving model.
Epoch 4 out of 10
100%|██████████| 1848/1848 [04:41<00:00, 6.56it/s]
Average Train Loss: 0.0863158397126572
Evaluating on dev set
1445850it [00:00, 19792149.63it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 87.84
New best dev UAS! Saving model.
Epoch 5 out of 10
100%|██████████| 1848/1848 [04:18<00:00, 7.14it/s]
Average Train Loss: 0.07947457737986853
Evaluating on dev set
1445850it [00:00, 17618584.60it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 87.97
New best dev UAS! Saving model.
Epoch 6 out of 10
100%|██████████| 1848/1848 [04:16<00:00, 7.21it/s]
Average Train Loss: 0.07394953573614488
Evaluating on dev set
1445850it [00:00, 16607289.49it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 88.50
New best dev UAS! Saving model.
Epoch 7 out of 10
100%|██████████| 1848/1848 [04:50<00:00, 6.36it/s]
Average Train Loss: 0.06902175460583268
Evaluating on dev set
1445850it [00:00, 17512350.78it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 88.31
Epoch 8 out of 10
100%|██████████| 1848/1848 [04:46<00:00, 6.45it/s]
Average Train Loss: 0.06499305106014078
Evaluating on dev set
1445850it [00:00, 11467000.04it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 88.43
Epoch 9 out of 10
100%|██████████| 1848/1848 [04:43<00:00, 6.53it/s]
Average Train Loss: 0.06139369089220897
Evaluating on dev set
1445850it [00:00, 14742517.17it/s]
0%| | 0/1848 [00:00<?, ?it/s]- dev UAS: 88.36
Epoch 10 out of 10
100%|██████████| 1848/1848 [04:32<00:00, 6.78it/s]
Average Train Loss: 0.05800683121700888
Evaluating on dev set
1445850it [00:00, 24076379.68it/s]
- dev UAS: 88.11
================================================================================
TESTING
================================================================================
Restoring the best model weights found on the dev set
Final evaluation on test set
2919736it [00:00, 30076750.78it/s]
- test UAS: 88.73
Done!
```