TensorFlow Notes 3: the CRF function tf.contrib.crf.crf_log_likelihood()


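While reading through some training code I came across the function tf.contrib.crf.crf_log_likelihood, so I wanted to get a basic understanding of it.

Purpose: compute the loss of a CRF layer. The training criterion is maximum likelihood estimation: the function returns the log-likelihood of the gold tag sequence, and training minimizes its negative.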
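
For reference, the quantity being maximized can be written out as below. This is a sketch of the usual linear-chain CRF objective, not a quotation from the TensorFlow documentation. The symbols are notation introduced here: P stands for the unary scores (`inputs`), A for the transition matrix (`transition_params`), T for the true sequence length, and y for the gold tag sequence. It assumes no separate start or end transition terms.

```latex
s(x, y) = \sum_{t=1}^{T} P_{t,\, y_t} + \sum_{t=2}^{T} A_{y_{t-1},\, y_t}

\log p(y \mid x) = s(x, y) - \log \sum_{y'} \exp s(x, y')

\mathrm{loss} = -\log p(y \mid x)
```

The sum over y' runs over every possible tag sequence of length T, which is why the normalizer is computed with dynamic programming rather than by enumeration.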

Usage:

tf.contrib.crf.crf_log_likelihood(inputs, tag_indices, sequence_lengths, transition_params=None)
See the guide: CRF (contrib)

Computes the log-likelihood of tag sequences in a CRF.

Args:
inputs: A [batch_size, max_seq_len, num_tags] tensor of unary potentials to use as input to the CRF layer.
tag_indices: A [batch_size, max_seq_len] matrix of tag indices for which we compute the log-likelihood.
sequence_lengths: A [batch_size] vector of true sequence lengths.
transition_params: A [num_tags, num_tags] transition matrix, if available.
Returns:
log_likelihood: A [batch_size] tensor containing the log-likelihood of each example, given its sequence of tag indices. (Older versions of the documentation describe this as a scalar.)
transition_params: A [num_tags, num_tags] transition matrix. This is either provided by the caller or created in this function.

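Function walkthrough:

1. tf.contrib.crf.crf_log_likelihood

crf_log_likelihood(inputs, tag_indices, sequence_lengths, transition_params=None)

Computes the log-likelihood of tag sequences in a conditional random field.

Arguments:

inputs: a tensor of shape [batch_size, max_seq_len, num_tags]. Typically the output of a BiLSTM, projected and reshaped to this shape, is used as the input to the CRF layer.
tag_indices: a matrix of shape [batch_size, max_seq_len]; these are the gold (true) tags.
sequence_lengths: a vector of shape [batch_size] giving the true length of each sequence.
transition_params: a transition matrix of shape [num_tags, num_tags]. Optional; if it is not provided, it is created inside the function.

Returns:

log_likelihood: the log-likelihood, one value per sequence in the batch (shape [batch_size]; older documentation describes it as a scalar).
transition_params: the transition matrix of shape [num_tags, num_tags].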
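
To make the returned quantity concrete, here is a minimal NumPy sketch for a single unpadded sequence. It is not the TensorFlow implementation. The function names (`logsumexp`, `sequence_score`, `log_partition`, `crf_log_likelihood_np`) are illustrative, and it assumes a path score made of unary terms plus transition terms, with no separate start or end transitions. The normalizer is computed with the forward algorithm and then checked against brute-force enumeration.

```python
import itertools
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sequence_score(unary, tags, transitions):
    """Unnormalized score of one tag path: unary terms plus transition terms."""
    score = unary[np.arange(len(tags)), tags].sum()
    score += transitions[tags[:-1], tags[1:]].sum()
    return score


def log_partition(unary, transitions):
    """Forward algorithm: log of the summed exp-scores over all tag paths."""
    alpha = unary[0]  # shape [num_tags]
    for t in range(1, unary.shape[0]):
        # alpha[i] + transitions[i, j], reduced over the previous tag i
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + unary[t]
    return logsumexp(alpha, axis=0)


def crf_log_likelihood_np(unary, tags, transitions):
    """Log-likelihood of one unpadded sequence (illustrative helper)."""
    return sequence_score(unary, tags, transitions) - log_partition(unary, transitions)


rng = np.random.RandomState(0)
seq_len, num_tags = 4, 3
unary = rng.randn(seq_len, num_tags)
transitions = rng.randn(num_tags, num_tags)
gold = np.array([0, 2, 1, 1])

ll = crf_log_likelihood_np(unary, gold, transitions)

# Brute-force check: enumerate all num_tags ** seq_len paths.
all_scores = np.array([
    sequence_score(unary, np.array(p), transitions)
    for p in itertools.product(range(num_tags), repeat=seq_len)
])
brute_ll = sequence_score(unary, gold, transitions) - logsumexp(all_scores, axis=0)
print(ll, brute_ll)
```

The two printed values should agree up to floating-point error. The forward recursion costs O(seq_len * num_tags^2), whereas enumeration grows exponentially with the sequence length.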

 

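2. tf.contrib.crf.viterbi_decode

viterbi_decode(score, transition_params)
Put simply, it returns the best tag sequence. It works on NumPy arrays for a single sequence, so decoding happens outside the TensorFlow graph, and it should only be used at test time.

Arguments:

score: a matrix of unary potentials with shape [seq_len, num_tags].
transition_params: a transition matrix of shape [num_tags, num_tags].

Returns:

viterbi: a list of length seq_len holding the indices of the highest-scoring tags.
viterbi_score: a float containing the score of the Viterbi sequence.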
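
The decoding step can be sketched in plain NumPy as follows. This is an illustrative reimplementation under the same scoring assumption (unary plus transition terms), not the library source. `viterbi_decode_np` and `path_score` are names made up for this example. The result is compared against a brute-force search over all paths.

```python
import itertools
import numpy as np


def viterbi_decode_np(score, transition_params):
    """Return the highest-scoring tag path and its score for one sequence."""
    seq_len, num_tags = score.shape
    trellis = np.zeros_like(score)
    backpointers = np.zeros((seq_len, num_tags), dtype=np.int32)
    trellis[0] = score[0]
    for t in range(1, seq_len):
        # candidates[i, j]: best score ending in tag i at t-1, then moving to j
        candidates = trellis[t - 1][:, None] + transition_params
        backpointers[t] = np.argmax(candidates, axis=0)
        trellis[t] = score[t] + np.max(candidates, axis=0)
    # Walk the backpointers from the best final tag.
    path = [int(np.argmax(trellis[-1]))]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    path.reverse()
    return path, float(np.max(trellis[-1]))


def path_score(path, score, transition_params):
    """Score of an explicit tag path, used only for the brute-force check."""
    total = sum(score[t, tag] for t, tag in enumerate(path))
    total += sum(transition_params[a, b] for a, b in zip(path[:-1], path[1:]))
    return total


rng = np.random.RandomState(1)
score = rng.randn(5, 3)
transition_params = rng.randn(3, 3)

best_path, best_score = viterbi_decode_np(score, transition_params)
brute_path = max(
    itertools.product(range(3), repeat=5),
    key=lambda p: path_score(p, score, transition_params),
)
print(best_path, list(brute_path))
```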

 

3. tf.contrib.crf.crf_decode

crf_decode(potentials, transition_params, sequence_length)
Decodes inside the TensorFlow graph.

Arguments:

potentials: a tensor of shape [batch_size, max_seq_len, num_tags].
transition_params: a transition matrix of shape [num_tags, num_tags].
sequence_length: a vector of shape [batch_size] giving the length of each sequence in the batch.

Returns:

decode_tags: a tensor of shape [batch_size, max_seq_len] and type tf.int32 containing the best tag sequence.
best_score: a tensor of shape [batch_size] containing the score of each decoded sequence.
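
The difference from viterbi_decode is that the input is a padded batch plus a length vector. A rough NumPy sketch of that interface is shown below. `crf_decode_np` is a hypothetical helper, not the TensorFlow op. Filling the positions past each sequence length with 0 is a choice made for this sketch, not a statement about what tf.contrib.crf.crf_decode returns there.

```python
import numpy as np


def crf_decode_np(potentials, transition_params, sequence_length):
    """Batched Viterbi over padded inputs; positions past each length stay 0."""
    batch_size, max_seq_len, num_tags = potentials.shape
    decode_tags = np.zeros((batch_size, max_seq_len), dtype=np.int32)
    best_score = np.zeros(batch_size, dtype=potentials.dtype)
    for b in range(batch_size):
        n = int(sequence_length[b])
        if n == 0:
            continue
        score = potentials[b, :n]  # drop the padded time steps
        trellis = score[0]
        backpointers = []  # backpointers[k] maps a tag at step k+1 to step k
        for t in range(1, n):
            candidates = trellis[:, None] + transition_params
            backpointers.append(np.argmax(candidates, axis=0))
            trellis = score[t] + np.max(candidates, axis=0)
        tag = int(np.argmax(trellis))
        best_score[b] = trellis[tag]
        decode_tags[b, n - 1] = tag
        for t in range(n - 2, -1, -1):
            tag = int(backpointers[t][tag])
            decode_tags[b, t] = tag
    return decode_tags, best_score


rng = np.random.RandomState(2)
potentials = rng.randn(2, 4, 3)
transition_params = rng.randn(3, 3)
sequence_length = np.array([4, 2])

decode_tags, best_score = crf_decode_np(potentials, transition_params, sequence_length)
print(decode_tags)
print(best_score)
```

Because the padded time steps are sliced away before decoding, changing the potentials in the padding does not change the result for that sequence.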

 

Reposted from Zhihu:

If what you need to predict is a sequence, you can use crf_log_likelihood as the loss function.

crf_log_likelihood(
inputs,
tag_indices,
sequence_lengths,
transition_params=None
)

 

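Inputs:

inputs: the unary potentials, i.e. the predicted score of each tag at each position (unnormalized scores rather than probabilities). How they are computed depends on the task; a CNN, an RNN, and so on all work.

tag_indices: the gold tag sequence.

sequence_lengths: the true length of each sample. Sequences are padded to a common length, and the real lengths are passed in this argument so that the padding is ignored.

transition_params: the transition scores. Optional; if omitted, the function creates the matrix itself.

Outputs:

log_likelihood: the log-likelihood of the gold tag sequences.

transition_params: the transition scores; if none were passed in, the function creates them and returns them.

Author: Zhihu user
Link: https://www.zhihu.com/question/57666556/answer/326803900
Source: Zhihu
Copyright belongs to the author. For commercial reposting, contact the author for permission; for non-commercial reposting, credit the source.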
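
The padding step mentioned in the quoted answer can be sketched like this. `pad_batch` is a hypothetical helper written for this note, and padding with tag index 0 is an assumption; any valid index works as long as the true lengths are passed as sequence_lengths.

```python
import numpy as np


def pad_batch(tag_sequences, pad_value=0):
    """Pad variable-length tag lists to a [batch_size, max_seq_len] matrix."""
    sequence_lengths = np.array([len(s) for s in tag_sequences], dtype=np.int32)
    max_seq_len = int(sequence_lengths.max())
    tag_indices = np.full((len(tag_sequences), max_seq_len), pad_value, dtype=np.int32)
    for row, seq in enumerate(tag_sequences):
        tag_indices[row, :len(seq)] = seq
    return tag_indices, sequence_lengths


tag_indices, sequence_lengths = pad_batch([[1, 2, 0], [3], [4, 4]])
print(tag_indices)       # padded matrix of shape [3, 3]
print(sequence_lengths)  # true lengths: 3, 1, 2
```

The resulting `tag_indices` and `sequence_lengths` have the shapes that crf_log_likelihood expects for its second and third arguments.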

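Official example code showing how to use the CRF functions (it targets TensorFlow 1.x; tf.contrib was removed in TensorFlow 2.0):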

#!/usr/bin/env python
# coding=utf8
import numpy as np
import tensorflow as tf


#data settings
num_examples = 10
num_words = 20
num_features = 100 
num_tags = 5 

# Random features. x has shape [10, 20, 100].
x = np.random.rand(num_examples, num_words, num_features).astype(np.float32)

# Random tag indices representing the gold sequence. y has shape [10, 20].
y = np.random.randint(num_tags, size=[num_examples, num_words]).astype(np.int32)

# True sequence lengths. Every sequence has length 19, i.e.
# sequence_lengths = [19, 19, 19, 19, 19, 19, 19, 19, 19, 19]
sequence_lengths = np.full(num_examples, num_words - 1, dtype=np.int32)


#Train and evaluate the model.
with tf.Graph().as_default():
    with tf.Session() as session:
         # Add the data to the TensorFlow graph.
         x_t = tf.constant(x)  # observation sequences
         y_t = tf.constant(y)  # tag sequences
         sequence_lengths_t = tf.constant(sequence_lengths)
           
         # Compute unary scores from a linear layer.
         # weights shape = [100,5]
         weights = tf.get_variable("weights", [num_features, num_tags])
   
         # matricized_x_t shape = [200, 100]
         matricized_x_t = tf.reshape(x_t, [-1, num_features])

         # [200, 100] x [100, 5] -> [200, 5]
         matricized_unary_scores = tf.matmul(matricized_x_t, weights)

         # unary_scores shape = [10, 20, 5]
         unary_scores = tf.reshape(matricized_unary_scores, [num_examples, num_words, num_tags])

         # Compute the log-likelihood of the gold sequences and keep the transition
         # params for inference at test time.
         # Argument shapes: unary_scores [10, 20, 5], y_t [10, 20], sequence_lengths_t [10].
         log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(unary_scores, y_t, sequence_lengths_t)

         # Decode inside the graph; viterbi_sequence has shape [10, 20].
         viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(unary_scores, transition_params, sequence_lengths_t)
         # add a training op to tune the parameters.
         loss = tf.reduce_mean(-log_likelihood)
   
         # Gradient descent optimizer with learning rate 0.01.
         train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
           
         #train for a fixed number of iterations.
         session.run(tf.global_variables_initializer())
     
         # Build a boolean mask that is True for real tokens and False for padding.
         # Example of the elementwise comparison used below:
         #   m = array([[ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])
         #   n = array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
         #   n < m is True at every position.
         # Here np.arange(num_words) is expanded to shape [1, 20] and sequence_lengths
         # to shape [10, 1], so the mask has shape [10, 20]. With every length equal
         # to 19, each row is True for positions 0..18 and False for position 19.
         mask = (np.expand_dims(np.arange(num_words), axis=0) < np.expand_dims(sequence_lengths, axis=1))

         # Total number of real (unpadded) labels: 10 * 19 = 190.
         total_labels = np.sum(sequence_lengths)
         
         print ("mask:",mask)

         print ("total_labels:",total_labels)
         for i in range(1000):
             # tf_unary_scores, tf_transition_params, _ = session.run([unary_scores, transition_params, train_op])
             tf_viterbi_sequence, _ = session.run([viterbi_sequence, train_op])
             if i % 100 == 0:
                 # Multiplying booleans elementwise acts as a logical AND:
                 # False*False = False, False*True = False, True*True = True.
                 # Number of correctly predicted labels among the real tokens.
                 correct_labels = np.sum((y == tf_viterbi_sequence) * mask)
                 accuracy = 100.0 * correct_labels / float(total_labels)
                 print("Accuracy: %.2f%%" % accuracy)
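
The masked accuracy computation at the end of the example can be tried in isolation with a few hand-written arrays. The values below are made up for illustration; only the masking logic mirrors the example above.

```python
import numpy as np

sequence_lengths = np.array([3, 1, 2])
max_seq_len = 3

# mask[b, t] is True when position t is a real token of sequence b.
mask = np.arange(max_seq_len)[None, :] < sequence_lengths[:, None]

y_true = np.array([[1, 2, 0], [3, 0, 0], [4, 4, 0]])
y_pred = np.array([[1, 2, 1], [3, 0, 0], [0, 4, 0]])

# Count matches only where the mask is True, so padding never counts as correct.
correct_labels = np.sum((y_true == y_pred) & mask)
total_labels = np.sum(sequence_lengths)
accuracy = 100.0 * correct_labels / total_labels
print("Accuracy: %.2f%%" % accuracy)  # prints Accuracy: 66.67%
```

Without the mask, the padded zeros in rows 2 and 3 would be counted as correct predictions and inflate the score.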

 

