In order to train our model, we need to define what it means for the model to be good. Well, actually, in machine learning we typically define what it means for a model to be bad. We call this the cost, or the loss, and it represents how far off our model is from our desired outcome. We try to minimize that error, and the smaller the error margin, the better our model is.
One very common, very nice function to determine the loss of a model is called "cross-entropy." Cross-entropy arises from thinking about information compressing codes in information theory, but it winds up being an important idea in lots of areas, from gambling to machine learning. It's defined as:

H_{y'}(y) = -\sum_i y'_i \log(y_i)
Where y is our predicted probability distribution, and y′ is the true distribution (the one-hot vector with the digit labels). In some rough sense, the cross-entropy is measuring how inefficient our predictions are for describing the truth. Going into more detail about cross-entropy is beyond the scope of this tutorial, but it's well worth understanding.
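As a quick numeric sketch (made-up values, not part of the tutorial's graph): if the true label is class 1 and the model gives that class probability 0.7, the cross-entropy is just -log(0.7).

import numpy as np

# Illustrative values only: true label is class 1 (one-hot),
# and the model assigns it probability 0.7.
y_true = np.array([0.0, 1.0, 0.0])   # y': the true distribution
y_pred = np.array([0.1, 0.7, 0.2])   # y : the predicted distribution
loss = -np.sum(y_true * np.log(y_pred))
print(loss)   # ~0.357 (= -log(0.7)); a perfect prediction would give 0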
To implement cross-entropy we need to first add a new placeholder to input the correct answers:
y_ = tf.placeholder(tf.float32, [None, 10])
Then we can implement the cross-entropy function, -\sum_i y'_i \log(y_i):
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
First, tf.log computes the logarithm of each element of y. Next, we multiply each element of y_ with the corresponding element of tf.log(y). Then tf.reduce_sum adds the elements in the second dimension of y, due to the reduction_indices=[1] parameter. Finally, tf.reduce_mean computes the mean over all the examples in the batch.
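To make the shapes explicit, here is a rough, equivalent sketch of the same expression broken into intermediate steps (assuming y and y_ have shape [None, 10] as above):

log_y = tf.log(y)                    # [batch, 10]: element-wise log of the predictions
weighted = y_ * log_y                # [batch, 10]: zero except at each example's true class
per_example = -tf.reduce_sum(weighted, reduction_indices=[1])   # [batch]: one loss per example
cross_entropy = tf.reduce_mean(per_example)                     # scalar: mean loss over the batch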
Note that in the source code, we don't use this formulation, because it is numerically unstable. Instead, we apply tf.nn.softmax_cross_entropy_with_logits on the unnormalized logits (e.g., we call softmax_cross_entropy_with_logits on tf.matmul(x, W) + b), because this more numerically stable function internally computes the softmax activation. In your code, consider using tf.nn.softmax_cross_entropy_with_logits instead.
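As a minimal sketch (assuming the x, W, and b variables defined earlier in the tutorial), the numerically stable version would look like:

logits = tf.matmul(x, W) + b   # unnormalized scores; note: no softmax applied here
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
y = tf.nn.softmax(logits)      # predicted probabilities can still be read out this way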
The gist is: if you compute the cross-entropy as cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])), then you need tf.clip_by_value to bound the values going into the log, because log can produce NaN (e.g., log(-3)). Clipping keeps NaNs from appearing, like so:
cross_entropy = -tf.reduce_sum(y_*tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))
But later someone came up with a better way to avoid the NaN:
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv + 1e-10))
See the discussion at http://stackoverflow.com/questions/33712178/tensorflow-nan-bug for details.
If you use tf.nn.softmax_cross_entropy_with_logits directly, you have none of these worries; it handles the problem automatically.
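A tiny self-contained illustration (made-up values, TensorFlow 1.x style like the rest of this page) of why the naive formulation breaks and how clipping avoids it:

import tensorflow as tf

sess = tf.Session()
labels = tf.constant([[0.0, 1.0]])
preds = tf.constant([[0.0, 1.0]])    # an over-confident prediction: one probability is exactly 0
naive = -tf.reduce_sum(labels * tf.log(preds))   # 0 * log(0) = 0 * -inf = NaN
clipped = -tf.reduce_sum(labels * tf.log(tf.clip_by_value(preds, 1e-10, 1.0)))
print(sess.run([naive, clipped]))    # prints something like [nan, -0.0]: only the clipped version stays finite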