Introduction

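It has been a while, and my question about Softmax is still unresolved. In classification, for example, everyone seems to use SoftmaxOutput as the loss op and pass it two arguments (i.e. data and label). By rights it should output the loss value; that is, as a loss it should output a scalar (when batch size = 1), and I often thought this would mean some extra work at predict time to recover the original predictions. But it turns out that when the metric is computed at the end, the output is compared directly against the label, as in the following code: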

# python/mxnet/module/module.py     update_metric()
# calls into --->
# python/build/lib.linux-x86_64-2.7/mxnet/module/executor_group.py
    def update_metric(self, eval_metric, labels):
        for texec, islice in zip(self.execs, self.slices):
            labels_slice = []
            for label, axis in zip(labels, self.label_layouts):
                if axis == 0:
                    # slicing NDArray along axis 0 can avoid copying
                    labels_slice.append(label[islice])
                elif axis > 0:
                    # pylint: disable=no-member
                    label_my_slice = nd.slice_axis(label, axis=axis, begin=islice.start,
                                                   end=islice.stop).as_in_context(label.context)
                    # pylint: enable=no-member
                    labels_slice.append(label_my_slice)
                else:
                    labels_slice.append(label)

            eval_metric.update(labels_slice, texec.outputs)

# eval_metric comes from python/mxnet/metric.py

class CustomMetric(EvalMetric):   # picked arbitrarily as a reference
    ...
    def update(self, labels, preds):
        if not self._allow_extra_outputs:
            check_label_shapes(labels, preds)

        for pred, label in zip(preds, labels):
            label = label.asnumpy()
            pred = pred.asnumpy()

            if pred.shape[1] == 2:
                pred = pred[:, 1]

            reval = self._feval(label, pred)
            if isinstance(reval, tuple):
                (sum_metric, num_inst) = reval
                self.sum_metric += sum_metric
                self.num_inst += num_inst
            else:
                self.sum_metric += reval
                self.num_inst += 1
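
To make that comparison concrete, here is a minimal pure-Python sketch of what a metric's update step does with its two arguments, assuming preds holds one row of per-class scores per sample and labels holds class indices. The function name `accuracy_update` is hypothetical and is not part of mxnet.metric; it only mirrors the sum_metric / num_inst bookkeeping shown above. Such a comparison only makes sense if the op's outputs are predictions rather than a loss value.

```python
def accuracy_update(labels, preds, sum_metric=0, num_inst=0):
    """Hypothetical helper: count correct predictions the way a metric's update() would."""
    for label, scores in zip(labels, preds):
        # The predicted class is the index of the largest score.
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        sum_metric += int(predicted == label)
        num_inst += 1
    return sum_metric, num_inst


# Made-up scores for three samples and three classes.
preds = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
labels = [1, 0, 1]
correct, total = accuracy_update(labels, preds)
print(correct, total)  # 2 correct out of 3
```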

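Recently I ran into LogisticRegressionOutput, which works the same way, so I finally decided to sort this out.
One advantage of tackling the question now is that I had already learned a bit from MakeLoss and later discovered what BlockGrad is for.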

Assumption

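From those two ops one can roughly guess that there are two outputs:

the value of data is transformed, and the result is used as the output;
the distance between data and label is also computed in order to obtain the gradient. In other words, the actual scalar loss is hidden; all that is needed is the gradient it produces.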
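
The assumption can be written down as a small pure-Python sketch. The class name `LogisticOutputSketch` is hypothetical; it illustrates the guessed behaviour and is not MXNet's implementation. The gradient `sigmoid(data) - label` used here is the standard derivative of the cross-entropy loss with respect to data, taken as an assumption about what the op propagates.

```python
import math


class LogisticOutputSketch:
    """Hypothetical model of an '...Output' op: forward ignores the label."""

    def forward(self, data):
        # The output is just the transformed data (element-wise sigmoid).
        self.out = [1.0 / (1.0 + math.exp(-x)) for x in data]
        return self.out

    def backward(self, label):
        # Assumed gradient w.r.t. data: sigmoid(data) - label.
        # The scalar loss itself is never materialised.
        return [o - y for o, y in zip(self.out, label)]


op = LogisticOutputSketch()
out = op.forward([1.0, 1.0])
grad_a = op.backward([0.0, 0.0])
grad_b = op.backward([1.0, 1.0])
print(out)     # the prediction does not depend on the label
print(grad_a)  # the gradient does
print(grad_b)
```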

Experiments

A quick check:


import mxnet as mx
import numpy as np
m=mx.nd.ones((3,4))
n=mx.nd.zeros((3,4))

vm=mx.sym.Variable('m')
vn=mx.sym.Variable('n')


out=mx.nd.LogisticRegressionOutput(m,n)   # try the NDArray version while we are at it
out.asnumpy()
#array([[ 0.7310586,  0.7310586,  0.7310586,  0.7310586],
#       [ 0.7310586,  0.7310586,  0.7310586,  0.7310586],
#       [ 0.7310586,  0.7310586,  0.7310586,  0.7310586]], dtype=float32)
# each entry is exp(1) / (1 + exp(1))


out=mx.sym.LogisticRegressionOutput(data=vm,label=vn)
exec_ = out.bind(mx.cpu(),{'m':m,'n':n})
exec_.forward()
exec_.outputs[0].asnumpy() 
#array([[ 0.7310586,  0.7310586,  0.7310586,  0.7310586],
#       [ 0.7310586,  0.7310586,  0.7310586,  0.7310586],
#       [ 0.7310586,  0.7310586,  0.7310586,  0.7310586]], dtype=float32)



m=mx.nd.ones((3,4))+.3
n=mx.nd.zeros((3,4))+0.2
out=mx.sym.LogisticRegressionOutput(data=vm,label=vn)
exec_ = out.bind(mx.cpu(),{'m':m,'n':n})
exec_.forward()
exec_.outputs[0].asnumpy()
#array([[ 0.78583497,  0.78583497,  0.78583497,  0.78583497],
#       [ 0.78583497,  0.78583497,  0.78583497,  0.78583497],
#       [ 0.78583497,  0.78583497,  0.78583497,  0.78583497]], dtype=float32)
# each entry is exp(1.3) / (1 + exp(1.3))



##########################
#   Now look at SoftmaxOutput
##########################

m=mx.nd.ones((3,4))
n=mx.nd.zeros((3,4))+0.7        


out=mx.sym.SoftmaxOutput(data=vm,label=vn)
exec_ = out.bind(mx.cpu(),{'m':m,'n':n})
exec_.forward()
exec_.outputs[0].asnumpy()
#array([[ 0.25,  0.25,  0.25,  0.25],
#       [ 0.25,  0.25,  0.25,  0.25],
#       [ 0.25,  0.25,  0.25,  0.25]], dtype=float32)




m=mx.nd.ones((3,4))
n=mx.nd.zeros((3,4))+0.2                                       # change the label values
out=mx.sym.SoftmaxOutput(data=vm,label=vn)
exec_ = out.bind(mx.cpu(),{'m':m,'n':n})
exec_.forward()
exec_.outputs[0].asnumpy()   
#array([[ 0.25,  0.25,  0.25,  0.25],
#       [ 0.25,  0.25,  0.25,  0.25],
#       [ 0.25,  0.25,  0.25,  0.25]], dtype=float32)
# the output did not change



m=mx.nd.uniform(0,1,(3,4))                                    # random values
n=mx.nd.zeros((3,4))+0.2                                       
out=mx.sym.SoftmaxOutput(data=vm,label=vn)
exec_ = out.bind(mx.cpu(),{'m':m,'n':n})
exec_.forward()
exec_.outputs[0].asnumpy()   
#array([[ 0.24227935,  0.21061647,  0.38156345,  0.16554071],
#       [ 0.37370193,  0.1872514 ,  0.20918882,  0.22985782],
#       [ 0.28395435,  0.28981918,  0.21832471,  0.20790176]], dtype=float32)

That confirms it: the forward output is just the transformed data, and the label has no effect on it.
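
The printed values can be cross-checked without MXNet. The following is a minimal sketch in plain Python; `sigmoid` and `softmax` are helper functions written here for illustration, not MXNet calls.

```python
import math


def sigmoid(x):
    """Logistic function: exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    """Softmax over one row; the max is subtracted for numerical stability."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [v / total for v in exps]


print(sigmoid(1.0))                    # compare with the 0.7310586 entries
print(sigmoid(1.3))                    # compare with the 0.78583497 entries
print(softmax([1.0, 1.0, 1.0, 1.0]))   # four equal inputs give 0.25 each
```

Neither helper takes a label argument, yet they reproduce the outputs above, which is consistent with the label only mattering for the gradient.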
