Neural networks can model nonlinear problems thanks to the nonlinear expressive power of their activation functions, and their training rests on gradient-based optimization, which requires the network to be differentiable (almost) everywhere.
Although dropout often appears next to activation functions in library APIs, it is not an activation function: it is a regularization operation applied to activations. Several Python deep-learning libraries provide dropout functions, and their interfaces differ slightly.
Dropout is used to prevent or mitigate overfitting, and it is typically applied after fully connected layers. Some research shows it can also be used in convolutional layers (though it is not well suited to small kernels).
In PyTorch, the probability parameter p is the probability that an element is zeroed.
In TensorFlow, the probability parameter keep_prob is the probability that an element is kept, so keep_prob = 1 - p; the sketch below makes the two conventions concrete.
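A minimal sketch, assuming PyTorch is installed, showing both the zeroed fraction and the rescaling of survivors:

import torch
import torch.nn as nn

# PyTorch convention: p is the probability of ZEROING an element.
# TensorFlow 1.x convention: keep_prob is the probability of KEEPING one,
# so the two parameterizations are related by keep_prob = 1 - p.
p = 0.3
m = nn.Dropout(p=p)     # a freshly built module is in training mode
x = torch.ones(10000)
y = m(x)

print((y == 0).float().mean().item())   # fraction zeroed, close to p = 0.3
print(y[y != 0][0].item())              # survivors scaled to 1 / (1 - p)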
torch.nn.Dropout
Reference: the source of torch.nn.Dropout.
class torch.nn.Dropout(p: float = 0.5, inplace: bool = False)
# Input: (*). Input can be of any shape
# Output: (*). Output is of the same shape as input
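Since the input can have any shape and the operation is out-of-place by default, here is a small sketch (assuming PyTorch) of what the inplace flag changes:

import torch
import torch.nn as nn

x = torch.ones(5)
m = nn.Dropout(p=0.5)            # inplace=False (default): x is untouched
y = m(x)
print(x)                         # still all ones

m_ip = nn.Dropout(p=0.5, inplace=True)
x = torch.ones(5)
y = m_ip(x)                      # zeroing and rescaling happen inside x itself
print(x)                         # x now holds zeros and 2.0 entries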
torch.nn.functional.dropout
Reference: the source of torch.nn.functional.dropout.
torch.nn.functional.dropout(input, p=0.5, training=True, inplace=False)
# During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.
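Unlike the module form, the functional form takes an explicit training flag; inside a custom nn.Module one would normally pass self.training, exactly as the Dropout module's forward does below. A minimal sketch:

import torch
import torch.nn.functional as F

x = torch.ones(8)

y_train = F.dropout(x, p=0.5, training=True)   # zeros some entries, scales the rest by 2
y_eval = F.dropout(x, p=0.5, training=False)   # identity: input passes through unchanged

print(y_train)   # mixture of 0.0 and 2.0
print(y_eval)    # all ones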
torch.nn.modules.dropout
The parameter p is the probability that an element is set to zero.
Reference: the source code for torch.nn.modules.dropout.
class Dropout(_DropoutNd):
    r"""During training, randomly zeroes some of the elements of the input tensor
    with probability :attr:`p` using samples from a Bernoulli distribution.
    Each channel will be zeroed out independently on every forward call.

    This has proven to be an effective technique for regularization and
    preventing the co-adaptation of neurons as described in the paper
    `Improving neural networks by preventing co-adaptation of feature detectors`_ .

    Furthermore, the outputs are scaled by a factor of :math:`\frac{1}{1-p}` during
    training. This means that during evaluation the module simply computes an
    identity function.

    Args:
        p: probability of an element to be zeroed. Default: 0.5
        inplace: If set to ``True``, will do this operation in-place. Default: ``False``

    Shape:
        - Input: :math:`(*)`. Input can be of any shape
        - Output: :math:`(*)`. Output is of the same shape as input

    Examples::

        >>> m = nn.Dropout(p=0.2)
        >>> input = torch.randn(20, 16)
        >>> output = m(input)

    .. _Improving neural networks by preventing co-adaptation of feature detectors:
        https://arxiv.org/abs/1207.0580
    """

    def forward(self, input: Tensor) -> Tensor:
        return F.dropout(input, self.p, self.training, self.inplace)
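Because forward delegates to F.dropout with self.training, calling model.train() or model.eval() on the parent module switches dropout on and off automatically. A minimal sketch (the two-layer network here is hypothetical, chosen only to show dropout between fully connected layers):

import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(32, 10),
)
x = torch.randn(4, 16)

net.train()                        # self.training = True everywhere: dropout active
out_a, out_b = net(x), net(x)
print(torch.equal(out_a, out_b))   # False (almost surely): random masks differ

net.eval()                         # self.training = False: dropout is the identity
out_c, out_d = net(x), net(x)
print(torch.equal(out_c, out_d))   # True: eval-mode forward is deterministic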
tf.nn.dropout
The dropout function decides with probability keep_prob whether each neuron is suppressed. If a neuron is suppressed, its output is 0; otherwise its output is scaled up to 1/keep_prob times its input.
The following covers how dropout is used in TensorFlow.
def dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
Here keep_prob is the keep probability, i.e. the fraction of the activations to retain. It is usually defined as a placeholder and fed in at run time; when keep_prob = 1, 100% of the elements are kept and dropout has no effect.
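A minimal sketch of this placeholder pattern, assuming TensorFlow 1.x (in TF 2.x, tf.nn.dropout takes a rate argument equal to 1 - keep_prob, and placeholders are gone):

import tensorflow as tf  # assumes TensorFlow 1.x

x = tf.constant([[1.0, 2.0, 3.0, 4.0]])
keep_prob = tf.placeholder(tf.float32)   # fed in at run time
y = tf.nn.dropout(x, keep_prob)

with tf.Session() as sess:
    # Training: keep each element with probability 0.5, survivors scaled by 2.
    print(sess.run(y, feed_dict={keep_prob: 0.5}))
    # Inference: keep_prob = 1 keeps everything, dropout has no effect.
    print(sess.run(y, feed_dict={keep_prob: 1.0}))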
In the function above, x is a floating-point tensor and keep_prob is a floating-point scalar in the range (0, 1] giving the probability that each element of x is kept. noise_shape is a 1-D int32 tensor representing the shape of the randomly generated keep/drop flags, and it must be broadcastable to the shape of x. Suppose x has shape [k, l, m, n]:
- If noise_shape is [k, l, m, n], every element of x is kept or dropped independently.
- If noise_shape is [k, 1, 1, n], elements are kept or dropped independently along dimensions 0 and 3, while along dimensions 1 and 2 each slice is kept or dropped as a whole.
The sketch below illustrates these broadcast semantics.
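To make the broadcast behavior concrete, here is a framework-free NumPy sketch of the keep/drop-flag mechanism; the function name is hypothetical and this is an illustration of the idea, not TensorFlow's actual implementation:

import numpy as np

def dropout_with_noise_shape(x, keep_prob, noise_shape, seed=None):
    """Illustrative (hypothetical) re-implementation of the noise_shape idea."""
    rng = np.random.default_rng(seed)
    # One Bernoulli(keep_prob) flag per entry of noise_shape ...
    flags = rng.uniform(size=noise_shape) < keep_prob
    # ... broadcast against x: size-1 dimensions share a single flag, so
    # those slices are kept or dropped together.
    return np.where(flags, x / keep_prob, 0.0)

x = np.ones((2, 3, 3, 4))                                  # shape [k, l, m, n]

y_full = dropout_with_noise_shape(x, 0.5, (2, 3, 3, 4))    # fully independent flags
y_bcast = dropout_with_noise_shape(x, 0.5, (2, 1, 1, 4))   # flags tied along dims 1, 2

print(y_bcast[0, :, :, 0])   # the whole [l, m] slice is all 0.0 or all 2.0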