Basic operations

add/minus/multiply/divide
matmul
pow
sqrt/rsqrt
round

Basic arithmetic

Use the operators + - * / (recommended).
The equivalent functions torch.add, torch.sub, torch.mul and torch.div also work.

In[3]: a = torch.rand(3,4)
In[4]: b = torch.rand(4)		# broadcast against every row of a
In[5]: a+b
Out[5]: 
tensor([[0.9463, 1.3325, 1.0427, 1.3508],
        [1.8552, 0.5614, 0.8546, 1.2186],
        [1.4794, 1.3745, 0.7024, 1.1688]])
In[6]: torch.add(a,b)
Out[6]: 
tensor([[0.9463, 1.3325, 1.0427, 1.3508],
        [1.8552, 0.5614, 0.8546, 1.2186],
        [1.4794, 1.3745, 0.7024, 1.1688]])
In[8]: torch.all(torch.eq((a-b),torch.sub(a,b)))	
Out[8]: tensor(1, dtype=torch.uint8)
In[9]: torch.all(torch.eq((a*b),torch.mul(a,b)))
Out[9]: tensor(1, dtype=torch.uint8)
In[10]: torch.all(torch.eq((a/b),torch.div(a,b)))
Out[10]: tensor(1, dtype=torch.uint8)

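torch.eq() compares the two tensors element by element; torch.all() then checks that every element of the result is nonzero (true).

In other words, torch.all() returns 0 as soon as any element is 0: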

In[21]: torch.all(torch.ByteTensor([1,1,1,1]))
Out[21]: tensor(1, dtype=torch.uint8)
In[22]: torch.all(torch.ByteTensor([1,1,1,0]))
Out[22]: tensor(0, dtype=torch.uint8)
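The same check can be written a little more compactly. The sketch below assumes a reasonably recent PyTorch release, where comparison results are bool tensors rather than uint8; the variable names are only illustrative.

```python
import torch

a = torch.rand(3, 4)
b = torch.rand(4)  # shape (4,) is broadcast across the 3 rows of a

# torch.eq gives one comparison result per element;
# torch.all reduces them to a single value.
elementwise = torch.eq(a - b, torch.sub(a, b))
all_equal = bool(torch.all(elementwise))

# torch.equal does both steps at once and returns a plain Python bool.
same_add = torch.equal(a + b, torch.add(a, b))
same_mul = torch.equal(a * b, torch.mul(a, b))

print(elementwise.shape, all_equal, same_add, same_mul)
```

torch.equal also requires the two tensors to have the same shape, so it is a slightly stricter test than torch.all(torch.eq(...)), which would broadcast first.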

matmul

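matmul stands for matrix multiplication.
* is element-wise multiplication.
torch.mm(a,b) only handles 2D tensors (not recommended).
torch.matmul(a,b) also handles higher-dimensional tensors; the product is still taken over the rows and columns of the last two dimensions (recommended).
@ is the operator form of matmul.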

In[24]: a = 3*torch.ones(2,2)
In[25]: a
Out[25]: 
tensor([[3., 3.],
        [3., 3.]])
In[26]: b = torch.ones(2,2)
In[27]: torch.mm(a,b)
Out[27]: 
tensor([[6., 6.],
        [6., 6.]])
In[28]: torch.matmul(a,b)
Out[28]: 
tensor([[6., 6.],
        [6., 6.]])
In[29]: a@b
Out[29]: 
tensor([[6., 6.],
        [6., 6.]])
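Because the matrices above are constant, the difference between * and @ is easy to miss. A small sketch with distinct entries (values chosen here purely for illustration) makes it visible:

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

elementwise = a * b   # multiplies matching entries: a[i, j] * b[i, j]
matrix = a @ b        # rows of a times columns of b

print(elementwise)    # tensor([[ 5., 12.], [21., 32.]])
print(matrix)         # tensor([[19., 22.], [43., 50.]])
```

For 2D inputs, torch.mm, torch.matmul and @ all give the same result.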

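Example

Computing a linear layer: x @ w.t() + b

x is a batch of 4 images that have already been flattened: (4, 784).
We want to map (4, 784) -> (4, 512).
For that, w would need to be (784, 512).
However, PyTorch's convention is to store the weight with the output dimension (channel-out) first and the input dimension (channel-in) second, i.e. (512, 784), so a transpose is needed.

Note: .t() only works on tensors with at most 2 dimensions; use transpose for higher dimensions.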

In[31]: x = torch.rand(4,784)
In[32]: w = torch.rand(512,784)
In[33]: (x@w.t()).shape
Out[33]: torch.Size([4, 512])
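The transcript leaves out the bias term. A minimal sketch of the full expression, assuming a hypothetical bias b with one entry per output feature, compared against torch.nn.functional.linear, which expects the weight in the same (out, in) layout:

```python
import torch
import torch.nn.functional as F

x = torch.rand(4, 784)    # 4 flattened images
w = torch.rand(512, 784)  # (channel-out, channel-in)
b = torch.rand(512)       # hypothetical bias, broadcast over the 4 rows

manual = x @ w.t() + b
builtin = F.linear(x, w, b)

# Loose tolerances: each output sums 784 float32 products.
same = torch.allclose(manual, builtin, rtol=1e-4, atol=1e-4)
print(manual.shape, same)
```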

Neural network -> matrix operations -> tensor flow

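matmul on tensors with more than 2 dimensions

torch.mm(a,b) does not work for matrix multiplication on tensors with more than 2 dimensions.
Rule: only the last two dimensions take part in the matrix multiplication.
For a [b, c, h, w] tensor, b and c are left unchanged and only the image dimensions h and w change; the product is computed for every (b, c) pair in parallel, so this is a batched multiplication of many matrices.
If the leading dimensions differ but are broadcastable, they are broadcast first and the matrices are then multiplied.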

In[3]: a = torch.rand(4,3,28,64)
In[4]: b = torch.rand(4,3,64,32)
In[5]: torch.mm(a,b).shape
RuntimeError: matrices expected, got 4D, 4D tensors at ..\aten\src\TH/generic/THTensorMath.cpp:956
In[6]: torch.matmul(a,b).shape
Out[6]: torch.Size([4, 3, 28, 32])
In[7]: b = torch.rand(4,1,64,32)	
In[8]: torch.matmul(a,b).shape	# b is broadcast along dim 1
Out[8]: torch.Size([4, 3, 28, 32])
In[9]: b = torch.rand(4,64,32)
In[10]: torch.matmul(a,b).shape
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
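The last call fails because the leading dimension of b, (4,), is aligned from the right against (4, 3), so 4 is matched against 3. One possible way around it, sketched here under the assumption that b is meant to be shared across the channel dimension, is to insert a size-1 dimension first:

```python
import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 64, 32)

# unsqueeze(1) turns b into (4, 1, 64, 32), which broadcasts over dim 1 of a.
out = torch.matmul(a, b.unsqueeze(1))

# Each (batch, channel) slice is an ordinary 2D matrix product.
slice_ok = torch.allclose(out[2, 1], torch.mm(a[2, 1], b[2]), rtol=1e-4, atol=1e-4)
print(out.shape, slice_ok)
```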

power

pow(a, n): a raised to the power n
** also means exponentiation (the exponent can be 2, 0.5, 0.25, 3, ...) (recommended)
sqrt(): square root
rsqrt(): reciprocal of the square root

In[11]: a = torch.full([2,2],3.)	# float fill value, so the result is a float tensor
In[12]: a.pow(2)
Out[12]: 
tensor([[9., 9.],
        [9., 9.]])
In[13]: a**2
Out[13]: 
tensor([[9., 9.],
        [9., 9.]])
In[14]: aa = a**2
In[15]: aa.sqrt()
Out[15]: 
tensor([[3., 3.],
        [3., 3.]])
In[16]: aa.rsqrt()
Out[16]: 
tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])
In[17]: aa**(0.5)
Out[17]: 
tensor([[3., 3.],
        [3., 3.]])

Exp log

exp(n): e raised to the power n
log(a): the natural logarithm ln(a)
log2() and log10(): logarithms to base 2 and base 10

In[18]: a = torch.exp(torch.ones(2,2))
In[19]: a
Out[19]: 
tensor([[2.7183, 2.7183],
        [2.7183, 2.7183]])
In[20]: torch.log(a)
Out[20]: 
tensor([[1., 1.],
        [1., 1.]])
In[22]: torch.log2(a)
Out[22]: 
tensor([[1.4427, 1.4427],
        [1.4427, 1.4427]])
In[23]: torch.log10(a)
Out[23]: 
tensor([[0.4343, 0.4343],
        [0.4343, 0.4343]])
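The log2 and log10 values above follow from the change-of-base rule, log_k(a) = ln(a) / ln(k). A quick sketch that checks this numerically:

```python
import math
import torch

a = torch.exp(torch.ones(2, 2))  # every entry is e

# Change of base: divide the natural log by ln(2) or ln(10).
log2_via_ln = torch.log(a) / math.log(2)
log10_via_ln = torch.log(a) / math.log(10)

print(torch.allclose(log2_via_ln, torch.log2(a)))
print(torch.allclose(log10_via_ln, torch.log10(a)))
```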

Approximation

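Approximation, part 1

floor, ceil: round down, round up
round: round to the nearest integer (PyTorch rounds exact halves to the nearest even integer)
trunc, frac: keep only the integer part, keep only the fractional part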

In[24]: a = torch.tensor(3.14)
In[25]: a.floor(),a.ceil(),a.trunc(),a.frac()
Out[25]: (tensor(3.), tensor(4.), tensor(3.), tensor(0.1400))
In[26]: a = torch.tensor(3.499)
In[27]: a.round()
Out[27]: tensor(3.)
In[28]: a = torch.tensor(3.5)
In[29]: a.round()
Out[29]: tensor(4.)
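The positive examples above hide two details: floor and trunc differ for negative numbers, and round sends exact halves to the nearest even integer. A small sketch with values picked to show both:

```python
import torch

a = torch.tensor(-3.5)

# floor goes toward -inf, ceil toward +inf, trunc toward zero;
# frac keeps the sign of the input.
parts = (a.floor().item(), a.ceil().item(), a.trunc().item(), a.frac().item())
print(parts)   # (-4.0, -3.0, -3.0, -0.5)

# Exact halves are rounded to the nearest even integer.
halves = torch.tensor([0.5, 1.5, 2.5, 3.5]).round().tolist()
print(halves)  # [0.0, 2.0, 2.0, 4.0]
```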

clamp

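Approximation, part 2 (used more often)

gradient clipping
(min): values smaller than min are set to min
(min, max): values outside the interval are set to the nearest bound
Exploding gradients: as a rule of thumb, a gradient norm around 100 is already very large, while around 10 is normal; check it by printing w.grad.norm(2).
Restricting w itself is called weight clipping; restricting the gradient of the weights is called gradient clipping.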

In[30]: grad = torch.rand(2,3)*15
In[31]: grad.max()
Out[31]: tensor(10.6977)
In[32]: grad.clamp(10)		
Out[32]: 
tensor([[10.0000, 10.6977, 10.0000],
        [10.0000, 10.0000, 10.0000]])
In[33]: grad
Out[33]: 
tensor([[ 6.7738, 10.6977,  4.4314],
        [ 7.8088,  4.8236,  3.6213]])
In[34]: grad.clamp(0,10)
Out[34]: 
tensor([[ 6.7738, 10.0000,  4.4314],
        [ 7.8088,  4.8236,  3.6213]])
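In a training loop the same idea is applied to w.grad after backward(). The following is only a sketch on a toy loss (the tensor w and the factor 100 are made up for illustration); it shows clipping by value with clamp and clipping by norm with torch.nn.utils.clip_grad_norm_:

```python
import torch

w = torch.zeros(10, requires_grad=True)
loss = (100.0 * w).sum()  # toy loss: the gradient is 100 for every element
loss.backward()

norm_before = w.grad.norm(2).item()  # about 316, far above the "normal" range

# Clip by value: every gradient entry is forced into [-10, 10].
clipped_by_value = w.grad.clamp(-10, 10)

# Clip by norm, in place: the gradient is rescaled so its 2-norm is at most 10.
torch.nn.utils.clip_grad_norm_([w], max_norm=10.0)
norm_after = w.grad.norm(2).item()

print(norm_before, norm_after)
```

Clipping by norm keeps the direction of the gradient and only shrinks its length, whereas clamp changes each entry independently.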
