Building basic functions with numpy
Sigmoid with math.exp()
import math

def basic_sigmoid(x):
    s = 1 / (1 + math.exp(-x))
    return s
>>> basic_sigmoid(3)
0.9525741268224334
>>> x = [1, 2, 3]
>>> basic_sigmoid(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in basic_sigmoid
TypeError: bad operand type for unary -: 'list'
In practice, however, deep-learning data usually comes as matrices and vectors, while the functions in the math library expect a single real number. This is why numpy is used so widely in deep learning.
numpy basics
>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> print(np.exp(x))
[ 2.71828183  7.3890561  20.08553692]
>>> x = np.array([1, 2, 3])
>>> print(x + 3)
[4 5 6]
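Broadcasting goes beyond adding a scalar: arrays of different shapes are stretched along their size-1 axes to match. A small sketch (values chosen arbitrarily):

```python
import numpy as np

row = np.array([[10, 20, 30]])   # shape (1, 3)
col = np.array([[1], [2]])       # shape (2, 1)

# Each array is stretched along its size-1 axis, giving a (2, 3) result.
result = row + col
print(result)
# [[11 21 31]
#  [12 22 32]]
```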
Implementing sigmoid with numpy:
import numpy as np

def sigmoid(x):
    s = 1 / (1 + np.exp(-x))   # note the minus sign: sigmoid(x) = 1 / (1 + e^(-x))
    return s

x = np.array([1, 2, 3])
sigmoid(x)

array([ 0.73105858, 0.88079708, 0.95257413])
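Because np.exp broadcasts, the same sigmoid works unchanged on a scalar, a vector, or a matrix. A small sanity check (inputs chosen arbitrarily):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Scalar input: sigmoid(0) = 1 / (1 + 1) = 0.5
print(sigmoid(0))

# Matrix input: the function is applied element-wise, shape is preserved.
print(sigmoid(np.array([[0, 1], [2, 3]])).shape)  # (2, 2)
```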
Computing the gradient of the sigmoid function

The derivative has the convenient closed form sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)).
import numpy as np

def sigmoid_derivative(x):
    s = 1 / (1 + np.exp(-x))
    ds = s * (1 - s)
    return ds

x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))

sigmoid_derivative(x) = [ 0.19661193  0.10499359  0.04517666]
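The analytic formula ds = s * (1 - s) can be sanity-checked against a finite-difference approximation; a minimal sketch:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([1.0, 2.0, 3.0])
h = 1e-5

# Central finite difference: (f(x + h) - f(x - h)) / (2h)
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

# The gap between the two should be tiny (on the order of h^2).
print(np.max(np.abs(numeric - sigmoid_derivative(x))))
```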
Reshaping arrays
Two numpy operations used constantly in deep learning: shape and reshape.
X.shape returns the dimensions of a matrix or vector X;
X.reshape rearranges X into a new shape.
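A quick illustration of both (shapes chosen arbitrarily here):

```python
import numpy as np

X = np.arange(12).reshape((3, 4))  # a 3x4 matrix holding 0..11
print(X.shape)   # (3, 4)

# reshape keeps the same data in a different shape;
# -1 tells numpy to infer that dimension from the total size.
v = X.reshape((-1, 1))
print(v.shape)   # (12, 1)
```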
An image is usually stored as a three-dimensional array of shape (length, height, depth), but algorithms typically take their input as a single column vector, i.e. the array reshaped to (length*height*3, 1).
To reshape an array of shape (a, b, c) into (a*b, c):
v = v.reshape((v.shape[0]*v.shape[1],v.shape[2]))
Flattening a three-dimensional image array into a vector:
def image2vector(image):
    v = image.reshape((image.shape[0] * image.shape[1] * image.shape[2], 1))
    return v
# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y,3) where 3 represents the RGB values
image = np.array([[[ 0.67826139,  0.29380381],
                   [ 0.90714982,  0.52835647],
                   [ 0.4215251 ,  0.45017551]],
                  [[ 0.92814219,  0.96677647],
                   [ 0.85304703,  0.52351845],
                   [ 0.19981397,  0.27417313]],
                  [[ 0.60659855,  0.00533165],
                   [ 0.10820313,  0.49978937],
                   [ 0.34144279,  0.94630077]]])
print ("image2vector(image) = " + str(image2vector(image)))
image2vector(image) = [[ 0.67826139]
 [ 0.29380381]
 [ 0.90714982]
 [ 0.52835647]
 [ 0.4215251 ]
 [ 0.45017551]
 [ 0.92814219]
 [ 0.96677647]
 [ 0.85304703]
 [ 0.52351845]
 [ 0.19981397]
 [ 0.27417313]
 [ 0.60659855]
 [ 0.00533165]
 [ 0.10820313]
 [ 0.49978937]
 [ 0.34144279]
 [ 0.94630077]]
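The same flattening can be written more compactly with -1, letting numpy infer the total length. A sketch of an equivalent helper (image2vector_alt is a hypothetical name, not from the original):

```python
import numpy as np

def image2vector_alt(image):
    # -1 infers length * height * depth automatically
    return image.reshape((-1, 1))

image = np.random.rand(3, 3, 2)
print(image2vector_alt(image).shape)  # (18, 1)
```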
Normalizing rows
Normalization is a common data-preprocessing step in machine learning, and gradient descent tends to converge faster on normalized data. Row normalization, for example, divides each row of x by that row's L2 norm: x_normalized = x / ||x||, with the norm taken along axis 1.
import numpy as np

def normalizeRows(x):
    x_norm = np.linalg.norm(x, ord=2, axis=1, keepdims=True)
    x = x / x_norm
    return x

x = np.array([[0, 3, 4], [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))

normalizeRows(x) = [[ 0.          0.6         0.8       ]
 [ 0.13736056  0.82416338  0.54944226]]
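A quick way to verify the result: after normalization, every row should have L2 norm 1. A small sanity check (not part of the original code):

```python
import numpy as np

def normalizeRows(x):
    x_norm = np.linalg.norm(x, ord=2, axis=1, keepdims=True)
    return x / x_norm

x = np.array([[0.0, 3.0, 4.0], [1.0, 6.0, 4.0]])

# Each row of the normalized matrix should have unit length.
norms = np.linalg.norm(normalizeRows(x), axis=1)
print(norms)  # [1. 1.]
```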
Broadcasting and the softmax function

For each row of x, softmax exponentiates every entry and divides by the row's sum of exponentials: softmax(x)_i = e^(x_i) / sum_j e^(x_j).

import numpy as np

def softmax(x):
    x_exp = np.exp(x)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)
    s = x_exp / x_sum   # divide the exponentials, not x itself, by the row sums
    return s

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
print("softmax(x) = " + str(softmax(x)))

softmax(x) = [[  9.80897665e-01   8.94462891e-04   1.79657674e-02   1.21052389e-04   1.21052389e-04]
 [  8.78679856e-01   1.18916387e-01   8.01252314e-04   8.01252314e-04   8.01252314e-04]]
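One practical caveat: np.exp overflows for large inputs. A common remedy, shown here as a sketch rather than part of the original code, is to subtract each row's maximum before exponentiating; this leaves the result unchanged because softmax is shift-invariant:

```python
import numpy as np

def softmax_stable(x):
    # Subtracting the row max does not change the output:
    # e^(x - m) / sum e^(x - m) == e^x / sum e^x
    shifted = x - np.max(x, axis=1, keepdims=True)
    x_exp = np.exp(shifted)
    return x_exp / np.sum(x_exp, axis=1, keepdims=True)

# np.exp(1000) would overflow to inf; the shifted version stays finite.
x = np.array([[1000.0, 1001.0], [0.0, 0.0]])
print(softmax_stable(x))  # no overflow; each row sums to 1
```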
Vectorization
In deep learning, vectorization is an effective way to speed up computation.
Dot product of 1-D vectors
The inner product of two 1-D vectors is a single number.
Non-vectorized dot product:
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i] * x2[i]
toc = time.process_time()
print("dot = " + str(dot) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
Vectorized dot product:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
dot = np.dot(x1, x2)
toc = time.process_time()
print("dot = " + str(dot) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
The outer product (np.outer)
① Multi-dimensional inputs are first flattened into 1-D vectors.
② Each element of the first argument acts as a multiplier, scaling the entire second vector.
③ The first argument determines the rows of the result, the second determines the columns.
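A tiny example of these rules (values chosen arbitrarily), including the flattening of a 2-D second argument:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[10, 20], [30, 40]])  # 2-D input is flattened to [10, 20, 30, 40]

# Result has len(a) rows and b.size columns; entry (i, j) = a[i] * b_flat[j].
print(np.outer(a, b))
# [[ 10  20  30  40]
#  [ 20  40  60  80]
#  [ 30  60  90 120]]
```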
Non-vectorized outer product:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
outer = np.zeros((len(x1), len(x2)))
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i, j] = x1[i] * x2[j]
toc = time.process_time()
print("outer = " + str(outer) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
Vectorized outer product:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
outer = np.outer(x1, x2)
toc = time.process_time()
print("outer = " + str(outer) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
Element-wise multiplication
Non-vectorized element-wise multiplication:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
mul = np.zeros((len(x1)))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
toc = time.process_time()
print("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
Vectorized element-wise multiplication:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

tic = time.process_time()
mul = np.multiply(x1, x2)
toc = time.process_time()
print("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
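On numpy arrays, the * operator performs the same element-wise product as np.multiply, so the vectorized version is often written simply as x1 * x2 once the lists are converted to arrays. A short check (shorter vectors used here for readability):

```python
import numpy as np

x1 = np.array([9, 2, 5, 0, 0])
x2 = np.array([9, 2, 2, 9, 0])

# On ndarrays, * is element-wise, not matrix multiplication.
print(x1 * x2)                                        # [81  4 10  0  0]
print(np.array_equal(x1 * x2, np.multiply(x1, x2)))   # True
```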
Matrix-vector product
Non-vectorized matrix-vector product:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
W = np.random.rand(3, len(x1))

tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i, j] * x1[j]   # accumulate with +=, not overwrite with =
toc = time.process_time()
print("gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
Vectorized matrix-vector product:
import numpy as np
import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
W = np.random.rand(3, len(x1))  # random 3 x len(x1) numpy array

tic = time.process_time()
dot = np.dot(W, x1)
toc = time.process_time()
print("gdot = " + str(dot) + "\n ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
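The two versions can be checked against each other; a small sketch comparing the explicit double loop with np.dot on the same random matrix:

```python
import numpy as np

x1 = np.array([9.0, 2.0, 5.0, 0.0, 7.0])
W = np.random.rand(3, len(x1))

# Explicit loop: gdot[i] = sum_j W[i, j] * x1[j]
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i, j] * x1[j]

# The loop and np.dot agree up to floating-point rounding.
print(np.allclose(gdot, np.dot(W, x1)))  # True
```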
Computing the L1 and L2 losses
L1 loss
import numpy as np

def L1(yhat, y):
    loss = np.sum(np.abs(yhat - y))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat, y)))

L1 = 1.1
L2 loss
import numpy as np

def L2(yhat, y):
    loss = np.dot(yhat - y, yhat - y)
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat, y)))

L2 = 0.43
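Since np.dot(v, v) is just the sum of squared entries, L2 can equivalently be written with np.sum and np.square. A quick check that both forms agree:

```python
import numpy as np

def L2(yhat, y):
    return np.dot(yhat - y, yhat - y)

yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])

# Sum of squared differences, written two equivalent ways; both are ~0.43.
print(L2(yhat, y))
print(np.sum(np.square(yhat - y)))
```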
