NumPy Function Summary (continuously updated)

This article is a repost (2019-01-20 05:46).

This post collects NumPy functions that I come across from time to time.

numpy.random.seed & numpy.random.RandomState

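np.random.seed() and np.random.RandomState are both used to seed random number generation. np.random.seed() is a function you call directly, and it seeds NumPy's global generator. np.random.RandomState is a container that holds a generator of its own, so you create an instance and call its methods, e.g. np.random.RandomState(42).uniform().

A seed only fixes the starting point of the sequence. If you do not set the seed again before the next call to a random function, that call continues the sequence and returns different values; to get the same values again, reseed first. If you want different numbers on every run, set the seed to None or do not set one at all.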

>>> import numpy as np

>>> np.random.seed(42)
>>> np.random.randint(1, 10, 5)  # array([5, 1, 2, 6, 1])

>>> np.random.seed(42)
>>> np.random.randint(1, 10, 5)  # array([5, 1, 2, 6, 1])

>>> np.random.randint(1, 10, 5)  # array([8, 8, 3, 6, 5])

>>> from numpy.random import RandomState

>>> r = RandomState(42)
>>> r.randint(1, 10, 5)    # array([9, 9, 7, 3, 9])

>>> r = RandomState(42)
>>> r.randint(1, 10, 5)    # array([9, 9, 7, 3, 9])

>>> r = RandomState(None)
>>> r.randint(1, 10, 5)    # array([8, 3, 2, 6, 5])

>>> import random  # using Python's built-in random module
>>> random.seed(42)
>>> random.sample(range(10), 5)  # [1, 0, 4, 9, 6]

>>> random.sample(range(10), 5)  # [6, 9, 1, 4, 5]
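The seeding behaviour described above can be checked without relying on specific output values, which may differ between NumPy versions. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

# Seeding the global generator and seeding a private RandomState with the
# same value give the same stream of numbers.
np.random.seed(42)
from_global = np.random.randint(1, 10, 5)

rs = np.random.RandomState(42)
from_instance = rs.randint(1, 10, 5)
same_stream = np.array_equal(from_global, from_instance)

# Without reseeding, the next call continues the stream rather than restarting it.
next_draw = rs.randint(1, 10, 5)

# Creating a fresh instance with the same seed replays the stream from the start.
replay = np.random.RandomState(42)
replay_first = replay.randint(1, 10, 5)
replay_second = replay.randint(1, 10, 5)

print(same_stream)                                  # True
print(np.array_equal(from_instance, replay_first))  # True
print(np.array_equal(next_draw, replay_second))     # True
```

Comparing arrays instead of hard-coding values keeps the check valid regardless of which numbers a given seed happens to produce.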

numpy.tile

numpy.tile(A, n) repeats the whole array A n times (NumPy's documentation names the second parameter reps). Here is a simple example:

>>> a = [1,2,3,4]
>>> np.tile(a, 3)  # array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])

If n is a tuple with more than one entry, however, things get slightly more involved. Consider this example:

>>> a = np.array([1,2,3])
>>> np.tile(a, (3, 3))

array([[1, 2, 3, 1, 2, 3, 1, 2, 3],
       [1, 2, 3, 1, 2, 3, 1, 2, 3],
       [1, 2, 3, 1, 2, 3, 1, 2, 3]])

The original array a is 1-D and n has length 2, so tile first promotes a to 2-D and then repeats it along each dimension. This is equivalent to the following two steps:

>>> a = np.array([1,2,3])
>>> a = np.expand_dims(a, axis=0)
# a is now array([[1, 2, 3]])
>>> np.tile(a, (3, 3))

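That was the case where the length of n exceeds the number of dimensions of a. The other case is where the length of n is smaller than the number of dimensions of the array: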

>>> b = np.array([[1,2,3], [4,5,6]])
>>> np.tile(b, 2)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

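Here b has 2 dimensions and n has length 1, so n is likewise promoted to length 2, with the missing leading entries filled with 1, giving (1, 2). That is why the first axis of b is not repeated and only the second axis is. Finally, as usual, a slightly more complex example: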

>>> c = np.arange(27).reshape((3,3,3))
>>> c
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

>>> np.tile(c, (2,2,2))
array([[[ 0,  1,  2,  0,  1,  2],
        [ 3,  4,  5,  3,  4,  5],
        [ 6,  7,  8,  6,  7,  8],
        [ 0,  1,  2,  0,  1,  2],
        [ 3,  4,  5,  3,  4,  5],
        [ 6,  7,  8,  6,  7,  8]],

       [[ 9, 10, 11,  9, 10, 11],
        [12, 13, 14, 12, 13, 14],
        [15, 16, 17, 15, 16, 17],
        [ 9, 10, 11,  9, 10, 11],
        [12, 13, 14, 12, 13, 14],
        [15, 16, 17, 15, 16, 17]],

       [[18, 19, 20, 18, 19, 20],
        [21, 22, 23, 21, 22, 23],
        [24, 25, 26, 24, 25, 26],
        [18, 19, 20, 18, 19, 20],
        [21, 22, 23, 21, 22, 23],
        [24, 25, 26, 24, 25, 26]],

       [[ 0,  1,  2,  0,  1,  2],
        [ 3,  4,  5,  3,  4,  5],
        [ 6,  7,  8,  6,  7,  8],
        [ 0,  1,  2,  0,  1,  2],
        [ 3,  4,  5,  3,  4,  5],
        [ 6,  7,  8,  6,  7,  8]],

       [[ 9, 10, 11,  9, 10, 11],
        [12, 13, 14, 12, 13, 14],
        [15, 16, 17, 15, 16, 17],
        [ 9, 10, 11,  9, 10, 11],
        [12, 13, 14, 12, 13, 14],
        [15, 16, 17, 15, 16, 17]],

       [[18, 19, 20, 18, 19, 20],
        [21, 22, 23, 21, 22, 23],
        [24, 25, 26, 24, 25, 26],
        [18, 19, 20, 18, 19, 20],
        [21, 22, 23, 21, 22, 23],
        [24, 25, 26, 24, 25, 26]]])

The final result has a rather pleasing symmetry.
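The promotion rules for tile described above can be summarised as: pad the shorter of the array shape and n with leading 1s, then multiply the two elementwise to get the output shape. The following sketch checks this on the arrays used in the examples; the names t1, t2 and t3 are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])                 # shape (3,)
b = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
c = np.arange(27).reshape(3, 3, 3)      # shape (3, 3, 3)

# n longer than a.ndim: a is treated as shape (1, 3) before tiling.
t1 = np.tile(a, (3, 3))
t1_explicit = np.tile(a[np.newaxis, :], (3, 3))

# n shorter than b.ndim: n is padded on the left with 1s, so 2 becomes (1, 2).
t2 = np.tile(b, 2)
t2_explicit = np.tile(b, (1, 2))

# Same number of dimensions: each axis is simply multiplied.
t3 = np.tile(c, (2, 2, 2))

print(t1.shape)  # (3, 9)
print(t2.shape)  # (2, 6)
print(t3.shape)  # (6, 6, 6)
```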

A function closely related to numpy.tile() is numpy.repeat(), which repeats each element individually:

>>> np.repeat(13, 5)   # array([13, 13, 13, 13, 13])

numpy.repeat() lets you specify the axis to repeat along. If axis is not given, the array is first flattened to 1-D and then each element is repeated:

>>> a = np.array([[1,2], [3,4]])
>>> np.repeat(a, 3)  # array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])

>>> np.repeat(a, 3, axis=1)
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])
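To make the difference between the two functions concrete, here is a small side-by-side sketch on the same 2x2 array. It also shows that repeat accepts a per-element list of counts; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# tile repeats the whole block along the last axis.
tiled = np.tile(a, (1, 2))
# repeat duplicates each element in place along the chosen axis.
repeated = np.repeat(a, 2, axis=1)
# With axis=0, each row is duplicated.
rows = np.repeat(a, 2, axis=0)
# A list of counts repeats row 0 once and row 1 three times.
per_row = np.repeat(a, [1, 3], axis=0)

print(tiled.tolist())     # [[1, 2, 1, 2], [3, 4, 3, 4]]
print(repeated.tolist())  # [[1, 1, 2, 2], [3, 3, 4, 4]]
print(rows.tolist())      # [[1, 2], [1, 2], [3, 4], [3, 4]]
print(per_row.tolist())   # [[1, 2], [3, 4], [3, 4], [3, 4]]
```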

numpy.unique

The basic usage of numpy.unique() is simple: it returns a sorted array L of the distinct elements in the input:

>>> a = np.array([1,2,3,1,2,3,4,5])
>>> L = np.unique(a)
>>> L
array([1, 2, 3, 4, 5])

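numpy.unique() also provides three optional outputs that can be surprisingly effective when used well:

return_index returns, for each element of L, the index of its first occurrence in the original array
return_inverse returns, for each element of the original array, its index in L
return_counts returns the number of times each distinct element occurs in the original array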
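As a quick illustration of the three options, the sketch below requests all of them at once on a small array. The variable names first_idx, inverse and counts are chosen here for readability.

```python
import numpy as np

a = np.array([3, 1, 2, 3, 1, 5])

L, first_idx, inverse, counts = np.unique(
    a, return_index=True, return_inverse=True, return_counts=True
)

print(L.tolist())          # [1, 2, 3, 5]
print(first_idx.tolist())  # [1, 2, 0, 5]
print(inverse.tolist())    # [2, 0, 1, 2, 0, 3]
print(counts.tolist())     # [2, 1, 2, 1]

# return_index picks the distinct values out of the original array ...
assert np.array_equal(a[first_idx], L)
# ... and return_inverse rebuilds the original array from L.
assert np.array_equal(L[inverse], a)
```

The last two assertions capture the relationship between the outputs: a[first_idx] gives L, and L[inverse] gives back a.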

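# Count the occurrences of each element and return them as a dict; compare the speed with Python's built-in collections.Counter() (%timeit is an IPython magic)
>>> from collections import Counter
>>> a = np.random.randint(0, 10, 10000)
>>> %timeit dict(Counter(a))
1.54 ms ± 46.6 µs per loop
>>> %timeit dict(zip(*np.unique(a, return_counts=True)))
229 µs ± 1.76 µs per loop
>>> c = dict(zip(*np.unique(a, return_counts=True)))
>>> c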
{0: 1003, 
 1: 1013,
 2: 1023,
 3: 975,
 4: 1019,
 5: 956,
 6: 979,
 7: 996,
 8: 1001,
 9: 1035}

# Split the array into groups by distinct value, returning for each group the indices of its elements in the original array. This is useful for splitting data by class; the implementations of StratifiedShuffleSplit and StratifiedKFold in scikit-learn are based on the same idea.
>>> a = np.random.randint(0, 3, 20)
>>> a
array([2, 2, 2, 2, 2, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0])
>>> classes, y_indices, class_counts = np.unique(a, return_inverse=True, return_counts=True)
>>> classes, y_indices, class_counts
(array([0, 1, 2]),
 array([2, 2, 2, 2, 2, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0]),
 array([6, 9, 5]))

>>> class_indices = np.split(np.argsort(y_indices), np.cumsum(class_counts)[:-1])
>>> class_indices
[array([ 9, 16, 14, 10,  6, 19]),
 array([ 7,  8, 18, 11, 12, 13, 15, 17,  5]),
 array([4, 3, 2, 1, 0])]

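np.argsort() uses quicksort by default, which has the drawback of not being stable. With merge sort, which is stable, the elements within each group keep their original order: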

>>> class_indices = np.split(np.argsort(y_indices, kind="mergesort"), np.cumsum(class_counts)[:-1])
>>> class_indices
[array([ 6,  9, 10, 14, 16, 19]),
 array([ 5,  7,  8, 11, 12, 13, 15, 17, 18]),
 array([0, 1, 2, 3, 4])]
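The steps above can be wrapped into a small reusable function. The sketch below is one possible way to do it: group_indices is a hypothetical helper name, not a NumPy or scikit-learn function, and it assumes a 1-D array of labels. It uses kind="stable", which, like "mergesort", requests a stable sort.

```python
import numpy as np

def group_indices(labels):
    """Hypothetical helper: map each distinct label to the indices where it occurs."""
    labels = np.asarray(labels)
    classes, y_indices, class_counts = np.unique(
        labels, return_inverse=True, return_counts=True
    )
    # A stable sort keeps the indices inside each group in ascending order.
    order = np.argsort(y_indices, kind="stable")
    # Cut the sorted index array at the boundaries between classes.
    groups = np.split(order, np.cumsum(class_counts)[:-1])
    return dict(zip(classes.tolist(), groups))

labels = np.array([2, 2, 2, 2, 2, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0])
groups = group_indices(labels)

print(groups[0].tolist())  # [6, 9, 10, 14, 16, 19]
print(groups[1].tolist())  # [5, 7, 8, 11, 12, 13, 15, 17, 18]
print(groups[2].tolist())  # [0, 1, 2, 3, 4]
```

Returning a dict keyed by label is a design choice made for this sketch; a plain list in the order of classes, as in the session above, would work equally well.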

numpy.average

The difference between numpy.average() and numpy.mean() is that numpy.average() can compute a weighted average:

>>> data = np.arange(6).reshape((3,2))
>>> data
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.average(data, axis=1, weights=[1./4, 3./4])
array([ 0.75,  2.75,  4.75])
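The weighted average is the sum of weight times value divided by the sum of the weights, so the weights do not need to add up to 1. The sketch below checks this by hand against np.average(), using unnormalised weights [1, 3] that are proportional to the [1/4, 3/4] used above; the variable names are illustrative.

```python
import numpy as np

data = np.arange(6).reshape((3, 2))
weights = np.array([1.0, 3.0])  # unnormalised; proportional to [1/4, 3/4]

# Manual weighted average along axis 1: sum(w * x) / sum(w).
manual = (data * weights).sum(axis=1) / weights.sum()
builtin = np.average(data, axis=1, weights=weights)

# Without weights, np.average() gives the same result as np.mean().
unweighted = np.average(data, axis=1)

print(manual.tolist())      # [0.75, 2.75, 4.75]
print(builtin.tolist())     # [0.75, 2.75, 4.75]
print(unweighted.tolist())  # [0.5, 2.5, 4.5]
```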

numpy.roll

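numpy.roll(a, shift, axis=None) rolls the elements of an array along the given axis. If axis is not given, the array is flattened, all elements are rolled in order, and the original shape is restored: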

>>> x = np.arange(10).reshape(2,5)
>>> x
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> np.roll(x, shift=1)
array([[9, 0, 1, 2, 3],
       [4, 5, 6, 7, 8]])
>>> np.roll(x, shift=1, axis=0)
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x, shift=1, axis=1)
array([[4, 0, 1, 2, 3],
       [9, 5, 6, 7, 8]])
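A few further variations on the same array may help: a negative shift rolls in the opposite direction, and a tuple of shifts with a matching tuple of axes rolls several axes in one call. The sketch below also verifies the flattening behaviour when axis is omitted; the variable names are illustrative.

```python
import numpy as np

x = np.arange(10).reshape(2, 5)

# A negative shift rolls towards the start of the axis.
left = np.roll(x, shift=-1, axis=1)
# Tuples roll axis 0 by 1 and axis 1 by 2 in a single call.
both = np.roll(x, shift=(1, 2), axis=(0, 1))
# With no axis, roll acts on the flattened array and then restores the shape.
flat_equiv = np.roll(x.ravel(), 1).reshape(x.shape)

print(left.tolist())  # [[1, 2, 3, 4, 0], [6, 7, 8, 9, 5]]
print(both.tolist())  # [[8, 9, 5, 6, 7], [3, 4, 0, 1, 2]]
print(np.array_equal(flat_equiv, np.roll(x, 1)))  # True
```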
