TensorFlow-CPU Optimization and NumPy Optimization

Reposted article. 2019-11-21 12:17. Machine Learning

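1. TensorFlow-CPU optimization

When you use the CPU build of TensorFlow (for example, one installed with pip), you may see a warning that your CPU supports the AVX/AVX2 instruction sets, which the installed binary was not compiled to use. In that case, download a matching build from the following URL.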

https://github.com/fo40225/tensorflow-windows-wheel

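Usage instructions are on the GitHub page.

In my tests, after installing the AVX-enabled build, the corresponding math operations (matrix multiplication, decompositions, and so on) ran about 3 times as fast as before.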
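To check a speedup like this on your own machine, you can time the same operation before and after swapping builds. The following is only a minimal sketch, not the test the author ran: it assumes TensorFlow 2.x with eager execution, and the helper name `mean_seconds` and the matrix size are illustrative choices.

```python
import time

try:
    import tensorflow as tf  # assumption: TensorFlow 2.x (eager execution)
except ImportError:
    tf = None


def mean_seconds(fn, repeats=3):
    """Return the mean wall-clock time, in seconds, of calling fn() `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if tf is not None and tf.executing_eagerly():
    size = 2048  # illustrative size, not taken from the article
    a = tf.random.uniform((size, size))
    b = tf.random.uniform((size, size))
    # .numpy() makes sure the result is materialized before the timer stops.
    seconds = mean_seconds(lambda: tf.linalg.matmul(a, b).numpy())
    print("TensorFlow %s: %dx%d matmul in %.3f s"
          % (tf.__version__, size, size, seconds))
else:
    print("TensorFlow 2.x with eager execution not available; skipping.")
```

Run it once with the stock pip build and once with the AVX-enabled build, in otherwise identical environments, and compare the two numbers.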

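2. NumPy optimization

Most current NumPy builds are linked against OpenBLAS by default, but I found that builds linked against MKL are faster. Download page: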

https://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy

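To see which optimized libraries your NumPy build uses: np.__config__.show()

The test code and results are attached below; you can run the test on your own machine.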
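As a small illustration, the build configuration can be printed with `np.__config__.show()`, which NumPy also exposes as `np.show_config()`. The layout of the output differs between NumPy versions, so treat the snippet as a sketch and read the BLAS/LAPACK entries for the library that is actually linked.

```python
import numpy as np

print("NumPy version:", np.__version__)

# Prints the BLAS/LAPACK libraries this NumPy build was compiled against.
# Look at the entries that list real library names (e.g. mkl or openblas);
# older versions also print "NOT AVAILABLE" sections for unused backends.
np.show_config()
```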

'''
default numpy(openblas):
---------
Dotted two 4096x4096 matrices in 1.99 s.
Dotted two vectors of length 524288 in 0.40 ms.
SVD of a 2048x1024 matrix in 1.75 s.
Cholesky decomposition of a 2048x2048 matrix in 0.21 s.
Eigendecomposition of a 2048x2048 matrix in 10.31 s.
------------------------------------------------------
numpy+mkl:
----------
Dotted two 4096x4096 matrices in 1.56 s.
Dotted two vectors of length 524288 in 0.33 ms.
SVD of a 2048x1024 matrix in 1.07 s.
Cholesky decomposition of a 2048x2048 matrix in 0.24 s.
Eigendecomposition of a 2048x2048 matrix in 6.94 s.

'''
import numpy as np
from time import time

# Let's take the randomness out of random numbers (for reproducibility)
np.random.seed(0)

size = 4096
A, B = np.random.random((size, size)), np.random.random((size, size))
C, D = np.random.random((size * 128, )), np.random.random((size * 128, ))
E = np.random.random((size // 2, size // 4))
F = np.random.random((size // 2, size // 2))
# F @ F.T is symmetric and (almost surely) positive definite, as Cholesky requires
F = np.dot(F, F.T)
G = np.random.random((size // 2, size // 2))

# Matrix multiplication
N = 20
t = time()
for i in range(N):
    np.dot(A, B)
delta = time() - t
print('Dotted two %dx%d matrices in %0.2f s.' % (size, size, delta / N))
del A, B

# Vector multiplication
N = 5000
t = time()
for i in range(N):
    np.dot(C, D)
delta = time() - t
print('Dotted two vectors of length %d in %0.2f ms.' %
      (size * 128, 1e3 * delta / N))
del C, D

# Singular Value Decomposition (SVD)
N = 3
t = time()
for i in range(N):
    np.linalg.svd(E, full_matrices=False)
delta = time() - t
print("SVD of a %dx%d matrix in %0.2f s." % (size / 2, size / 4, delta / N))
del E

# Cholesky Decomposition
N = 3
t = time()
for i in range(N):
    np.linalg.cholesky(F)
delta = time() - t
print("Cholesky decomposition of a %dx%d matrix in %0.2f s." %
      (size / 2, size / 2, delta / N))

# Eigendecomposition
N = 3
t = time()
for i in range(N):
    np.linalg.eig(G)
delta = time() - t
print("Eigendecomposition of a %dx%d matrix in %0.2f s." %
      (size / 2, size / 2, delta / N))
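
The script above averages over a fixed number of runs using `time()`. If your timings are noisy, one possible variation is to do an untimed warm-up call and report the best of several runs with `time.perf_counter()`. This is a sketch under those assumptions, not part of the original benchmark; `best_time` is a hypothetical helper and the matrix size is reduced so that it finishes quickly.

```python
import time

import numpy as np


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` calls of fn()."""
    fn()  # warm-up call, not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


rng = np.random.default_rng(0)  # seeded for reproducibility
n = 512  # smaller than the sizes above so the example runs quickly
A = rng.random((n, n))
B = rng.random((n, n))
print("Best of 5: %dx%d matmul in %.4f s" % (n, n, best_time(lambda: A @ B)))
```

Taking the minimum rather than the mean reduces the influence of background activity on the machine, at the cost of hiding run-to-run variation.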
