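Multi-GPU parallel operation in TensorFlow, PyTorch, and Keras

Method 1: Specify the GPU through the deep learning framework's API

1.1 TensorFlow

When running TensorFlow across multiple GPUs, you can also pin the variables you declare to a particular device first (PS: I still don't quite understand why the other frameworks don't do this).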

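with tf.device("/gpu:%d" % i):  # i is the index of the target GPU
    ...  # ops and variables created here are placed on GPU i

with tf.device("/cpu:0"):
    ...  # ops and variables created here are placed on the CPU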

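When creating the Session, you can control which GPUs are used and how much of their memory is taken by passing a config to the session.

In the TensorFlow 1.x API used here, the ConfigProto() class exposes options for GPU memory usage: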

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9  # let this process use up to 90% of the GPU's memory
session = tf.Session(config=config)
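
To make the fraction concrete, here is a minimal sketch that converts a per_process_gpu_memory_fraction value into an approximate memory cap. The helper name memory_budget_mib and the 16384 MiB card size are assumptions for illustration only; read the real total for your card from nvidia-smi.

```python
def memory_budget_mib(total_mib, fraction):
    """Approximate memory cap (in MiB) implied by a per-process fraction."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return round(total_mib * fraction)


# Hypothetical 16384 MiB card with the 0.9 fraction used above.
print(memory_budget_mib(16384, 0.9))  # 14746
```

If several processes share one card, their fractions should add up to no more than 1.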

You can also specify whether GPU memory is allocated on demand and allowed to grow:

# os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  (0,1 are the IDs of the GPUs to use)
gpu_options = tf.GPUOptions(allow_growth=True)
config = tf.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True  # equivalent to the GPUOptions argument above
session = tf.Session(config=config)

The approach I personally consider best:

import os  # this approach is recommended
import tensorflow as tf
os.environ["CUDA_VISIBLE_DEVICES"] = "2"  # select the GPU ID from Python
from keras.backend.tensorflow_backend import set_session  # only needed when using Keras
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3  # cap this process at 30% of the GPU's memory
set_session(tf.Session(config=config))

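1.2 PyTorch

PyTorch is currently one of the more convenient deep learning frameworks, and its way of selecting a GPU is correspondingly concise. PyTorch provides the torch.cuda.set_device() method: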

import torch
gpu_id = 0  # index of the GPU to use
torch.cuda.set_device(gpu_id)

This approach can only select a single GPU, so I don't really recommend it.
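
For using several GPUs from PyTorch itself, one commonly used option is torch.nn.DataParallel, which replicates a module across the listed devices and splits each input batch among them. The sketch below is an illustration and not something from the original post: the helper names pick_device_ids and wrap_for_multi_gpu are made up here, and torch is imported lazily only so that the index-filtering helper can be tried without PyTorch installed.

```python
def pick_device_ids(requested, available_count):
    """Keep only the requested GPU indices that actually exist."""
    return [i for i in requested if 0 <= i < available_count]


def wrap_for_multi_gpu(model, requested=(0, 1)):
    """Move a model to the GPU(s) and wrap it in DataParallel when more than one is usable."""
    import torch  # lazy import (assumption: PyTorch is installed when this is called)

    ids = pick_device_ids(requested, torch.cuda.device_count())
    if not ids:
        return model  # no usable GPU: leave the model on the CPU
    # DataParallel expects the parameters to live on the first listed device.
    model = model.to(f"cuda:{ids[0]}")
    if len(ids) == 1:
        return model
    return torch.nn.DataParallel(model, device_ids=ids)
```

A call such as wrap_for_multi_gpu(torch.nn.Linear(8, 2), requested=(0, 1)) would then return a model whose forward pass is spread over GPUs 0 and 1 when both are present.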

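1.3 Keras

Because Keras acts as a front end for TensorFlow or Theano, you can in a sense select GPUs through the backend framework. The last snippet in section 1.1, for example, configures the GPU for Keras this way, but the resulting code is not pretty.

Keras itself provides keras.utils.multi_gpu_model(model=xxx,gpus=nums) to specify how many GPUs the model should run on: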

from keras.utils import multi_gpu_model
model = SegNet()  # build the model (SegNet is defined elsewhere)
parallel_model = multi_gpu_model(model, gpus=3)  # number of GPUs to use
parallel_model.compile(loss='mse', optimizer=optimizer, loss_weights=[1, 1])  # compile the model (optimizer is defined elsewhere)
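
multi_gpu_model implements data parallelism: each incoming batch is cut into sub-batches, one per GPU, and the outputs are joined again afterwards. The sketch below only illustrates that split in plain Python. The function name split_batch is hypothetical, and giving the remainder to the last GPU is my reading of how Keras slices the batch, so treat that detail as an assumption.

```python
def split_batch(batch_size, gpus):
    """Sizes of the per-GPU sub-batches; the last GPU absorbs any remainder."""
    step = batch_size // gpus
    sizes = [step] * (gpus - 1)
    sizes.append(batch_size - step * (gpus - 1))
    return sizes


print(split_batch(96, 3))   # [32, 32, 32]
print(split_batch(100, 3))  # [33, 33, 34]
```

In practice this suggests choosing a batch size that is a multiple of the GPU count so every device gets the same amount of work.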

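Method 2: Specify the GPU from the terminal

2.1 Explicitly on the command line

This method is independent of the deep learning framework; it works with Keras, TensorFlow, and PyTorch alike.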

CUDA_VISIBLE_DEVICES=0,2,3 python your_file.py

Here 0,2,3 are the GPU indices reported by the nvidia-smi command.
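
The same per-command setting can be reproduced from a Python launcher script, which is handy when starting several jobs on different cards. This is a minimal sketch: run_with_gpus is a hypothetical helper, and the child command is a stand-in for "python your_file.py" that simply prints the variable it receives.

```python
import os
import subprocess
import sys


def run_with_gpus(argv, gpu_ids):
    """Run a command with CUDA_VISIBLE_DEVICES set only for that child process."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# Stand-in for "python your_file.py": the child just prints what it sees.
child = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
result = run_with_gpus(child, [0, 2, 3])
print(result.stdout.strip())  # 0,2,3
```

Because the variable is placed in the child's environment only, the launcher's own environment is left unchanged.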

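2.2 In Python code

This method is also independent of the deep learning framework. On closer inspection it is really the same as section 2.1: the environment variable is identical.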

import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # for several GPUs, use "0,1,2,3"
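
If the GPU list is computed at run time, the two assignments can be wrapped in a small function. The name set_visible_gpus is an assumption for this sketch. Setting CUDA_DEVICE_ORDER to PCI_BUS_ID makes the CUDA indices follow the same order that nvidia-smi reports, and the function should be called before the framework first initializes CUDA.

```python
import os


def set_visible_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA libraries initialized later in this process."""
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # match the order shown by nvidia-smi
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


print(set_visible_gpus([0, 1, 2, 3]))  # 0,1,2,3
```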

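Summary

Most deep learning frameworks now wrap multi-GPU parallelism well enough to cover what developers typically need.

Note that the second method does not always work. Today, for example, I made several GPUs visible while using Keras and still hit an OOM error. Making the GPUs visible evidently does not spread the model across them by itself. Only after asking a very capable programmer did I learn that Keras needs keras.utils.multi_gpu_model(model=xxx,gpus=nums) to specify how many GPUs to use. Many thanks to her!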


Tips:

1. How do you monitor GPU usage in real time? Use watch. For what the command does and how to use it, see whatis watch and man watch.

whatis watch
watch (1)            - execute a program periodically, showing output fullscreen

man watch 
WATCH(1)                                   User Commands                                   WATCH(1)

NAME
       watch - execute a program periodically, showing output fullscreen

SYNOPSIS
       watch [options] command

DESCRIPTION
       watch  runs  command  repeatedly,  displaying  its output and errors (the first screenfull).
       This allows you to watch the program output change over time.  By default,  the  program  is
       run every 2 seconds.  By default, watch will run until interrupted.

OPTIONS
....

       -n, --interval seconds
              Specify  update  interval.  The command will not allow quicker than 0.1 second inter‐
              val, in which the smaller values are converted.

....

The man page shows that watch -n sets the refresh interval, so you can use

watch -n 3 nvidia-smi

You can also use nvidia-smi -l, which achieves the same effect.
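
The same periodic check can be done from Python, for example to log memory usage next to training output. The sketch below relies on nvidia-smi's --query-gpu and --format=csv,noheader,nounits options; the function names parse_gpu_rows and poll are hypothetical, and the parser assumes every field is a plain integer, which may not hold on devices that report a value as unavailable.

```python
import shutil
import subprocess
import time

QUERY = ["nvidia-smi", "--query-gpu=index,memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_gpu_rows(text):
    """Turn 'index, used, total' CSV lines into (index, used_mib, total_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def poll(interval=3.0, rounds=3):
    """Print per-GPU memory usage every `interval` seconds, `rounds` times."""
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; nothing to poll")
        return
    for _ in range(rounds):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        for index, used, total in parse_gpu_rows(out):
            print(f"GPU {index}: {used}/{total} MiB")
        time.sleep(interval)
```

Calling poll() with the default interval of 3 seconds mirrors the watch -n 3 nvidia-smi command, but prints only the memory figures.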
