Docker Compose: configuring containers to use GPUs


GPU prerequisites

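Before using GPUs through docker-compose, your Docker installation must already be able to start a container with the run command and the --gpus flag selecting the devices.
If you get docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. then resolve that first. This error typically indicates that NVIDIA container support (for example the NVIDIA Container Toolkit) is not installed or not yet picked up by the Docker daemon.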
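
Once the docker run check passes, the same request can be expressed as a one-service Compose file and used as a quick sanity check. This is a minimal sketch modeled on the example in Docker's GPU support documentation; the service name gpu-check and the image tag are assumptions, so substitute any CUDA base image you have available.

```yaml
services:
    gpu-check:                                        # hypothetical service name
        image: "nvidia/cuda:12.9.0-base-ubuntu22.04"  # assumed tag; any CUDA base image works
        command: "nvidia-smi"                         # lists the GPUs visible inside the container
        deploy:
            resources:
                reservations:
                    devices:
                      - driver: "nvidia"
                        count: 1                      # request a single GPU
                        capabilities: ["gpu"]
```

If the container starts and nvidia-smi lists a GPU, the Compose side is wired up correctly.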

Writing the docker-compose.yaml file

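In a docker-compose.yaml file, pay attention to three top-level keys: version, services, and networks. version specifies which version of the Compose file format the file is written against; services configures the services; networks configures the networks.
Here is a test file: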
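
To make the three top-level keys concrete, here is a stripped-down skeleton. The service name app is hypothetical and the image value is a placeholder. Recent Docker Compose releases treat version as informational only, so it mainly matters for older docker-compose versions.

```yaml
version: "3.8"                # Compose file format version
services:                     # one entry per container to run
    app:                      # hypothetical service name
        image: "xxxx:xxxxx"   # placeholder image reference
        networks:
          - "ana"             # attach the service to the network defined below
networks:                     # user-defined networks shared by the services
    ana:
        driver: bridge
```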

version: "3.8"
services:
    pdf:
        image: "xxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51002-51003"
        ports:
          - "51001:22"
          - "51002-51003:51002-51003"
        shm_size: "4g"
        networks:
          - "ana"
        container_name: "literature_pdf"
        tty: true
    fig:
        image: "xxxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51009-51020"
        ports:
          - "51008:22"
          - "51009-51020:51009-51020"
        shm_size: "8g"
        volumes:
          - "/data/elfin/utils/detectron2-master:/home/appuser/detectron2-master"
        environment:
          - "NVIDIA_VISIBLE_DEVICES=all"
        deploy:
            resources:
                reservations:
                    devices:
                      - driver: "nvidia"
                        count: "all"
                        capabilities: ["gpu"]
        networks:
          - "ana"
        container_name: "fig"
        tty: true
    ocr:
        image: "xxxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51005-51007"
        ports:
          - "51004:22"
          - "51005-51007:51005-51007"
        shm_size: "6g"
        deploy:
            resources:
                reservations:
                    devices:
                      - device_ids: ["1"]
                        capabilities: ["gpu"]
                        driver: "nvidia"
        networks:
          - "ana"
        container_name: "ocr"
        tty: true
        entrypoint: ["supervisord", "-n", "-c", "/etc/supervisor/supervisord.conf"]
networks:
    ana:
        driver: bridge

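Note: the file above is only a test. Many parts of it could be improved, and it is not a particularly good template. In it, image specifies the image each service runs.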

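Note that the file above covers volume mounts, GPU use, a custom network, and port mapping. I find the GPU configuration the hardest part: it is easy to make small mistakes that leave the application unable to start once the container is up. Here is the GPU dependency configuration for a container: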

deploy:
    resources:
        reservations:
            devices:
                - driver: "nvidia"
                  count: "all"
                  capabilities: ["gpu"]

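capabilities must be specified here. In addition, driver, count, and capabilities belong to one device entry, so only the first key of the entry takes the leading "-". Putting a "-" in front of each key splits them into separate entries and causes an error. For the other GPU options, see the official documentation: https://docs.docker.com/compose/gpu-support/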
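
The grouping rule is easier to see side by side. The sketch below shows two valid ways to select GPUs, followed by the broken form in comments. The service names and device IDs are hypothetical. According to the Docker documentation, count and device_ids are mutually exclusive, so each entry uses only one of them.

```yaml
services:
    by-count:                         # hypothetical service: any two GPUs
        image: "xxxx:xxxxx"
        deploy:
            resources:
                reservations:
                    devices:
                      - driver: "nvidia"
                        count: 2
                        capabilities: ["gpu"]
    by-id:                            # hypothetical service: specific GPUs by ID
        image: "xxxx:xxxxx"
        deploy:
            resources:
                reservations:
                    devices:
                      - driver: "nvidia"
                        device_ids: ["0", "1"]
                        capabilities: ["gpu"]

# Incorrect: every key becomes its own list item, so no single
# device entry carries the required capabilities field.
#                   devices:
#                     - driver: "nvidia"
#                     - count: "all"
#                     - capabilities: ["gpu"]
```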

