gpu使用准備

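Before using a GPU through docker-compose, your Docker installation must already be able to start a container with docker run and select devices with the --gpus flag.
If you see docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. you need to resolve that first. This error usually means the NVIDIA Container Toolkit is not installed, or Docker was not restarted after installing it.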

Writing the docker-compose.yaml file

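A docker-compose.yaml file has three top-level keys worth noting: version, services, and networks. version specifies which version of the Compose file format the file is written against; services configures the services; networks configures the networks.
Here is a test file: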

version: "3.8"
services:
    pdf:
        image: "xxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51002-51003"
        ports:
          - "51001:22"
          - "51002-51003:51002-51003"
        shm_size: "4g"
        networks:
          - "ana"
        container_name: "literature_pdf"
        tty: true
    fig:
        image: "xxxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51009-51020"
        ports:
          - "51008:22"
          - "51009-51020:51009-51020"
        shm_size: "8g"
        volumes:
          - "/data/elfin/utils/detectron2-master:/home/appuser/detectron2-master"
        environment:
          - "NVIDIA_VISIBLE_DEVICES=all"
        deploy:
            resources:
                reservations:
                    devices:
                      - driver: "nvidia"
                        count: "all"
                        capabilities: ["gpu"]
        networks:
          - "ana"
        container_name: "fig"
        tty: true
    ocr:
        image: "xxxxx:xxxxx"
        user: "root"
        restart: "on-failure"
        expose:
          - "22"
          - "51005-51007"
        ports:
          - "51004:22"
          - "51005-51007:51005-51007"
        shm_size: "6g"
        deploy:
            resources:
                reservations:
                    devices:
                      - device_ids: ["1"]
                        capabilities: ["gpu"]
                        driver: "nvidia"
        networks:
          - "ana"
        container_name: "ocr"
        tty: true
        entrypoint: ["supervisord", "-n", "-c", "/etc/supervisor/supervisord.conf"]
networks:
    ana:
        driver: bridge

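Note: the file above is only a test. Many parts of it could be improved, and it is not a particularly good template. In it, image specifies the image each service runs.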
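
Before bringing up a multi-service file like the one above, it may help to confirm that Compose can reach the GPU at all. The following is a minimal sketch, not part of the original setup: the service name gpu-check is hypothetical, and the stock ubuntu image is an assumption that relies on the NVIDIA container runtime making nvidia-smi available inside the container.

```yaml
# Minimal GPU smoke test (illustrative sketch, not the author's file).
version: "3.8"
services:
    # "gpu-check" is a hypothetical service name chosen for this example.
    gpu-check:
        # Assumption: a plain ubuntu image is enough, because the NVIDIA
        # runtime provides nvidia-smi when a GPU is reserved.
        image: "ubuntu"
        command: ["nvidia-smi"]
        deploy:
            resources:
                reservations:
                    devices:
                      # One list item holding driver, count, and capabilities.
                      - driver: "nvidia"
                        count: 1
                        capabilities: ["gpu"]
```

Saved as its own docker-compose.yaml in an empty directory, this can be started with docker-compose up. If the GPU is reachable, the nvidia-smi table should appear in the service log before the container exits.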

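The test file implements volume mounts, GPU access, a custom network, and port mappings. I find the GPU configuration the hardest part: small mistakes are easy to make, and they leave the application unable to start after the container comes up. Here is the GPU dependency configuration for a container: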

deploy:
    resources:
        reservations:
            devices:
                - driver: "nvidia"
                  count: "all"
                  capabilities: ["gpu"]

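capabilities is required here. Also, count, driver, and capabilities belong to a single list item: putting a "-" in front of each one raises an error. For other GPU settings, see the official documentation: https://docs.docker.com/compose/gpu-support/ .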
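
To make the grouping rule concrete, the fragment below contrasts the incorrect layout (left commented out) with the correct one. It is a sketch of a single service's deploy key, and it uses device_ids instead of count to pin specific GPUs, as the ocr service in the test file does. The indices "0" and "1" are hypothetical. Per the official documentation, count and device_ids are mutually exclusive, so only one of them should appear in a given item.

```yaml
# Fragment: belongs under one service in docker-compose.yaml.
deploy:
    resources:
        reservations:
            devices:
                # Incorrect: three separate list items, so no single item
                # carries the required capabilities together with its driver.
                # - driver: "nvidia"
                # - count: "all"
                # - capabilities: ["gpu"]

                # Correct: one list item carrying all of its keys.
                - driver: "nvidia"
                  device_ids: ["0", "1"]   # hypothetical GPU indices
                  capabilities: ["gpu"]
```

Using count: "all" in place of device_ids gives the container every GPU on the host, which is what the fig service in the test file does.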

Addendum: the following blog post is also a good reference:

https://cloud.tencent.com/developer/article/1819859
