

Summary of Docker usage problems

Working around gcr.io being unreachable from mainland China


Inside mainland China you can pull these images through https://dashboard.daocloud.io.

For example, for gcr.io/google_containers/pause you can run

dao pull google/pause

and then

docker tag google/pause gcr.io/google_containers/pause
docker tag google/pause gcr.io/google_containers/pause:0.8.0

Hitting 'device or resource busy' errors after a Docker restart

This is a Docker bug.

The fix is to first find the paths that were never unmounted:

cat /proc/mounts | grep "mapper/docker" | awk '{print $2}'

then unmount each of them in turn.
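A minimal sketch of that cleanup (run as root; it walks the same /proc/mounts listing and unmounts every path it finds):

#!/bin/bash
# unmount every leftover devicemapper mount point left behind by the old daemon
for mnt in $(grep "mapper/docker" /proc/mounts | awk '{print $2}'); do
    umount "$mnt"
done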

1. Docker reports "Error response from daemon: Error running DeviceCreate (createSnapDevice) dm_task_run failed"
Solution:
# systemctl stop docker.service
# thin_check /var/lib/docker/devicemapper/devicemapper/metadata

If there were no errors then proceed with:

# thin_check --clear-needs-check-flag /var/lib/docker/devicemapper/devicemapper/metadata
# systemctl start docker.service

If there were errors, you are on your own, but 'man thin_check' and 'man thin_repair' may be helpful...
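If thin_check does report errors, a repair attempt usually goes through thin_repair. The following is only a hedged sketch (the backup path is a placeholder, and the procedure is mine, not from any official doc); read 'man thin_repair' before touching real metadata:

# keep a copy of the broken metadata, write a repaired copy, then swap it in
cp /var/lib/docker/devicemapper/devicemapper/metadata /root/metadata.bak
thin_repair -i /var/lib/docker/devicemapper/devicemapper/metadata \
            -o /var/lib/docker/devicemapper/devicemapper/metadata.repaired
mv /var/lib/docker/devicemapper/devicemapper/metadata.repaired \
   /var/lib/docker/devicemapper/devicemapper/metadata
systemctl start docker.service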

========================================================

2. The iptables rules Docker adds by default (customize the IP-related parts for your own setup):

The Docker nat-table portion:

docker0IP=`ifconfig docker0 | grep 'inet' | cut -d ' ' -f 10`
iptables -A POSTROUTING -t nat -s $docker0IP/30 ! -o docker0 -j MASQUERADE

DockerChain="DOCKER"
iptables -t nat -nL $DockerChain
if [ "x$?" != "x0" ]; then
    iptables -t nat -N $DockerChain
fi
iptables -A PREROUTING -m addrtype --dst-type LOCAL -t nat -j $DockerChain
iptables -A OUTPUT -m addrtype --dst-type LOCAL -t nat -j $DockerChain ! --dst 127.0.0.0/8
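To check what actually ended up in the nat table, the standard listing commands are enough (nothing here is Docker-specific):

# dump the DOCKER chain and the POSTROUTING rules in the nat table
iptables -t nat -S DOCKER
iptables -t nat -L POSTROUTING -n -v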

Reference code:
https://github.com/docker/docker/blob/2ad81da856c123acf91eeff7ab607376bd27d9ba/vendor/src/github.com/docker/libnetwork/drivers/bridge/setup_ip_tables.go
https://github.com/docker/docker/blob/2ad81da856c123acf91eeff7ab607376bd27d9ba/vendor/src/github.com/docker/libnetwork/iptables/iptables.go

=========================================================

3. Docker fails to start, reporting an error like "chown socket at step GROUP: No such process":

# journalctl -xn
-- Logs begin at Tue 2014-12-30 13:07:53 EST, end at Tue 2014-12-30 13:25:23 EST. --
Dec 30 13:12:30 ITX kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Dec 30 13:22:53 ITX systemd[1]: Starting Cleanup of Temporary Directories...
-- Subject: Unit systemd-tmpfiles-clean.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit systemd-tmpfiles-clean.service has begun starting up.
Dec 30 13:22:53 ITX systemd[1]: Started Cleanup of Temporary Directories.
-- Subject: Unit systemd-tmpfiles-clean.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit systemd-tmpfiles-clean.service has finished starting up.
--
-- The start-up result is done.
Dec 30 13:25:23 ITX systemd[1]: Starting Docker Socket for the API.
-- Subject: Unit docker.socket has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.socket has begun starting up.
Dec 30 13:25:23 ITX systemd[1868]: Failed to chown socket at step GROUP: No such process
Dec 30 13:25:23 ITX systemd[1]: docker.socket control process exited, code=exited status=216
Dec 30 13:25:23 ITX systemd[1]: Failed to listen on Docker Socket for the API.
-- Subject: Unit docker.socket has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.socket has failed.
--
-- The result is failed.
Dec 30 13:25:23 ITX systemd[1]: Dependency failed for Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has failed.
--
-- The result is dependency.
Dec 30 13:25:23 ITX systemd[1]: Unit docker.socket entered failed state.

Solutions:

Method 1: add a docker group (groupadd docker; if /etc/group is managed by a central configuration system, remember to add the docker group to the source group file as well). See the sketch after the unit file below.

Method 2: edit /usr/lib/systemd/system/docker.socket:

[Unit]
Description=Docker Socket for the API
PartOf=docker.service

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker    <-- change this to SocketGroup=root or another group that exists

[Install]
WantedBy=sockets.target
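Whichever method you pick, reload systemd and restart the units afterwards. A minimal sketch for method 1 (the restart order is my own, not taken from the official docs):

# method 1: create the group the socket unit expects
groupadd docker
# pick up any edits to docker.socket (only needed for method 2), then restart
systemctl daemon-reload
systemctl restart docker.socket docker.service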

 

The following steps are optional:

systemctl enable docker.service && systemctl enable docker.socket:

# systemctl list-unit-files | grep docker
docker.service disabled
docker.socket disabled

# chkconfig docker on   # if chkconfig is not available, run: systemctl enable docker.service
Note: Forwarding request to 'systemctl enable docker.service'.
ln -s '/usr/lib/systemd/system/docker.service' '/etc/systemd/system/multi-user.target.wants/docker.service'

# systemctl list-unit-files|grep docker
docker.service enabled
docker.socket disabled

# systemctl enable docker.socket
ln -s '/usr/lib/systemd/system/docker.socket' '/etc/systemd/system/sockets.target.wants/docker.socket'

# systemctl list-unit-files|grep docker
docker.service enabled
docker.socket enabled

Reference links:

http://www.milliondollarserver.com/?cat=7

http://www.milliondollarserver.com/?p=622

===============================================================

4. When the host is running only one container, deleting that container sometimes causes a brief network outage on the host

Solution:

1. Edit /etc/sysconfig/ntpd to add the "-L" option, for example:

cat /etc/sysconfig/ntpd

# Command line options for ntpd

OPTIONS="-g -L"

2. Restart ntpd: systemctl restart ntpd

Reference link:

https://access.redhat.com/solutions/261123

========================================================

5. A private registry was set up with Docker 1.6+ following the official docs, but docker login fails:

Username: ever
Password:
Email:
Error response from daemon: Unexpected status code [404] : <html>
<head><title>404 Not Found</title></head>
<body bgcolor="white">
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.6.3</center>
</body>
</html>

Solution: roughly, Docker 1.6+ requires registry 2.0, and the setup also needs an extra nginx directive that the official docs get wrong: it should be more_set_headers (from nginx's headers-more module), but the docs use add_header.

For migrating official v1 images to v2 images, take a look at https://github.com/docker/migrator. Recommended reading: Zhejiang University's book 《docker 容器和容器雲》 (Docker: Containers and Container Cloud).

======================================================== 

6. Server-side access log when Docker 1.8 pulls an image:

127.0.0.1 - - [16/Oct/2015:10:08:52 +0000] "GET /v2/ HTTP/1.1" 401 194 "-" "docker/1.8.3 go/go1.4.2 git-commit/f4bf5c7 kernel/4.2.0-1.el7.elrepo.x86_64 os/linux arch/amd64" "-"
127.0.0.1 - - [16/Oct/2015:10:08:52 +0000] "GET /v1/_ping HTTP/1.1" 404 168 "-" "docker/1.8.3 go/go1.4.2 git-commit/f4bf5c7 kernel/4.2.0-1.el7.elrepo.x86_64 os/linux arch/amd64" "-"
127.0.0.1 - - [16/Oct/2015:10:08:52 +0000] "POST /v1/users/ HTTP/1.1" 404 168 "-" "docker/1.8.3 go/go1.4.2 git-commit/f4bf5c7 kernel/4.2.0-1.el7.elrepo.x86_64 os/linux arch/amd64" "-"

Docker should be calling the v2 API but is falling back to the v1 endpoints.

Solution: Docker and the registry negotiate the API version through an HTTP header; see the check below.
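A quick client-side check of that negotiation (a hedged sketch; registry.example.com is a placeholder for your registry's address). Registry 2.x answers the v2 ping with the Docker-Distribution-Api-Version header, which is what keeps the daemon on the v2 API:

# hit the v2 ping endpoint and inspect the response headers
curl -i https://registry.example.com/v2/
# expect a header line like:
#   Docker-Distribution-Api-Version: registry/2.0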

 ========================================================

7. After restarting a container or the host's iptables service, the container no longer receives UDP packets (failed to receive UDP traffic):

Cause: while the container or the host's iptables service is restarting, there is a window in which the NAT rules for the Docker service are gone, so the physical host replies with a port-unreachable error (e.g. "8888 port unreachable"). That reply updates the ip_conntrack cache to an entry like this:

ipv4     2 udp      17 29 src=xx.xx.xx.xx dst=xx.xx.xx.xx sport=xxxx dport=xxxx [UNREPLIED] src=xx.xx.xx.xx dst=xx.xx.xx.xx sport=xxxx dport=xxxx mark=0 zone=0 use=2 

As a result, after iptables comes back up, incoming packets still match the stale conntrack entry and are never forwarded into the container.

Solution: clear the conntrack cache (with conntrack-tools: conntrack -F), as in the sketch below.
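A hedged sketch of that cleanup (conntrack-tools must be installed; port 8888 is just the placeholder port from the example above):

# flush the whole conntrack table (simplest, but drops all tracked connections)
conntrack -F
# or delete only the stale UDP entries for the affected destination port
conntrack -D -p udp --orig-port-dst 8888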

Related link: https://github.com/docker/docker/issues/8795 (clearing conntrack)

========================================================

8. After a new partition (/ssd) is added to a Docker host, Docker must be restarted before a data volume on that partition (-v /ssd:/ssd) works when starting containers

Solution (use with caution): edit /usr/lib/systemd/system/docker.service

[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
ExecStart=/usr/bin/docker -d $OPTIONS $DOCKER_STORAGE_OPTIONS
ExecStartPost=/usr/bin/chmod 777 /var/run/docker.sock
LimitNOFILE=1048576
LimitNPROC=1048576
MountFlags=private   <-- change this to MountFlags=shared

[Install]
WantedBy=multi-user.target 
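After changing MountFlags, systemd needs to re-read the unit and the daemon has to be restarted. A minimal sketch (the sed is just a shortcut for the manual edit shown above):

# switch the docker service's mount propagation to shared, then restart
sed -i 's/^MountFlags=private/MountFlags=shared/' /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl restart docker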

Related link: https://huaminchen.wordpress.com/2015/05/19/how-docker-handles-mount-namespace/

========================================================

9. File-mount problems with MFS (MooseFS) + Docker

MFS is mounted on the host like this:

mfsmount /mnt -H ip -P port -S /

which gives the host an MFS directory at /mnt. But after running

docker run -it -v /mnt:/mnt image:tags /bin/bash

the directory inside the container is still the local one, not the MFS mount, and the reported size is wrong. The system log shows a warning:

Jul 16 11:52:36 TENCENT64 docker: [error] mount.go:12 [warning]: couldn't run auplink before unmount: exec: "auplink": executable file not found in $PATH

The auplink command cannot be found on the host, which breaks Docker's mount handling. On CentOS install it with:

yum install aufs-util

then restart Docker:

systemctl restart docker

and restart the container.

So far, mounting MFS into Docker has hit two puzzling problems:

1. The MFS mount directory was changed but Docker was not restarted. No matter how the container was started, and whatever the logs showed, the MFS mount never appeared inside the container.

2. After entering the container, a large number of files were deleted. The operation itself finished, but MFS has a trash mechanism: the files were moved to the trash, so the real data cleanup had not actually happened yet (you can see this state on the mfs.cgi page). As a result, mkdir inside the container failed with "device is busy".

Both of these were only resolved after restarting Docker. My guess is that something in Docker's underlying file handling, cgroups or aufs, is at fault. Leaving this one open for now.

Problems summarized by other users

========================================================

10. Docker v1 private registry: when an image is pushed for the first time the index is written to the DB, but the image upload itself fails (search can find the image, yet the delete API cannot remove it), and subsequent pushes make the registry error out:

Cause: the index was already written to the DB even though the image upload failed; the next push tries to write the index again, which triggers a "name is not unique" error.

Solution: the index lives in a sqlite database, so just delete the offending image's index entry there (sqlite3 docker-registry.db; .tables; select * from repository;), as sketched below.
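A hedged sqlite session for that cleanup (broken/repo is a placeholder image name, and the name column is only assumed from the uniqueness error above; the repository table comes from the query in the solution):

# open the registry's index database and delete the stale row
sqlite3 docker-registry.db <<'EOF'
.tables
SELECT * FROM repository;
-- 'broken/repo' is a placeholder for the image whose index is stuck
DELETE FROM repository WHERE name = 'broken/repo';
EOF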

========================================================

11. Host crashes caused by device-mapper discard.

Cause: this problem kept recurring on certain servers. After a crash and reboot, logging in through the IPMI console showed that the system had already booted the crash-dump kernel; before the core dump was written it went through Recover and Data Copying phases, which made recovery very slow. Analysis of the core dump pointed to a kernel bug in the DM discard path, so the workaround is to disable discard for Docker's devicemapper storage driver.

Solution: start Docker with the options "--storage-opt dm.mountopt=nodiscard --storage-opt dm.blkdiscard=false".
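One hedged place to put those flags on a systemd host that sources /etc/sysconfig/docker (the path is taken from the unit file in item 8; merge with any OPTIONS line you already have):

# /etc/sysconfig/docker
OPTIONS="--storage-opt dm.mountopt=nodiscard --storage-opt dm.blkdiscard=false"

Then restart the daemon with systemctl restart docker.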

========================================================

12. Docker fails to start with "[error] attach_loopback.go:42 There are no more loopback devices available"; full error log:

systemd[1]: Starting Docker Application Container Engine...
docker[47518]: 2016/02/03 14:50:32 docker daemon: 1.3.2 39fa2fa/1.3.2; execdriver: native; graphdriver:
docker[47518]: [b98612a1] +job serveapi(fd://, tcp://0.0.0.0:2375, unix:///var/run/docker.sock)
docker[47518]: [error] attach_loopback.go:42 There are no more loopback devices available.
docker[47518]: 2016/02/03 14:50:32 loopback mounting failed
systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start Docker Application Container Engine.
systemd[1]: Unit docker.service entered failed state.
systemd[1]: docker.service failed.

Cause: the host system does not have the loopback device nodes in /dev for Docker to use.

Solution: run something like the following on the host, then start the container and it will pick up the devices.

#!/bin/bash

# create the missing loop device nodes (block devices, major number 7)
for i in {0..6}
do
    mknod -m0660 /dev/loop$i b 7 $i
done
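A quick check afterwards (standard util-linux commands):

# confirm the nodes exist and that a free loop device can be handed out
ls -l /dev/loop*
losetup -f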

Official Docker issue: git issue

=========================Other links================================

A Linux kernel bug causing TCP/IP packet loss in Mesos, Kubernetes and Docker

How to deal with a Docker container whose root filesystem is read-only

