Things to watch for when installing OCP 3.11 offline


 

Check phase

Run the pre-deployment checks:

# ansible-playbook -vv playbooks/prerequisites.yml

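Check the PLAY RECAP at the end of the output to confirm that every host passed (unreachable=0 and failed=0). If any host did not pass, find the cause, fix it, and rerun the playbook until the recap is clean.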
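Reading the recap by eye is easy to get wrong in a long -vv log. The following is a minimal sketch, assuming the playbook output has been saved to a file; the function name recap_failures and the log file name are hypothetical, not part of openshift-ansible.

```shell
# Print every host in an Ansible PLAY RECAP whose unreachable= or failed=
# counter is non-zero. Reads the playbook log on stdin.
# Exit status: 0 if every host passed, 1 otherwise.
recap_failures() {
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap && /unreachable=/ {
      bad = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^(unreachable|failed)=/) {
          split($i, kv, "=")
          if (kv[2] + 0 > 0) bad = 1
        }
      }
      if (bad) { print $1; found = 1 }
    }
    END { exit found ? 1 : 0 }
  '
}
```

Possible usage: pipe the playbook through tee into prerequisites.log (a hypothetical file name), then run recap_failures < prerequisites.log and rerun the playbook only after the listed hosts have been fixed.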

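During the check phase, node1 and node2 often failed to reach the repo/base repository (the RHEL 7.6 packages) served by master, which is configured as the yum source. As a temporary workaround, each node's repo configuration was pointed at a locally mounted copy of the packages, which bypasses the error.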
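The workaround amounts to a file-based yum repository on each node. A minimal sketch follows; the repo id local-base, the mount point /mnt/rhel76, and gpgcheck=0 are assumptions for a lab setup and do not come from the original notes.

```shell
# Emit a yum .repo stanza that points at a locally mounted package tree.
# $1: repo id (hypothetical, e.g. local-base)
# $2: absolute path of the local package directory (hypothetical, e.g. /mnt/rhel76)
local_repo_stanza() {
  cat <<EOF
[$1]
name=Local $1
baseurl=file://$2
enabled=1
gpgcheck=0
EOF
}
```

Possible usage on node1 and node2: local_repo_stanza local-base /mnt/rhel76 > /etc/yum.repos.d/local-base.repo, followed by yum clean all and yum repolist to confirm the repository is visible.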

 

 

Deployment phase

# ansible-playbook -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml

Check the summary printed when the installation finishes. If any step failed, fix the problem and rerun the playbook.

 

Verifying the installation

[root@master yum.repos.d]# oc login -u system:admin
Logged into "https://master.example.com:8443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project <projectname>':

  * default
    kube-public
    kube-system
    management-infra
    openshift
    openshift-console
    openshift-infra
    openshift-logging
    openshift-metrics-server
    openshift-monitoring
    openshift-node
    openshift-sdn
    openshift-web-console

Using project "default".
[root@master yum.repos.d]# oc get nodes
NAME                 STATUS    ROLES     AGE       VERSION
master.example.com   Ready     master    23m       v1.11.0+d4cacc0
node1.example.com    Ready     infra     18m       v1.11.0+d4cacc0
node2.example.com    Ready     compute   18m       v1.11.0+d4cacc0

 

[root@master yum.repos.d]# oc get pods --all-namespaces
NAMESPACE                  NAME                                           READY     STATUS         RESTARTS   AGE
default                    docker-registry-1-9q962                        1/1       Running        0          17m
default                    registry-console-1-4mb7d                       1/1       Running        0          17m
default                    router-1-74pr6                                 1/1       Running        0          17m
kube-system                master-api-master.example.com                  1/1       Running        0          22m
kube-system                master-controllers-master.example.com          1/1       Running        1          22m
kube-system                master-etcd-master.example.com                 1/1       Running        0          22m
openshift-console          console-5896bbb547-df6p2                       1/1       Running        0          15m
openshift-infra            hawkular-cassandra-1-k5bg2                     1/1       Running        0          12m
openshift-infra            hawkular-metrics-6ldrw                         0/1       Pending        0          6m
openshift-infra            hawkular-metrics-858mh                         0/1       Preempting     0          12m
openshift-infra            hawkular-metrics-schema-sd7c5                  0/1       Completed      0          13m
openshift-infra            heapster-tvn6t                                 1/1       Running        0          12m
openshift-logging          logging-es-data-master-4g5tbuou-1-bcnsx        0/2       Pending        0          5m
openshift-logging          logging-es-data-master-4g5tbuou-1-deploy       1/1       Running        0          5m
openshift-logging          logging-fluentd-m5rbg                          1/1       Running        0          6m
openshift-logging          logging-fluentd-m64sn                          1/1       Running        0          6m
openshift-logging          logging-fluentd-nqpz4                          1/1       Running        0          6m
openshift-logging          logging-kibana-1-wpf2t                         2/2       Running        0          7m
openshift-metrics-server   metrics-server-845b478887-vcbkd                0/1       ErrImagePull   0          11m
openshift-monitoring       alertmanager-main-0                            3/3       Running        0          14m
openshift-monitoring       alertmanager-main-1                            3/3       Running        0          14m
openshift-monitoring       alertmanager-main-2                            3/3       Running        0          14m
openshift-monitoring       cluster-monitoring-operator-674969789d-65rxn   1/1       Running        0          16m
openshift-monitoring       grafana-7594d8dd75-cwr6p                       2/2       Running        0          15m
openshift-monitoring       kube-state-metrics-787f69cf4d-xjh76            3/3       Running        0          14m
openshift-monitoring       node-exporter-bwvpv                            2/2       Running        0          14m
openshift-monitoring       node-exporter-hzbb8                            2/2       Running        0          14m
openshift-monitoring       node-exporter-rdzlp                            2/2       Running        0          14m
openshift-monitoring       prometheus-k8s-0                               4/4       Running        1          15m
openshift-monitoring       prometheus-k8s-1                               4/4       Running        1          15m
openshift-monitoring       prometheus-operator-8544897d54-z7249           1/1       Running        0          16m
openshift-node             sync-6xthq                                     1/1       Running        0          20m
openshift-node             sync-rsgz9                                     1/1       Running        0          19m
openshift-node             sync-vsbws                                     1/1       Running        0          19m
openshift-sdn              ovs-5d2dl                                      1/1       Running        0          20m
openshift-sdn              ovs-gd4gw                                      1/1       Running        0          19m
openshift-sdn              ovs-ktpt6                                      1/1       Running        0          19m
openshift-sdn              sdn-dz8kv                                      1/1       Running        0          19m
openshift-sdn              sdn-mhbkg                                      1/1       Running        0          19m
openshift-sdn              sdn-x7tq9                                      1/1       Running        0          20m
openshift-web-console      webconsole-5db89b6cd4-5sm9d                    1/1       Running        2          16m

Metrics are not available yet: hawkular-metrics is still Pending and metrics-server is stuck in ErrImagePull.
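With this many pods, the problem ones are easy to miss. A minimal sketch that filters the listing; the function name unhealthy_pods is hypothetical, and it assumes the six-column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Read `oc get pods --all-namespaces` output on stdin and print
# "namespace/pod status" for every pod whose STATUS is neither
# Running nor Completed. The header line is skipped.
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Possible usage: oc get pods --all-namespaces | unhealthy_pods. Note that it only looks at the STATUS column, so a pod that is Running but not fully ready (for example 0/1) would not be reported.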

On the master node, create the admin user:

# htpasswd /etc/origin/master/htpasswd admin

Then grant the admin user the cluster-admin role:

# oc adm policy add-cluster-role-to-user cluster-admin admin

 

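Add the following entries to the hosts file of the machine used to open the web console. The application routes point at 192.168.0.104, which is node1, the infra node: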

192.168.0.103 master.example.com
192.168.0.104 console.apps.example.com
192.168.0.104 prometheus-k8s-openshift-monitoring.apps.example.com
192.168.0.104 grafana-openshift-monitoring.apps.example.com
192.168.0.104 hawkular-metrics.apps.example.com
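Because the installation tends to be repeated several times, it helps to add these entries idempotently. A minimal sketch; the function name add_host_entry is hypothetical.

```shell
# Append "ip name" to a hosts file unless a non-comment line already
# lists that host name.
# $1: hosts file, $2: IP address, $3: host name.
add_host_entry() {
  if ! awk -v n="$3" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
      ' "$1"; then
    printf '%s %s\n' "$2" "$3" >> "$1"
  fi
}
```

Possible usage: add_host_entry /etc/hosts 192.168.0.104 console.apps.example.com, repeated for each of the names listed above. Running it a second time leaves the file unchanged.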

 

Open https://master.example.com:8443 and switch to the cluster console, where the cluster's monitoring information is available.

 

 

Fixing the errors

  • Metrics

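Investigation showed three main reasons why metrics would not start:

1. The ose-metrics-server image was missing. Re-importing it fixed this.

2. In openshift-monitoring, the node-exporter pod on node2 (node-exporter-sbddr) kept failing to start. The cause turned out to be a port conflict with a GitLab installation on that node; after GitLab was stopped, the pod started successfully.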

[root@master ~]# oc get pods  -n openshift-monitoring -o wide
NAME                                           READY     STATUS    RESTARTS   AGE       IP              NODE                 NOMINATED NODE
alertmanager-main-0                            3/3       Running   23         21h       10.129.0.69     node1.example.com    <none>
alertmanager-main-1                            3/3       Running   20         21h       10.129.0.66     node1.example.com    <none>
alertmanager-main-2                            3/3       Running   20         21h       10.129.0.68     node1.example.com    <none>
cluster-monitoring-operator-674969789d-65rxn   1/1       Running   10         21h       10.129.0.65     node1.example.com    <none>
grafana-7594d8dd75-cwr6p                       2/2       Running   18         21h       10.129.0.64     node1.example.com    <none>
kube-state-metrics-787f69cf4d-xjh76            3/3       Running   20         21h       10.129.0.71     node1.example.com    <none>
node-exporter-bwvpv                            2/2       Running   8          21h       192.168.0.104   node1.example.com    <none>
node-exporter-hzbb8                            2/2       Running   14         21h       192.168.0.103   master.example.com   <none>
node-exporter-sbddr                            2/2       Running   0          13m       192.168.0.105   node2.example.com    <none>
prometheus-k8s-0                               4/4       Running   22         21h       10.129.0.70     node1.example.com    <none>
prometheus-k8s-1                               4/4       Running   22         21h       10.129.0.67     node1.example.com    <none>
prometheus-operator-8544897d54-z7249           1/1       Running   8          21h       10.129.0.63     node1.example.com    <none>
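To track down this kind of port conflict on the node itself, the listening sockets can be filtered by port. A minimal sketch; the function name listeners_on_port is hypothetical, and port 9100 in the usage note is an assumption (it is the usual node_exporter default, but the original notes do not say which port collided).

```shell
# Read `ss -ltnp` output on stdin and print the lines whose local
# address ends in :PORT. The header line is skipped.
# $1: port number.
listeners_on_port() {
  awk -v p="$1" 'NR > 1 { n = split($4, a, ":"); if (a[n] == p) print }'
}
```

Possible usage on node2, as root so that process names are shown: ss -ltnp | listeners_on_port 9100. The users:(...) column of each matching line names the process that holds the port.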

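3. In openshift-infra, the hawkular-metrics-9r5nc pod stayed in Pending. Describing the pod showed that it requested 1.5G of memory. After the memory request in rc hawkular-metrics was lowered to about 500 MB (written 500M or 500Mi in Kubernetes notation, since a lowercase m suffix means milli-units), the pod started successfully.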

[root@master ~]# oc get pods -n openshift-infra -o wide
NAME                            READY     STATUS      RESTARTS   AGE       IP            NODE                 NOMINATED NODE
hawkular-cassandra-1-k5bg2      1/1       Running     4          21h       10.130.0.42   node2.example.com    <none>
hawkular-metrics-9r5nc          1/1       Running     0          11m       10.129.0.75   node1.example.com    <none>
hawkular-metrics-schema-sd7c5   0/1       Completed   0          21h       10.130.0.3    node2.example.com    <none>
heapster-tvn6t                  1/1       Running     39         21h       10.128.0.53   master.example.com   <none>
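The change to the replication controller can also be made from the command line rather than by editing the rc by hand. A minimal sketch, assuming oc set resources is used; the function name lower_hawkular_request and the OC override are hypothetical, and 500Mi is an assumed reading of the reduced value. A changed rc template only applies to newly created pods, so the Pending pod is deleted to let the rc recreate it.

```shell
# Lower the memory request on the hawkular-metrics replication controller,
# then delete the Pending pod so the rc recreates it from the new template.
# Set OC=echo to preview the commands without touching the cluster.
# $1: new memory request (assumed value, e.g. 500Mi)
# $2: name of the Pending pod
lower_hawkular_request() {
  "${OC:-oc}" set resources rc/hawkular-metrics --requests="memory=$1" -n openshift-infra &&
  "${OC:-oc}" delete pod "$2" -n openshift-infra
}
```

Possible usage: OC=echo lower_hawkular_request 500Mi hawkular-metrics-9r5nc to preview the two commands, then the same call without OC=echo to apply them.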

With that, the metrics finally show up and there is something to screenshot.

 

  •  EFK

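Investigation showed that the main cause here is insufficient memory, so the current 16G machine cannot run it. According to the pod's startup command, a single instance needs 8G just to start, which is outrageous.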

 

