SkyWalking Distributed Tracing System: Deployment


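1. Overview

1.1 Introduction

In distributed architectures, microservices, and the Kubernetes ecosystem, tracing the request path through an application (often grouped under APM, Application Performance Management) is essential. Put simply, tracing records the full behaviour of a request at every hop, from the moment traffic reaches the front end down to the database at the very back, and presents it visually: trace lookup, dependency relationships, performance analysis, topology maps, and so on. A tracing system makes it much easier to locate problems, which is hard to achieve with conventional monitoring.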

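Commonly used tracing systems come in commercial and open-source flavours. The better-known ones (among those I have looked at) are:

  • Commercial
    • Tingyun (聽雲)
    • Bonree (博睿宏遠)
  • Open source
    • SkyWalking: China; started as an individual's open-source project and now belongs to the Apache Software Foundation; its author was recently elected as the first Chinese member of the Apache board of directors
    • Pinpoint: South Korea; open-sourced by Naver
    • Zipkin: USA; open-sourced by Twitter
    • CAT: China; open-sourced by Meituan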

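Details on each of these systems are easy to find online. The commercial products are not evaluated here.

Among the open-source options, the last two (Zipkin and CAT) are intrusive to application code. For a comparison of the first two (SkyWalking and Pinpoint), see the figure below.

Image URL: https://skywalking.apache.org/zh/2019-02-24-skywalking-pk-pinpoint/0081Kckwly1gkl4kjo1okj30in0q3gnb.jpg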

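1.2 Components

This article uses SkyWalking. As deployed here, it consists of the following components:

  • skywalking-oap-server: the backend service
  • skywalking-ui: the UI front end
  • skywalking-es-init: initializes the data in the Elasticsearch cluster
  • elasticsearch: stores SkyWalking's data and metrics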

2. Preparation

2.1 Prepare the Helm environment

Helm 3 needs only a single binary. The version I use is:

# helm version
version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}

2.2 Create a dedicated namespace

SkyWalking is deployed in its own namespace.

# kubectl create ns monitoring
namespace/monitoring created

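2.3 Create secrets

This article documents a SkyWalking deployment in an intranet environment: my local machine acts as the Helm client and has Internet access, while the Kubernetes cluster does not. All images used by SkyWalking therefore have to be served from a private registry inside the intranet.

2.3.1 Image pull secret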

# kubectl create secret docker-registry registry-pull-secret --docker-username=admin --docker-password=123456 --docker-email=admin@admin.com --docker-server=hub.ssgeek.com -n monitoring
secret/registry-pull-secret created

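2.3.2 Secret for HTTPS access

This step is optional. My cluster runs cert-manager, which issues certificates automatically. The certificate is used by the SkyWalking UI Ingress, and the corresponding parts of the chart have to be adjusted later.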

# cat certificate.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: skywalking
  namespace: monitoring
spec:
  secretName: skywalking
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  duration: 2160h
  renewBefore: 360h
  keyEncoding: pkcs1
  dnsNames:
  - skywalking.ssgeek.com
# kubectl apply -f certificate.yaml
certificate.cert-manager.io/skywalking created
# kubectl get certificate,secret -n monitoring|grep skywalking
certificate.cert-manager.io/skywalking   True    skywalking   2m50s
secret/skywalking            kubernetes.io/tls                     3      2m49s
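
The duration and renewBefore fields in the Certificate above are given in hours, which is easy to misread. As a small sketch (the helper name hours_to_days is made up for illustration and is not part of cert-manager), the values can be converted to days:

```shell
#!/bin/sh
# Convert an hour value written like "2160h" into whole days.
# hours_to_days is a hypothetical helper, not a cert-manager command.
hours_to_days() {
  hours="${1%h}"            # strip the trailing "h"
  echo $(( hours / 24 ))
}

hours_to_days 2160h   # certificate lifetime (duration)
hours_to_days 360h    # renewal lead time (renewBefore)
```

With the values used here, the certificate is valid for 90 days and renewal starts 15 days before expiry.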

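2.3.3 Secret for SkyWalking UI access control

The SkyWalking UI has no access control by default. You can add it with the Nginx Ingress basic auth approach shown below, or use the approach from my earlier article, which implements external authentication for arbitrary services with Kubernetes Ingress Nginx + OAuth2 + GitLab and no code changes.

Important: basic auth has a small pitfall here. Following the official documentation and testing it, I found that the file generated by htpasswd before creating the secret must be named auth. Otherwise, after all the subsequent steps, the UI still returns 503. This differs from configuring basic auth on a traditional Nginx. I did not dig into the source, but the ingress-nginx documentation describes this as a requirement on the secret: it must contain a key named auth, and with --from-file=auth the file name becomes that key.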

# htpasswd -c auth skywalking
New password: 
Re-type new password: 
Adding password for user skywalking
# kubectl -n monitoring create secret generic ui-auth --from-file=auth
secret/ui-auth created
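
Since the requirement appears to be about the key inside the secret rather than the file on disk, kubectl's --from-file=key=path form should allow any file name. The sketch below only prints the command for review; the function name basic_auth_secret_cmd and the file name skywalking.htpasswd are made up for illustration, and I have not verified this variant against the cluster used in this article.

```shell
#!/bin/sh
# Build the kubectl command that stores an htpasswd file under the key "auth",
# whatever the file is called on disk.
# basic_auth_secret_cmd and ./skywalking.htpasswd are hypothetical names.
basic_auth_secret_cmd() {
  secret_name="$1"
  htpasswd_file="$2"
  namespace="$3"
  echo "kubectl -n ${namespace} create secret generic ${secret_name} --from-file=auth=${htpasswd_file}"
}

# Print the command; pipe the output to sh to run it against a real cluster.
basic_auth_secret_cmd ui-auth ./skywalking.htpasswd monitoring
```

Printing instead of executing keeps the sketch safe to run on a machine without cluster access.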

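2.4 Store the images in the private registry

Push the images needed for the deployment to the internal registry. This deploys SkyWalking 8.4.0, the latest version at the time of writing. Each pair below lists the upstream image followed by its copy in the private registry.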

apache/skywalking-ui:8.4.0
hub.ssgeek.com/skywalking/skywalking-ui:8.4.0

apache/skywalking-oap-server:8.4.0-es7
hub.ssgeek.com/skywalking/skywalking-oap-server:8.4.0-es7

busybox:1.30
hub.ssgeek.com/skywalking/busybox:1.30

docker.elastic.co/elasticsearch/elasticsearch:7.5.2
hub.ssgeek.com/skywalking/elasticsearch:7.5.2
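
The mapping above can be scripted. The following is a minimal sketch, assuming docker is the client used on the machine with Internet access and that the registry login has already been done; the function names mirror_target and mirror_image and the RUN variable are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Private registry and project used in this article.
REGISTRY="hub.ssgeek.com/skywalking"

# Map an upstream image reference to its private-registry name by keeping
# only the last path segment (name:tag).
mirror_target() {
  echo "${REGISTRY}/${1##*/}"
}

# Print the docker commands for one image; set RUN=1 to execute them instead.
mirror_image() {
  src="$1"
  dst="$(mirror_target "$src")"
  for cmd in "docker pull $src" "docker tag $src $dst" "docker push $dst"; do
    if [ "${RUN:-0}" = "1" ]; then $cmd; else echo "$cmd"; fi
  done
}

for image in \
  apache/skywalking-ui:8.4.0 \
  apache/skywalking-oap-server:8.4.0-es7 \
  busybox:1.30 \
  docker.elastic.co/elasticsearch/elasticsearch:7.5.2
do
  mirror_image "$image"
done
```

Keeping only the last path segment reproduces the four target names listed above; a registry layout that preserves the upstream organisation would need a different mapping.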

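3. Get the chart, update its dependencies, and adjust the values

Fetch the latest official chart and update its dependencies. Updating the dependencies automatically downloads one sub-chart, the official Elasticsearch chart. The downloaded package does not need to be unpacked or modified; all of its parameters are set globally through the parent chart's values.yaml.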

# git clone https://github.com/apache/skywalking-kubernetes.git
# cd skywalking-kubernetes/chart
# helm dep up skywalking
Hang tight while we grab the latest from your chart repositories...
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Downloading elasticsearch from repo https://helm.elastic.co/
Deleting outdated charts

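Edit values.yaml. Only the parts I changed are listed below. Elasticsearch has many more parameters and tuning options; a minimal configuration is used here, and the official documentation covers the rest.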

...
imagePullSecrets:
  - name: registry-pull-secret

initContainer:
  image: hub.ssgeek.com/skywalking/busybox
  tag: '1.30'

oap:
  name: oap
  # When 'dynamicConfigEnabled' set to true, enable oap dynamic configuration through k8s configmap,
  # Note: The default configmap data is empty, please refer to the detailed documentation (https://github.com/apache/skywalking/blob/master/docs/en/setup/backend/dynamic-config.md)
  # Sync period in seconds. Defaults to 60 seconds. env: SW_CONFIG_CONFIGMAP_PERIOD
  dynamicConfigEnabled: false
  image:
    repository: hub.ssgeek.com/skywalking/skywalking-oap-server
    tag: 8.4.0-es7  # Must be set explicitly
    pullPolicy: IfNotPresent
  storageType: elasticsearch7 # storage type: Elasticsearch 7
...
  tolerations: []
  resources:
     limits:
       cpu: 2
       memory: 4Gi
     requests:
       cpu: 1
       memory: 1Gi
...
  env:
    # more env, please refer to https://hub.docker.com/r/apache/skywalking-oap-server
    # or https://github.com/apache/skywalking-docker/blob/master/6/6.4/oap/README.md#sw_telemetry
    SW_NAMESPACE: "skywalking" # sets the ES index prefix to skywalking_ (the trailing underscore is added automatically)
...
ui:
  name: ui
  replicas: 1
  image:
    repository: hub.ssgeek.com/skywalking/skywalking-ui
    tag: 8.4.0  # Must be set explicitly
    pullPolicy: IfNotPresent
  # podAnnotations:
  #   example: oap-foo
  nodeAffinity: {}
  nodeSelector: {}
  tolerations: []
  ingress:
    enabled: true
    annotations:
       kubernetes.io/ingress.class: nginx
       # basic auth annotations
       nginx.ingress.kubernetes.io/auth-type: basic
       nginx.ingress.kubernetes.io/auth-secret: ui-auth
       nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
    path: /
    hosts:
     - skywalking.ssgeek.com
    tls:
      - secretName: skywalking
        hosts:
          - skywalking.ssgeek.com
...
elasticsearch:
  enabled: true
  config:               # For users of an existing elasticsearch cluster,takes effect when `elasticsearch.enabled` is false
    port:
      http: 9200
#    host: elasticsearch # es service on kubernetes or host
    host: elasticsearch-logging.logging.svc
    user: "elastic"         # [optional]
    password: "elastic"     # [optional]
  clusterName: "elasticsearch"
  nodeGroup: "logging"

  # The service that non master groups will try to connect to when joining the cluster
  # This should be set to clusterName + "-" + nodeGroup for your master group
  masterService: "elasticsearch-logging"
...
  image: "hub.ssgeek.com/skywalking/elasticsearch"
  imageTag: "7.5.2"
  imagePullPolicy: "IfNotPresent"
...
  resources:
    requests:
      cpu: "100m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"
...
  volumeClaimTemplate:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: "ceph-rbd"
    resources:
      requests:
        storage: 30Gi
...
  persistence:
    enabled: true
    annotations: {}
...
  imagePullSecrets:
    - name: registry-pull-secret

4. Install SkyWalking with Helm

With the preparation done, SkyWalking can be deployed with a single Helm command.

# helm install skywalking skywalking -n monitoring --values ./skywalking/values.yaml
NAME: skywalking
LAST DEPLOYED: Thu Mar 18 18:45:03 2021
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
NOTES:
************************************************************************
*                                                                      *
*                 SkyWalking Helm Chart by SkyWalking Team             *
*                                                                      *
************************************************************************

Thank you for installing skywalking.

Your release is named skywalking.

Learn more, please visit https://skywalking.apache.org/

Get the UI URL by running these commands:
  https://skywalking.ssgeek.com/

5. Verification

Watch the pod logs until create instance_jvm_thread_peak_count index template finished appears.

2021-03-18 10:48:32,242 - org.apache.skywalking.oap.server.core.storage.model.ModelInstaller -139765 [main] INFO  [] - table: instance_jvm_thread_peak_count does not exist
2021-03-18 10:48:32,243 - org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.StorageEsInstaller -139766 [main] INFO  [] - index skywalking_instance_jvm_thread_peak_count's columnTypeEsMapping builder str: {properties={service_id={type=keyword}, count={index=false, type=long}, time_bucket={type=long}, entity_id={type=keyword}, value={type=long}, summation={index=false, type=long}}}
2021-03-18 10:48:32,614 - org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.StorageEsInstaller -140137 [main] INFO  [] - create instance_jvm_thread_peak_count index template finished, isAcknowledged: true
2021-03-18 10:48:33,319 - org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.StorageEsInstaller -140842 [main] INFO  [] - create instance_jvm_thread_peak_count-20210318 index finished, isAcknowledged: true
......
2021-03-18 10:48:33,583 - org.eclipse.jetty.server.handler.ContextHandler -141106 [main] INFO  [] - Started o.e.j.s.ServletContextHandler@12e4822b{/,null,AVAILABLE}
2021-03-18 10:48:33,597 - org.eclipse.jetty.server.AbstractConnector -141120 [main] INFO  [] - Started ServerConnector@5cc9d3d0{HTTP/1.1, (http/1.1)}{0.0.0.0:12800}
2021-03-18 10:48:33,597 - org.eclipse.jetty.server.Server -141120 [main] INFO  [] - Started @141185ms
2021-03-18 10:48:33,599 - org.apache.skywalking.oap.server.core.storage.PersistenceTimer -141122 [main] INFO  [] - persistence timer start
2021-03-18 10:48:33,603 - org.apache.skywalking.oap.server.core.cache.CacheUpdateTimer -141126 [main] INFO  [] - Cache updateServiceInventory timer start
2021-03-18 10:48:41,499 - org.apache.skywalking.oap.server.starter.OAPServerBootstrap -149022 [main] INFO  [] - OAP starts up in init mode successfully, exit now...
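
Rather than watching the log by eye, the final message can be matched with grep. This is a minimal sketch that checks for the last line of the excerpt above; the function name init_finished is made up, and the sample input is copied from that excerpt so the snippet runs without a cluster.

```shell
#!/bin/sh
# Read OAP log text on stdin and succeed once the init run's final message
# is present. The message text is taken from the log excerpt above.
# init_finished is a hypothetical helper name.
init_finished() {
  grep -q "OAP starts up in init mode successfully"
}

# Sample input copied from the excerpt; on a cluster the input would come
# from the logs of the es-init pod instead.
sample='2021-03-18 10:48:41,499 - org.apache.skywalking.oap.server.starter.OAPServerBootstrap -149022 [main] INFO  [] - OAP starts up in init mode successfully, exit now...'

if printf '%s\n' "$sample" | init_finished; then
  echo "init complete"
else
  echo "init still running"
fi
```

On a real cluster, the output of kubectl -n monitoring logs for the es-init pod would be piped into init_finished in place of the sample.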

Check the pod status

# kubectl -n monitoring get pods                         
NAME                              READY   STATUS      RESTARTS   AGE
elasticsearch-logging-0           1/1     Running     0          5m54s
elasticsearch-logging-1           1/1     Running     0          5m53s
elasticsearch-logging-2           1/1     Running     0          5m53s
skywalking-es-init-t7ndj          0/1     Completed   0          5m54s
skywalking-oap-57d7f454f5-8gbh5   1/1     Running     2          5m54s
skywalking-oap-57d7f454f5-vqh2d   1/1     Running     2          5m54s
skywalking-ui-698cdb4dbc-xxktt    1/1     Running     0          5m54s
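
The same check can be automated by filtering the table for pods that are neither ready and Running nor Completed. A minimal sketch follows; the function name unhealthy_pods is made up, and the last row of the sample input is a hypothetical failing pod added for illustration, not part of the output above.

```shell
#!/bin/sh
# Read `kubectl get pods` output on stdin and print the names of pods that
# are neither fully ready and Running nor Completed.
# unhealthy_pods is a hypothetical helper name.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    ok = ($3 == "Completed") || ($3 == "Running" && ready[1] == ready[2])
    if (!ok) print $1
  }'
}

# Sample input: two rows from the output above plus one made-up failing row.
unhealthy_pods <<'EOF'
NAME                              READY   STATUS             RESTARTS   AGE
skywalking-es-init-t7ndj          0/1     Completed          0          5m54s
skywalking-oap-57d7f454f5-8gbh5   1/1     Running            2          5m54s
example-pod-0                     0/1     ImagePullBackOff   0          5m54s
EOF
```

On a cluster, piping kubectl -n monitoring get pods into unhealthy_pods should print nothing once the deployment has settled.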

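Open the web UI. After entering the user name and password configured for basic auth, the SkyWalking main page loads successfully.

That completes the server-side deployment of SkyWalking on Kubernetes with Helm in an intranet environment. If no machine has Internet access at all, package the modified chart and upload it to a private Helm repository such as Harbor. With both the chart and the images hosted internally, the deployment needs no Internet access.

In follow-up posts I will keep experimenting and share how to connect the collection side (the agents) and how to use SkyWalking in practice. Feel free to nudge me for updates ☺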
