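1. Materials
Offline installation package: spark-operator-install.zip
Cloud drive: https://cloud.189.cn/t/6FJjiuFZFviy (access code: n1ct)
2. Environment Plan
This deployment depends on a K8s 1.21.2 cluster; for the environment setup, see "Offline Setup of a K8s 1.21.2 Cluster".
K8s version: 1.21.2
Spark version: 3.0.0
Spark-operator version: latest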
3. Deployment
# Unzip spark-operator-install.zip and enter the extracted folder
# jmx_prometheus_javaagent-0.11.0.jar and spark-3.0.0-gcs-prometheus.tar can be ignored; they are preparation for monitoring later and are not used in this article
# Load the spark-operator image (it must be loaded on every node)
[root@k8s-master spark-operator-install]# docker load < spark-operator-latest.tar
# Load the Spark 3.0.0 image (it must be loaded on every node)
[root@k8s-master spark-operator-install]# docker load < spark-base-t1-1129.tar
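Because both images have to be present on every node, the two docker load commands above can be repeated over ssh. The following is only a sketch: the hostnames k8s-node1 and k8s-node2 are hypothetical, and it assumes passwordless root ssh from the master and that it is run from the folder containing the two .tar files. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: load both image archives on every worker node over ssh.
# NODES is hypothetical; replace it with your own hostnames.
NODES="${NODES:-k8s-node1 k8s-node2}"
IMAGES="spark-operator-latest.tar spark-base-t1-1129.tar"
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

load_images_on_nodes() {
  local node tar
  for node in $NODES; do
    for tar in $IMAGES; do
      if [ "$DRY_RUN" = "1" ]; then
        echo "ssh root@${node} docker load < ${tar}"
      else
        # The local archive is streamed to `docker load` on the remote node.
        ssh "root@${node}" docker load < "${tar}"
      fi
    done
  done
}

load_images_on_nodes
```

After checking the printed commands, rerun with DRY_RUN=0 to actually load the images.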
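# Unzip spark-on-k8s-operator-master.zip and enter the extracted directory
[root@k8s-master spark-operator-install]# unzip spark-on-k8s-operator-master.zip
[root@k8s-master spark-operator-install]# cd spark-on-k8s-operator-master
# Check the manifest directory structure; it is needed later
# Edit manifest/spark-operator-install/spark-operator.yaml. The main changes are the image and imagePullPolicy properties: set image to your own spark-operator image, and set imagePullPolicy so that only the local image is used
[root@k8s-master spark-on-k8s-operator-master]# vim manifest/spark-operator-install/spark-operator.yaml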
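If you prefer not to edit the file by hand, the same two changes can be scripted. This is a minimal sketch, assuming GNU sed and that the manifest contains a single image: line and a single imagePullPolicy: line. The function name set_local_image is made up for illustration, and the image tag in the usage comment is a placeholder: use whatever `docker images` reports after loading spark-operator-latest.tar.

```shell
#!/usr/bin/env bash
# Sketch: point a manifest at a locally loaded image.
# Rewrites the value of `image:` and forces `imagePullPolicy: Never`,
# keeping the original indentation of both lines.
set_local_image() {
  local file="$1" image="$2"
  sed -i -E \
    -e "s|^([[:space:]]*(- )?image:).*|\1 ${image}|" \
    -e "s|^([[:space:]]*imagePullPolicy:).*|\1 Never|" \
    "$file"
}

# Usage (the image name below is a placeholder, not taken from the package):
#   docker images | grep spark-operator
#   set_local_image manifest/spark-operator-install/spark-operator.yaml <your-image:tag>
```

Never tells the kubelet not to pull at all, which is why the image has to be loaded on every node beforehand.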
# Run the following commands in order
[root@k8s-master spark-on-k8s-operator-master]# kubectl apply -k manifest/crds/
[root@k8s-master spark-on-k8s-operator-master]# kubectl apply -k manifest/spark-application-rbac/
[root@k8s-master spark-on-k8s-operator-master]# kubectl apply -k manifest/spark-operator-install/
# To roll back, run: kubectl delete -k manifest/spark-operator-install/
# Check whether the container is running
[root@k8s-master spark-on-k8s-operator-master]# kubectl get pods -n spark-operator
# A second way to check: in the web dashboard, select the spark-operator namespace, open Pods, and see whether a new container has appeared; then view the container log. If it shows no errors, the deployment succeeded
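The same check can be done from the command line instead of the dashboard. The sketch below assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE); the helper name all_running is made up for illustration, and the pod name in the usage comment is a placeholder to fill in from the pod list.

```shell
#!/usr/bin/env bash
# Sketch: reads `kubectl get pods --no-headers` output on stdin and
# succeeds only if at least one pod is listed and every pod's STATUS
# column (third field) is Running.
all_running() {
  awk '$3 != "Running" { bad = 1 } END { exit (NR == 0 || bad) }'
}

# Usage on the master node:
#   kubectl get pods -n spark-operator --no-headers | all_running \
#     && echo "operator is running"
#   kubectl logs -n spark-operator <operator-pod-name>
```

A status such as ErrImageNeverPull usually means the image was not loaded on the node where the pod was scheduled.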
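4. Running the Example
# Change the container image in examples/spark-pi.yaml to the Spark image loaded earlier
# As shown below, the main changes are spec.image and imagePullPolicy
# Pay attention to how namespace and serviceAccount correspond; if the run fails, the most likely cause is a permission problem between these two
# Here namespace is spark-operator and serviceAccountName is sparkoperator; both can be checked in manifest/spark-operator-install/spark-operator.yaml
[root@k8s-master spark-on-k8s-operator-master]# vim examples/spark-pi.yaml
# Full contents:
# ======↓↓↓↓↓↓======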
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
  image: "spark-base-t1:v3.0.0"
  imagePullPolicy: Never
  mainClass: org.apache.spark.examples.JavaSparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar"
  sparkVersion: "3.0.0"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: sparkoperator
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
# ======↑↑↑↑↑↑======
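Since a mismatch between namespace and serviceAccount is the most likely cause of failure, it can help to check the permissions before submitting. The sketch below uses `kubectl auth can-i` with impersonation, so it assumes the user running it is allowed to impersonate service accounts (a cluster admin normally is). The function names are made up for illustration, and the resource list (pods, services, configmaps) reflects what a Spark driver generally needs on Kubernetes rather than anything specific to this package.

```shell
#!/usr/bin/env bash
NAMESPACE="spark-operator"
SERVICE_ACCOUNT="sparkoperator"

# Build the user name Kubernetes assigns to a service account.
sa_subject() {
  echo "system:serviceaccount:$1:$2"
}

# Ask the API server whether the service account may create the
# resources a Spark driver needs; prints yes/no per resource.
check_driver_permissions() {
  local resource
  for resource in pods services configmaps; do
    printf '%s: ' "$resource"
    kubectl auth can-i create "$resource" \
      -n "$NAMESPACE" --as="$(sa_subject "$NAMESPACE" "$SERVICE_ACCOUNT")"
  done
}

# Usage on the master node:
#   kubectl get serviceaccount "$SERVICE_ACCOUNT" -n "$NAMESPACE"
#   check_driver_permissions
```

If any line prints no, the driver pod is likely to fail with a forbidden error when it tries to create executors.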
# Then create the job with kubectl
[root@k8s-master spark-on-k8s-operator-master]# kubectl apply -f examples/spark-pi.yaml
# Check the running containers
[root@k8s-master spark-on-k8s-operator-master]# kubectl get pods -n spark-operator
# You should see spark-pi-driver with status Completed, which means the run has finished
# Alternatively, check in the browser
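To confirm that the job actually computed something, the result can be read from the driver log. This sketch relies on the JavaSparkPi example printing a line that starts with "Pi is roughly"; the helper name extract_pi is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pull the result line out of a driver log supplied on stdin.
extract_pi() {
  grep -o 'Pi is roughly [0-9.]*'
}

# Usage on the master node:
#   kubectl logs spark-pi-driver -n spark-operator | extract_pi
#   kubectl get sparkapplication spark-pi -n spark-operator
```

The second command shows the state of the SparkApplication resource itself as tracked by the operator.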
# This completes the deployment. A tutorial on Spark metrics monitoring will follow when time permits