When Spark reads external data (Hive, HBase, text files, etc.) into a DataFrame, we usually map over the RDD and get each field of every Row. If a field's original value is null and we convert it to a String directly without checking, the job throws a java.lang.NullPointerException.
Example code:
val data = spark.sql(sql)
val rdd = data.rdd.map(record => {
  val recordSize = record.size
  for (i <- 0 until recordSize) {
    val str = record.get(i).toString  // throws NullPointerException when the field is null
    // do something with str...
  }
})
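The failure mode can be reproduced without a Spark cluster. The sketch below is a minimal, hypothetical stand-in: an `Array[Any]` plays the role of a Row's field values, and `blowsUp` shows that calling `toString` on a null field is exactly what raises the exception.

```scala
object NullDemo {
  // Returns true when calling toString on the field throws a NullPointerException,
  // mirroring what record.get(i).toString does on a null column value.
  def blowsUp(field: Any): Boolean =
    try { field.toString; false }
    catch { case _: NullPointerException => true }

  def main(args: Array[String]): Unit = {
    val fields: Array[Any] = Array("a", null, "c")  // hypothetical row values
    println(fields.map(blowsUp).mkString(","))      // only the null field blows up
  }
}
```

Running it shows that only the null slot triggers the exception; the non-null fields convert cleanly.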
To avoid this, add a null check before the conversion, as shown below:
val data = spark.sql(sql)
val rdd = data.rdd.map(record => {
  val recordSize = record.size
  for (i <- 0 until recordSize) {
    if (!record.isNullAt(i) && !record.get(i).toString.isEmpty) {
      val str = record.get(i).toString
      // do something with str...
    }
  }
})
record.isNullAt(i) checks whether the i-th field is null. Only when it is not null do we call toString and isEmpty to also filter out empty strings; because && short-circuits, toString is never invoked on a null value.
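The same null-then-empty filtering can also be expressed with Scala's Option, which some teams prefer to explicit if checks. This is a sketch under the same assumption as before (an `Array[Any]` standing in for a Row); the `safeString` helper is hypothetical, not a Spark API.

```scala
object SafeGetDemo {
  // Hypothetical helper: Option(...) absorbs null (like record.isNullAt),
  // filter(_.nonEmpty) drops empty strings (like the isEmpty check).
  def safeString(fields: Array[Any], i: Int): Option[String] =
    Option(fields(i)).map(_.toString).filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    val fields: Array[Any] = Array("a", null, "", 42)
    // Keep only fields that are neither null nor empty
    println(fields.indices.flatMap(i => safeString(fields, i)).mkString(","))  // a,42
  }
}
```

Null and empty fields simply become None and disappear from the result, so no NullPointerException can occur.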