1. Problem
The same SQL runs fine in Hive, but fails on Spark 3.0 with the error shown in the figure.

Spark 3.0 configuration: looking at the source code, a new setting was added:
val STORE_ASSIGNMENT_POLICY =
  buildConf("spark.sql.storeAssignmentPolicy")
    .doc("When inserting a value into a column with different data type, Spark will perform " +
      "type coercion. Currently, we support 3 policies for the type coercion rules: ANSI, " +
      "legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. " +
      "In practice, the behavior is mostly the same as PostgreSQL. " +
      "It disallows certain unreasonable type conversions such as converting " +
      "`string` to `int` or `double` to `boolean`. " +
      "With legacy policy, Spark allows the type coercion as long as it is a valid `Cast`, " +
      "which is very loose. e.g. converting `string` to `int` or `double` to `boolean` is " +
      "allowed. It is also the only behavior in Spark 2.x and it is compatible with Hive. " +
      "With strict policy, Spark doesn't allow any possible precision loss or data truncation " +
      "in type coercion, e.g. converting `double` to `int` or `decimal` to `double` is " +
      "not allowed."
    )
    .stringConf
    .transform(_.toUpperCase(Locale.ROOT))
    .checkValues(StoreAssignmentPolicy.values.map(_.toString))
    .createWithDefault(StoreAssignmentPolicy.ANSI.toString)
The setting accepts three values:
object StoreAssignmentPolicy extends Enumeration {
  val ANSI, LEGACY, STRICT = Value
}
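With the ANSI policy, Spark performs type coercion according to ANSI SQL. In practice the behavior is mostly the same as PostgreSQL.
It disallows certain unreasonable type conversions, such as `string` to `int` or `double` to `boolean`. As the source above shows, ANSI is the default in Spark 3.0.
With the LEGACY policy, Spark allows the type coercion as long as it is a valid `Cast`. This is the only behavior in Spark 2.x, and it is compatible with Hive.
With the STRICT policy, Spark does not allow any possible precision loss or data truncation, e.g. `double` to `int` or `decimal` to `double`.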
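To make the difference between the policies concrete, here is a minimal sketch, not the statement from the failing job. It assumes a local Spark 3.x session and uses a hypothetical table `demo_t` with a single `INT` column. The warehouse directory is a temporary folder chosen only for this illustration. The same insert of a string literal is attempted under ANSI and then under LEGACY.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{AnalysisException, SparkSession}

object StoreAssignmentDemo {
  def main(args: Array[String]): Unit = {
    // Temporary warehouse directory so repeated runs do not collide (assumption for the demo).
    val warehouse = Files.createTempDirectory("spark-warehouse").toString

    val spark = SparkSession.builder()
      .appName("store-assignment-demo")
      .master("local[1]")
      .config("spark.sql.warehouse.dir", warehouse)
      .getOrCreate()

    // Hypothetical table, used only to illustrate the policy.
    spark.sql("CREATE TABLE demo_t (id INT) USING parquet")

    // ANSI (the Spark 3.0 default): writing a string into an INT column is rejected at analysis time.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")
    try {
      spark.sql("INSERT INTO demo_t VALUES ('1')")
    } catch {
      case e: AnalysisException =>
        println(s"ANSI rejected the insert: ${e.getMessage}")
    }

    // LEGACY (the Spark 2.x / Hive-compatible behavior): the value is cast and the insert goes through.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "LEGACY")
    spark.sql("INSERT INTO demo_t VALUES ('1')")
    spark.sql("SELECT * FROM demo_t").show()

    spark.stop()
  }
}
```

If the failing job inserts into a column whose type differs from the selected expression, this is most likely the check being hit, although the exact error text depends on the columns involved.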
So we add the following configuration to restore the Hive-compatible behavior:
spark.sql.storeAssignmentPolicy=LEGACY
After that, the job runs normally.
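The post does not say where the setting was applied, so the following is only a sketch of two common places to put it in a Scala job: on the session builder, or as a SQL `SET` statement before the insert. The application name and the local master are placeholders, not values from the original job.

```scala
import org.apache.spark.sql.SparkSession

object LegacyStoreAssignment {
  def main(args: Array[String]): Unit = {
    // Option 1: set the policy when the session is built.
    val spark = SparkSession.builder()
      .appName("legacy-store-assignment") // placeholder name
      .master("local[1]")                 // placeholder master for a local run
      .config("spark.sql.storeAssignmentPolicy", "LEGACY")
      .getOrCreate()

    // Print the policy currently in effect for this session.
    println(spark.conf.get("spark.sql.storeAssignmentPolicy"))

    // Option 2: set it from SQL, e.g. at the top of a script before the INSERT statements.
    spark.sql("SET spark.sql.storeAssignmentPolicy=LEGACY")

    spark.stop()
  }
}
```

When the job is launched with spark-submit or spark-sql, the same key can instead be passed as `--conf spark.sql.storeAssignmentPolicy=LEGACY`. Note that LEGACY silently casts values, so lossy conversions that ANSI would have flagged will no longer raise an error.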
