1. NULL values when creating a table
Suppose we create a table without saying anything about NULL handling, for example:
CREATE TABLE test.table1( id String, name String ) ENGINE = MergeTree PARTITION BY id ORDER BY id SETTINGS index_granularity = 8192
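In this case, loading data that contains NULLs into the table fails with an error:
DB::Exception: Expression returns value NULL, that is out of range of type String, at: null)
To prevent this, we usually declare the columns that may hold NULLs as Nullable. Note that the primary key (the ORDER BY column, id here) must not contain NULLs: by default, wrapping it in Nullable as well raises an error. The table is then created like this: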
CREATE TABLE test.table1( id String, name Nullable(String) ) ENGINE = MergeTree PARTITION BY id ORDER BY id SETTINGS index_granularity = 8192
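As a quick check, the following sketch inserts one row with a NULL name into the Nullable version of the table and reads it back. The sample values are made up for illustration; isNull and ifNull are standard ClickHouse functions.

```sql
-- Assumes the Nullable version of test.table1 defined above already exists.
-- The two rows are illustrative sample data, not taken from the original post.
INSERT INTO test.table1 (id, name) VALUES ('1', 'alice'), ('2', NULL);

SELECT
    id,
    name,
    isNull(name)     AS name_is_null,   -- 1 for the NULL row, 0 otherwise
    ifNull(name, '') AS name_or_empty   -- replaces NULL with an empty string
FROM test.table1
ORDER BY id;
```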
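2. NULL values when querying
With table creation covered, here is a concrete query case. The table already exists and holds data, and one String column contains both NULL and the empty string ''. Careless syntax raises an error here. In particular, coalesce cannot be given a numeric default for a String column, as in: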
SELECT COALESCE ( paymentterm, 0 ) AS paymentterm_a, count( DISTINCT orderid ) AS ornum FROM ckdb.test WHERE d = '2020-05-08' GROUP BY paymentterm_a
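The query fails with the error below. The cause is that paymentterm is a String while the default 0 is an integer, so ClickHouse cannot find a common type for the two: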
Code: 386, e.displayText() = DB::Exception: There is no supertype for types String, UInt8 because some of them are String/FixedString and some of them are not (version 19.17.6.36 (official build))
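A small point worth knowing here:
GROUP BY accepts either the full expression from the SELECT list or the alias given with AS. The queries below rewrite the one above, first with CASE WHEN and then with coalesce and a String default: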
-- Method 1: CASE WHEN, grouping by the alias
select case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end as paymentterm, count(distinct orderid) as ornum
from ckdb.test
where d = '2020-05-08'
group by paymentterm

-- Method 2: CASE WHEN, repeating the full expression in GROUP BY
select case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end as paymentterm, count(distinct orderid) as ornum
from ckdb.test
where d = '2020-05-08'
group by case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end

-- Method 3: coalesce with a String default (replaces NULL only; '' remains a separate group)
select coalesce(paymentterm,'null') as paymentterm, count(distinct orderid) as ornum
from ckdb.test
where d = '2020-05-08'
group by coalesce(paymentterm,'null')
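The coalesce variant only replaces NULL, so rows holding '' still form their own group. A minimal sketch of one way to fold both cases into the same bucket with coalesce is to turn '' into NULL first using nullIf. The alias paymentterm_a is reused from the failing query above; the table and column names are those of the original example.

```sql
-- nullIf(x, '') yields NULL when x = '', otherwise x;
-- coalesce then maps every NULL to the String 'null'.
SELECT
    coalesce(nullIf(paymentterm, ''), 'null') AS paymentterm_a,
    count(DISTINCT orderid) AS ornum
FROM ckdb.test
WHERE d = '2020-05-08'
GROUP BY paymentterm_a;
```

Both arguments of coalesce are Strings here, so the "no supertype" error does not occur.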
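3. NULL values in joins
Consider this scenario: table a is joined to table b, and the ids present in both must be dropped, keeping only the rows of a that have no match in b. In Hive we would usually write: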
select a.* from a left join b on a.id = b.id where b.id is null
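This approach does not work in ClickHouse (CK), however: by default, non-joined rows are filled with the default value of the column type rather than with NULL. A String column gets the empty string and a numeric column gets 0, so for a non-Nullable b.id the condition b.id is null never matches. Another approach is needed.
1) Using coalesce (this assumes id is numeric and that 0 is never a real id)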
select a.* from a left join b on a.id = b.id where coalesce(b.id,0) = 0
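The check against 0 only fits numeric keys. The sketch below shows two variations that are not from the original post: the same default-value check for a String id, and a NOT IN form that avoids relying on a sentinel value at all. Tables a and b are the placeholder names used above.

```sql
-- Variant for String keys: non-joined rows carry '' instead of 0.
-- Assumes '' is never a real id and join_use_nulls is left at its default (0).
SELECT a.*
FROM a
LEFT JOIN b ON a.id = b.id
WHERE b.id = '';

-- Sentinel-free alternative: keep the rows of a whose id does not occur in b.
SELECT *
FROM a
WHERE id NOT IN (SELECT id FROM b);
```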
2) Using the join_use_nulls setting
Change the setting by appending settings join_use_nulls = 1 to the end of the SQL statement, so that non-joined rows are filled with NULL:
select * from st_center.test_join_1 as t1 all left join st_center.test_join_2 as t2 on t1.id = t2.id settings join_use_nulls = 1
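With the setting in place, the Hive-style filter from the start of this section behaves as intended. A minimal sketch, using the placeholder tables a and b from above:

```sql
-- join_use_nulls = 1 makes b.id Nullable in the join result,
-- so rows of a without a match in b have b.id = NULL and pass the filter.
SELECT a.*
FROM a
LEFT JOIN b ON a.id = b.id
WHERE b.id IS NULL
SETTINGS join_use_nulls = 1;
```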
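Note: with some JDBC-based ways of calling ClickHouse, the SETTINGS clause in the query is not applied. In that case the setting can be placed in users.xml: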
<join_use_nulls>1</join_use_nulls>
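Once this is set, the corresponding system table reflects the new value, and later queries no longer need to specify it. (The setting can be placed in a specific profile; make sure it is the profile that the account in use belongs to.)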
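For orientation, here is a sketch of where the element typically sits in users.xml. It assumes the account uses the profile named default and that the root element is clickhouse (older releases use yandex as the root); adjust both to match the actual installation.

```xml
<!-- users.xml fragment; the profile name "default" is an assumption -->
<clickhouse>
    <profiles>
        <default>
            <!-- fill non-joined rows of outer JOINs with NULL instead of type defaults -->
            <join_use_nulls>1</join_use_nulls>
        </default>
    </profiles>
</clickhouse>
```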
The effective value can then be checked in system.settings:
SELECT * FROM system.settings WHERE name = 'join_use_nulls'

Query id: b5f9d42e-6e02-47ed-b261-fb04562ba18b

┌─name───────────┬─value─┬─changed─┬─description─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─min──┬─max──┬─readonly─┬─type─┐
│ join_use_nulls │ 1     │       1 │ Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type. │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │        0 │ Bool │
└────────────────┴───────┴─────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────┴──────┴──────────┴──────┘

1 rows in set. Elapsed: 0.003 sec.