Three common NULL problems in ClickHouse and how to solve them


1. NULLs at table creation time
Suppose we create a table without saying anything about NULLs, for example:

CREATE TABLE test.table1(
  id String,
  name String
) ENGINE = MergeTree PARTITION BY id ORDER BY id SETTINGS index_granularity = 8192


With this schema, loading data that contains NULLs into the table fails with an error:

DB::Exception: Expression returns value NULL, that is out of range of type String, at: null)
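To prevent this, columns that may hold NULLs are usually declared as Nullable.

Note: the key column (id here) must not contain NULLs. By default, wrapping it in Nullable as well makes the CREATE TABLE statement fail. The table is therefore usually created like this: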

CREATE TABLE test.table1(
  id String,
  name Nullable(String)
) ENGINE = MergeTree PARTITION BY id ORDER BY id SETTINGS index_granularity = 8192
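The effect of the Nullable column can be checked with a small sketch like the one below. The table name test.table1_demo and the sample rows are made up for illustration, and the test database is assumed to exist; isNull and ifNull are standard ClickHouse functions.

```sql
-- Hypothetical demo table: same shape as test.table1, without partitioning
CREATE TABLE test.table1_demo
(
    id String,
    name Nullable(String)
) ENGINE = MergeTree ORDER BY id;

-- The NULL in the second row is accepted because name is Nullable(String)
INSERT INTO test.table1_demo VALUES ('1', 'alice'), ('2', NULL);

-- isNull flags the missing value; ifNull replaces it with a fallback
SELECT
    id,
    name,
    isNull(name) AS name_is_null,
    ifNull(name, '') AS name_or_empty
FROM test.table1_demo
ORDER BY id;
```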

 

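2. NULLs at query time
The previous section covered table creation. Now consider a table that already exists and holds data, where one column contains both NULL and the empty string ''. Careless syntax produces errors here. In particular, coalesce cannot be called with a fallback whose type differs from the column's type, as in this query: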

SELECT
    COALESCE(paymentterm, 0) AS paymentterm_a,
    count(DISTINCT orderid) AS ornum
FROM
    ckdb.test
WHERE
    d = '2020-05-08'
GROUP BY
    paymentterm_a

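The query fails with the error below. The cause is that paymentterm is a String while the fallback 0 is a UInt8, and coalesce cannot find a common type for the two: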

Code: 386, e.displayText() = DB::Exception: There is no supertype for types String, UInt8 because some of them are String/FixedString and some of them are not (version 19.17.6.36 (official build))
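The type rule can be reproduced without any table. The sketch below uses a literal NULL cast to Nullable(String) as a stand-in for the paymentterm column; that stand-in is an assumption made for illustration.

```sql
-- Fails with the "no supertype" error: String and UInt8 cannot be unified
-- SELECT coalesce(CAST(NULL AS Nullable(String)), 0);

-- Works: the fallback is a String, like the column itself
SELECT
    coalesce(CAST(NULL AS Nullable(String)), '0') AS v,
    toTypeName(v) AS t;
```

If a numeric result is really wanted, the column would have to be converted first, and that conversion needs its own handling of '' and non-numeric strings.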
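A small related point:

GROUP BY accepts either the full expression from the SELECT list or the alias defined with AS. The queries below rewrite the example above, using CASE WHEN in the first two options and coalesce with a String fallback in the third: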

-- Option 1: group by the alias
select case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end as paymentterm,
       count(distinct orderid) as ornum
  from ckdb.test
 where d = '2020-05-08'
 group by paymentterm
 
-- Option 2: group by the full expression
select case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end as paymentterm,
       count(distinct orderid) as ornum
  from ckdb.test
 where d = '2020-05-08'
 group by case when paymentterm is null or paymentterm = '' then 'null' else paymentterm end
 
-- Option 3: coalesce with a String fallback (replaces NULL only; '' stays a separate group)
select coalesce(paymentterm,'null') as paymentterm,
       count(distinct orderid) as ornum
  from ckdb.test
 where d = '2020-05-08'
 group by coalesce(paymentterm,'null')
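Since coalesce only replaces NULL, a column that mixes NULL and '' still yields two groups with the coalesce form. One possible way to merge them, sketched here under the assumption that paymentterm is Nullable(String), is to turn '' into NULL with nullIf first. The alias paymentterm_a is chosen only to avoid shadowing the column name.

```sql
-- nullIf(x, '') returns NULL when x equals '', so both kinds of
-- "empty" value fall through to the 'null' fallback of coalesce
SELECT
    coalesce(nullIf(paymentterm, ''), 'null') AS paymentterm_a,
    count(DISTINCT orderid) AS ornum
FROM ckdb.test
WHERE d = '2020-05-08'
GROUP BY paymentterm_a;
```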
 

 

3. NULLs in joins
Consider joining table a to table b and keeping only the rows of a whose id does not appear in b. In Hive this is usually written as:

select a.*
from a
left join b
on a.id = b.id
where b.id is null

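In ClickHouse this does not work by default. Columns of unmatched rows are filled with the type's default value instead of NULL: the empty string for String and 0 for numeric types. The condition b.id is null therefore never matches, and another approach is needed.

1) Use coalesce and compare against the fill value. This assumes a numeric id for which 0 is never a real value; for a String id the comparison would be against '' instead.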

select a.*
from a
left join b
on a.id = b.id
where coalesce(b.id,0) = 0
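When the fill value could collide with a real id, the same result can be obtained without relying on fill values at all. The two variants below are alternatives not taken from the original text; they reuse the tables a and b from above, assume id is not Nullable, and the ANTI JOIN form is only available in newer ClickHouse versions.

```sql
-- Variant A: keep rows of a whose id is not present in b
SELECT a.*
FROM a
WHERE a.id NOT IN (SELECT id FROM b);

-- Variant B: anti join (availability depends on the ClickHouse version)
SELECT a.*
FROM a
LEFT ANTI JOIN b ON a.id = b.id;
```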

2) Use the join_use_nulls setting

Append settings join_use_nulls = 1 to the end of the query, so that unmatched rows are filled with NULL instead of default values:

select * from st_center.test_join_1  as t1
all left join st_center.test_join_2  as t2
on t1.id = t2.id
settings join_use_nulls = 1
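With the setting enabled, the Hive-style filter from the start of this section works again. A minimal sketch, reusing the two tables from the query above:

```sql
-- Unmatched rows now carry NULL in t2.id, so IS NULL selects
-- exactly the rows of test_join_1 that have no partner in test_join_2
SELECT t1.*
FROM st_center.test_join_1 AS t1
LEFT JOIN st_center.test_join_2 AS t2
ON t1.id = t2.id
WHERE t2.id IS NULL
SETTINGS join_use_nulls = 1;
```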

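Note: if queries are issued through JDBC in a way that prevents the SETTINGS clause from being applied, the setting can be put in users.xml instead: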

<join_use_nulls>1</join_use_nulls>

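Once this is set, the corresponding system table reflects the new value, and later queries no longer need to specify the setting. (It can be placed in a specific profile; make sure it is the profile that the querying account belongs to.)

For reference, this is how the setting appears in system.settings after the change:
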

SELECT *
FROM system.settings
WHERE name = 'join_use_nulls'

Query id: b5f9d42e-6e02-47ed-b261-fb04562ba18b

┌─name───────────┬─value─┬─changed─┬─description─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─min──┬─max──┬─readonly─┬─type─┐
│ join_use_nulls │ 1     │       1 │ Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type. │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │        0 │ Bool │
└────────────────┴───────┴─────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────┴──────┴──────────┴──────┘

1 rows in set. Elapsed: 0.003 sec.

 

