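Recently I came across a post on the Percona blog, "Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark", in which ClickHouse outperforms both MariaDB ColumnStore and Spark by a wide margin. That got me very interested in ClickHouse, so I decided to start learning it. There is no Chinese documentation for ClickHouse yet, which makes it a bit of a struggle. Another Percona post that introduces ClickHouse is also worth reading: "ClickHouse: New Open Source Columnar Database". There is also a Chinese-language article worth a look: 彪悍開源的分析數據庫-ClickHouse.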
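So what exactly is ClickHouse?
1. An open-source column-oriented database management system
2. Linearly scalable
3. Simple and convenient to use
4. Highly reliable
5. Fault tolerant (it supports asynchronous multi-master replication and can be deployed across multiple data centers; downtime of a single node or of an entire data center does not affect the system's read or write availability)
Key ClickHouse features and use cases:
See the official documentation for more detail.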
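At the moment ClickHouse supports Ubuntu well and CentOS less so. Ubuntu has deb packages that can be installed directly, while on CentOS you have to compile it yourself. I struggled with the build for a long time without success and eventually gave up. Then I happened to see someone in the Google group mention RPM packages: somebody had set up a yum repository, so it can be installed directly, which saved me. If you want to compile it yourself, see https://github.com/yandex/ClickHouse/blob/master/doc/build.md. The yum installation goes as follows.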
1. Add the yum repository:
yum-config-manager --add-repo http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo
2. Install:
yum install clickhouse-server clickhouse-client clickhouse-server-common clickhouse-compressor
3. Add the clickhouse user:
useradd clickhouse
4. Start clickhouse:
/etc/init.d/clickhouse-server start
5. Log in and test:
[root@db_server_yayun_01 ~]# clickhouse-client
ClickHouse client version 1.1.54198.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.54198.

:) select 1

SELECT 1

┌─1─┐
│ 1 │
└───┘
→ Progress: 1.00 rows, 1.00 B (64.77 rows/s., 64.77 B/s.)
1 rows in set. Elapsed: 0.016 sec.

:) select now();

SELECT now()

┌───────────────now()─┐
│ 2017-03-31 15:14:18 │
└─────────────────────┘
↘ Progress: 1.00 rows, 1.00 B (216.22 rows/s., 216.22 B/s.)
1 rows in set. Elapsed: 0.005 sec.

:)
If the server fails to start, check the logs. The default log directory is
/var/log/clickhouse-server
[root@db_server_yayun_01 clickhouse-server]# ll
total 16
-rw-rw-rw-. 1 clickhouse clickhouse  383 Mar 31 13:33 clickhouse-server.err.log
-rw-rw-rw-. 1 clickhouse clickhouse 7733 Mar 31 15:14 clickhouse-server.log
-rw-rw-rw-. 1 clickhouse clickhouse  138 Mar 31 13:33 stderr
-rw-rw-rw-. 1 clickhouse clickhouse    0 Mar 31 13:33 stdout
[root@db_server_yayun_01 clickhouse-server]#
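As a rough sketch of that check, the snippet below prints the tail of the error log in the directory above. The function name show_startup_errors and the LOG_DIR variable are my own, not part of ClickHouse. The file name clickhouse-server.err.log comes from the listing.

```shell
#!/bin/sh
# Default log directory from the text; override LOG_DIR for other installs.
LOG_DIR="${LOG_DIR:-/var/log/clickhouse-server}"

# Hypothetical helper: print the last N lines (default 20) of the error log
# in directory $1, or a notice when the file is missing or unreadable.
show_startup_errors() {
    err_log="$1/clickhouse-server.err.log"
    if [ -r "$err_log" ]; then
        tail -n "${2:-20}" "$err_log"
    else
        echo "no readable error log at $err_log"
    fi
}

show_startup_errors "$LOG_DIR" 20
```

If the error log is empty, clickhouse-server.log and stderr in the same directory are the next places I would look.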
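Now for some basic usage of clickhouse-client.
Interactive mode:
clickhouse-client
clickhouse-client --host=... --port=... --user=... --password=...
Enable multi-line queries:
clickhouse-client -m
clickhouse-client --multiline
Multi-line mode has to be enabled when a statement such as CREATE TABLE spans several lines; otherwise the client sends the query as soon as you press Enter and it fails with an error. For example, to create the following table: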

CREATE TABLE `ontime` (
    `Year` UInt16,
    `Quarter` UInt8,
    `Month` UInt8,
    `DayofMonth` UInt8,
    `DayOfWeek` UInt8,
    `FlightDate` Date,
    `UniqueCarrier` FixedString(7),
    `AirlineID` Int32,
    `Carrier` FixedString(2),
    `TailNum` String,
    `FlightNum` String,
    `OriginAirportID` Int32,
    `OriginAirportSeqID` Int32,
    `OriginCityMarketID` Int32,
    `Origin` FixedString(5),
    `OriginCityName` String,
    `OriginState` FixedString(2),
    `OriginStateFips` String,
    `OriginStateName` String,
    `OriginWac` Int32,
    `DestAirportID` Int32,
    `DestAirportSeqID` Int32,
    `DestCityMarketID` Int32,
    `Dest` FixedString(5),
    `DestCityName` String,
    `DestState` FixedString(2),
    `DestStateFips` String,
    `DestStateName` String,
    `DestWac` Int32,
    `CRSDepTime` Int32,
    `DepTime` Int32,
    `DepDelay` Int32,
    `DepDelayMinutes` Int32,
    `DepDel15` Int32,
    `DepartureDelayGroups` String,
    `DepTimeBlk` String,
    `TaxiOut` Int32,
    `WheelsOff` Int32,
    `WheelsOn` Int32,
    `TaxiIn` Int32,
    `CRSArrTime` Int32,
    `ArrTime` Int32,
    `ArrDelay` Int32,
    `ArrDelayMinutes` Int32,
    `ArrDel15` Int32,
    `ArrivalDelayGroups` Int32,
    `ArrTimeBlk` String,
    `Cancelled` UInt8,
    `CancellationCode` FixedString(1),
    `Diverted` UInt8,
    `CRSElapsedTime` Int32,
    `ActualElapsedTime` Int32,
    `AirTime` Int32,
    `Flights` Int32,
    `Distance` Int32,
    `DistanceGroup` UInt8,
    `CarrierDelay` Int32,
    `WeatherDelay` Int32,
    `NASDelay` Int32,
    `SecurityDelay` Int32,
    `LateAircraftDelay` Int32,
    `FirstDepTime` String,
    `TotalAddGTime` String,
    `LongestAddGTime` String,
    `DivAirportLandings` String,
    `DivReachedDest` String,
    `DivActualElapsedTime` String,
    `DivArrDelay` String,
    `DivDistance` String,
    `Div1Airport` String,
    `Div1AirportID` Int32,
    `Div1AirportSeqID` Int32,
    `Div1WheelsOn` String,
    `Div1TotalGTime` String,
    `Div1LongestGTime` String,
    `Div1WheelsOff` String,
    `Div1TailNum` String,
    `Div2Airport` String,
    `Div2AirportID` Int32,
    `Div2AirportSeqID` Int32,
    `Div2WheelsOn` String,
    `Div2TotalGTime` String,
    `Div2LongestGTime` String,
    `Div2WheelsOff` String,
    `Div2TailNum` String,
    `Div3Airport` String,
    `Div3AirportID` Int32,
    `Div3AirportSeqID` Int32,
    `Div3WheelsOn` String,
    `Div3TotalGTime` String,
    `Div3LongestGTime` String,
    `Div3WheelsOff` String,
    `Div3TailNum` String,
    `Div4Airport` String,
    `Div4AirportID` Int32,
    `Div4AirportSeqID` Int32,
    `Div4WheelsOn` String,
    `Div4TotalGTime` String,
    `Div4LongestGTime` String,
    `Div4WheelsOff` String,
    `Div4TailNum` String,
    `Div5Airport` String,
    `Div5AirportID` Int32,
    `Div5AirportSeqID` Int32,
    `Div5WheelsOn` String,
    `Div5TotalGTime` String,
    `Div5LongestGTime` String,
    `Div5WheelsOff` String,
    `Div5TailNum` String
) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192);
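Another way to run a statement that spans several lines is to keep it in a file and feed it to the client on standard input. The snippet below is only a sketch. The file path and the table ontime_demo (a four-column subset of ontime) are made up for illustration. The --multiquery flag is what lets the file hold more than one statement separated by semicolons. The engine clause copies the one used above, which newer ClickHouse releases deprecate in favour of PARTITION BY and ORDER BY clauses.

```shell
#!/bin/sh
# Hypothetical file location; override SQL_FILE to put it elsewhere.
SQL_FILE="${SQL_FILE:-${TMPDIR:-/tmp}/create_ontime_demo.sql}"

# Write a multi-line script: a cut-down table plus a sanity-check query.
cat > "$SQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS ontime_demo
(
    Year UInt16,
    FlightDate Date,
    Carrier FixedString(2),
    DepDelay Int32
)
ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192);

SELECT count() FROM ontime_demo;
EOF

# Only talk to a server when the client is actually installed.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client --multiquery < "$SQL_FILE"
else
    echo "clickhouse-client not found; wrote $SQL_FILE only"
fi
```

Keeping the DDL in a file also makes it easy to put under version control.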
Running queries in batch mode:
clickhouse-client --query='SELECT 1'
echo 'SELECT 1' | clickhouse-client
Inserting data from a file in a given format:
clickhouse-client --query='INSERT INTO table VALUES' < data.txt
clickhouse-client --query='INSERT INTO table FORMAT TabSeparated' < data.tsv
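To make the TabSeparated case concrete, here is a small sketch that writes a two-row TSV file, checks that every row has the expected number of columns, and then loads it. The table name ontime_demo, its four columns and the file path are assumptions for illustration only. The table is assumed to exist already with columns (Year UInt16, FlightDate Date, Carrier FixedString(2), DepDelay Int32).

```shell
#!/bin/sh
# Hypothetical table and file names; adjust to your own schema.
TABLE="${TABLE:-ontime_demo}"
TSV_FILE="${TSV_FILE:-${TMPDIR:-/tmp}/ontime_demo.tsv}"

# TabSeparated: one row per line, columns separated by a single tab.
# printf reuses the format string for each group of four arguments.
printf '%s\t%s\t%s\t%s\n' \
    2017 2017-03-31 AA 5 \
    2017 2017-03-31 DL -3 > "$TSV_FILE"

# Count rows that do not have exactly four tab-separated columns.
bad_rows=$(awk -F '\t' 'NF != 4 { n++ } END { print n + 0 }' "$TSV_FILE")

if [ "$bad_rows" -ne 0 ]; then
    echo "refusing to load: $bad_rows malformed row(s) in $TSV_FILE" >&2
    exit 1
fi

# Only talk to a server when the client is actually installed.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client --query="INSERT INTO $TABLE FORMAT TabSeparated" < "$TSV_FILE"
else
    echo "clickhouse-client not found; wrote $TSV_FILE only"
fi
```

Checking the column count first is cheap and catches the most common cause of a failed load, a row with a stray or missing tab.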
References:
https://github.com/redsoftbiz/clickhouse-rpm