A First Look at ClickHouse


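I recently read a post on the Percona blog, "Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark", in which ClickHouse clearly outperforms both MariaDB ColumnStore and Spark. That got me very interested in ClickHouse, so I decided to start learning it. There is no Chinese documentation for ClickHouse yet, so learning it takes some effort. Another Percona post introducing ClickHouse is also worth reading: "ClickHouse: New Open Source Columnar Database". There is also a Chinese article worth a look: 彪悍開源的分析數據庫-ClickHouse ("A Fierce Open-Source Analytical Database: ClickHouse").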

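So what exactly is ClickHouse?

1. An open-source column-oriented database management system

2. Linearly scalable

3. Simple and easy to use

4. Highly reliable

5. Fault tolerant (it supports asynchronous multi-master replication and can be deployed across multiple data centers; downtime of a single node or of an entire data center does not affect the read/write availability of the system)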

ClickHouse's key features and use cases are described in more detail in the official documentation.

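At the moment ClickHouse is better supported on Ubuntu than on CentOS. Ubuntu has deb packages that can be installed directly, while on CentOS you have to compile it yourself. I struggled with the build for a long time without success and eventually gave up. Then I happened to see someone mention RPM packages in the Google group: someone has set up a yum repository, so ClickHouse can be installed directly, which solved the problem. If you want to build it yourself, see https://github.com/yandex/ClickHouse/blob/master/doc/build.md. The yum installation steps follow.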

1. Add the yum repository

yum-config-manager --add-repo http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo

2. Install:

yum install clickhouse-server clickhouse-client clickhouse-server-common clickhouse-compressor

3. Add the clickhouse user

useradd clickhouse

4. Start clickhouse

/etc/init.d/clickhouse-server start

5. Test the login:

[root@db_server_yayun_01 ~]# clickhouse-client 
ClickHouse client version 1.1.54198.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.54198.

:) select 1

SELECT 1

┌─1─┐
│ 1 │
└───┘
→ Progress: 1.00 rows, 1.00 B (64.77 rows/s., 64.77 B/s.) 
1 rows in set. Elapsed: 0.016 sec. 

:) select now();

SELECT now()

┌───────────────now()─┐
│ 2017-03-31 15:14:18 │
└─────────────────────┘
↘ Progress: 1.00 rows, 1.00 B (216.22 rows/s., 216.22 B/s.) 
1 rows in set. Elapsed: 0.005 sec. 

:) 

If the server fails to start, check the logs. The default log directory is

/var/log/clickhouse-server
[root@db_server_yayun_01 clickhouse-server]# ll
total 16
-rw-rw-rw-. 1 clickhouse clickhouse  383 Mar 31 13:33 clickhouse-server.err.log
-rw-rw-rw-. 1 clickhouse clickhouse 7733 Mar 31 15:14 clickhouse-server.log
-rw-rw-rw-. 1 clickhouse clickhouse  138 Mar 31 13:33 stderr
-rw-rw-rw-. 1 clickhouse clickhouse    0 Mar 31 13:33 stdout
[root@db_server_yayun_01 clickhouse-server]# 
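
When startup fails, the error log is usually the first file to read. The snippet below is a minimal sketch that prints its last lines, assuming the default directory and the file name from the listing above; the choice of 20 lines is arbitrary.

```shell
#!/bin/sh
# Default log location as shown in the listing above.
LOG_DIR="/var/log/clickhouse-server"
ERR_LOG="$LOG_DIR/clickhouse-server.err.log"

# Print the tail of the error log if it exists and is readable.
if [ -r "$ERR_LOG" ]; then
    tail -n 20 "$ERR_LOG"
else
    echo "no readable error log at $ERR_LOG" >&2
fi
```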

Here is some basic usage of clickhouse-client:

Interactive mode

clickhouse-client
clickhouse-client --host=... --port=... --user=... --password=...

Enable multi-line queries:

clickhouse-client -m
clickhouse-client --multiline

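Multi-line mode is needed when creating a table. Without it, the interactive client runs the query as soon as Enter is pressed, so a statement that spans several lines is sent incomplete and an error is reported. For example, to create the following table: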

CREATE TABLE `ontime` (
  `Year` UInt16,
  `Quarter` UInt8,
  `Month` UInt8,
  `DayofMonth` UInt8,
  `DayOfWeek` UInt8,
  `FlightDate` Date,
  `UniqueCarrier` FixedString(7),
  `AirlineID` Int32,
  `Carrier` FixedString(2),
  `TailNum` String,
  `FlightNum` String,
  `OriginAirportID` Int32,
  `OriginAirportSeqID` Int32,
  `OriginCityMarketID` Int32,
  `Origin` FixedString(5),
  `OriginCityName` String,
  `OriginState` FixedString(2),
  `OriginStateFips` String,
  `OriginStateName` String,
  `OriginWac` Int32,
  `DestAirportID` Int32,
  `DestAirportSeqID` Int32,
  `DestCityMarketID` Int32,
  `Dest` FixedString(5),
  `DestCityName` String,
  `DestState` FixedString(2),
  `DestStateFips` String,
  `DestStateName` String,
  `DestWac` Int32,
  `CRSDepTime` Int32,
  `DepTime` Int32,
  `DepDelay` Int32,
  `DepDelayMinutes` Int32,
  `DepDel15` Int32,
  `DepartureDelayGroups` String,
  `DepTimeBlk` String,
  `TaxiOut` Int32,
  `WheelsOff` Int32,
  `WheelsOn` Int32,
  `TaxiIn` Int32,
  `CRSArrTime` Int32,
  `ArrTime` Int32,
  `ArrDelay` Int32,
  `ArrDelayMinutes` Int32,
  `ArrDel15` Int32,
  `ArrivalDelayGroups` Int32,
  `ArrTimeBlk` String,
  `Cancelled` UInt8,
  `CancellationCode` FixedString(1),
  `Diverted` UInt8,
  `CRSElapsedTime` Int32,
  `ActualElapsedTime` Int32,
  `AirTime` Int32,
  `Flights` Int32,
  `Distance` Int32,
  `DistanceGroup` UInt8,
  `CarrierDelay` Int32,
  `WeatherDelay` Int32,
  `NASDelay` Int32,
  `SecurityDelay` Int32,
  `LateAircraftDelay` Int32,
  `FirstDepTime` String,
  `TotalAddGTime` String,
  `LongestAddGTime` String,
  `DivAirportLandings` String,
  `DivReachedDest` String,
  `DivActualElapsedTime` String,
  `DivArrDelay` String,
  `DivDistance` String,
  `Div1Airport` String,
  `Div1AirportID` Int32,
  `Div1AirportSeqID` Int32,
  `Div1WheelsOn` String,
  `Div1TotalGTime` String,
  `Div1LongestGTime` String,
  `Div1WheelsOff` String,
  `Div1TailNum` String,
  `Div2Airport` String,
  `Div2AirportID` Int32,
  `Div2AirportSeqID` Int32,
  `Div2WheelsOn` String,
  `Div2TotalGTime` String,
  `Div2LongestGTime` String,
  `Div2WheelsOff` String,
  `Div2TailNum` String,
  `Div3Airport` String,
  `Div3AirportID` Int32,
  `Div3AirportSeqID` Int32,
  `Div3WheelsOn` String,
  `Div3TotalGTime` String,
  `Div3LongestGTime` String,
  `Div3WheelsOff` String,
  `Div3TailNum` String,
  `Div4Airport` String,
  `Div4AirportID` Int32,
  `Div4AirportSeqID` Int32,
  `Div4WheelsOn` String,
  `Div4TotalGTime` String,
  `Div4LongestGTime` String,
  `Div4WheelsOff` String,
  `Div4TailNum` String,
  `Div5Airport` String,
  `Div5AirportID` Int32,
  `Div5AirportSeqID` Int32,
  `Div5WheelsOn` String,
  `Div5TotalGTime` String,
  `Div5LongestGTime` String,
  `Div5WheelsOff` String,
  `Div5TailNum` String
) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192)

Run queries in batch mode:

clickhouse-client --query='SELECT 1'
echo 'SELECT 1' | clickhouse-client
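
Batch mode is also a convenient way to run a statement that spans several lines, since the client reads the query from standard input. The following is only a sketch: the table name `demo_events`, its columns, and the file path are hypothetical, and the engine clause copies the old-style MergeTree syntax used in the `ontime` example above. Newer ClickHouse releases use `PARTITION BY` / `ORDER BY` instead and may reject the old form.

```shell
#!/bin/sh
# Hypothetical file path, used only for this illustration.
SQL_FILE="${TMPDIR:-/tmp}/demo_create.sql"

# Write a multi-line CREATE TABLE statement to a file.
# Old-style MergeTree(date column, primary key, index granularity), as in the article.
cat > "$SQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS demo_events (
  EventDate Date,
  Id UInt32,
  Label String
) ENGINE = MergeTree(EventDate, (Id, EventDate), 8192)
EOF

# Feed the statement to the client on stdin, but only if the client is installed.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client < "$SQL_FILE" \
        || echo "CREATE TABLE failed; is the server running?" >&2
else
    echo "clickhouse-client not found; skipping" >&2
fi
```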

Insert data from a file in a specified format:

clickhouse-client --query='INSERT INTO table VALUES' < data.txt
clickhouse-client --query='INSERT INTO table FORMAT TabSeparated' < data.tsv
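
To make the TabSeparated case concrete, here is a small sketch that writes a three-row TSV file and loads it. The table name `demo_events`, its three columns (a date, an integer id, a string label), and the file path are assumptions made for illustration; the table is assumed to exist already.

```shell
#!/bin/sh
# Hypothetical table name and file path, chosen only for illustration.
TABLE="demo_events"
DATA_FILE="${TMPDIR:-/tmp}/demo_events.tsv"

# Write three rows; fields are separated by tabs, rows by newlines.
printf '2017-03-29\t1\talpha\n2017-03-30\t2\tbeta\n2017-03-31\t3\tgamma\n' > "$DATA_FILE"

# Load the file only if the client is installed; report a failed insert on stderr.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client --query="INSERT INTO $TABLE FORMAT TabSeparated" < "$DATA_FILE" \
        || echo "insert failed; check that table $TABLE exists" >&2
else
    echo "clickhouse-client not found; skipping insert" >&2
fi
```

The column order in the file has to match the column order of the target table, because TabSeparated carries no header row.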

 

 

References:

https://github.com/redsoftbiz/clickhouse-rpm

https://clickhouse.yandex/

 

