StarRocks Installation and Test Report



Download, Installation, and Deployment

StarRocks clusters can be deployed in two ways: manually with commands, or automatically with StarRocksManager. The automated version only requires some simple configuration, selection, and input on a web page to complete the rollout in batch, and it also provides Supervisor-based process management, rolling upgrades, backup, and rollback. Since StarRocksManager is not open source, we can only deploy with commands.

For production, the official recommendation is at least 16 cores and 64 GB of RAM for BE nodes and at least 8 cores and 16 GB for FE nodes (StarRocks keeps all of its metadata in FE memory).

Download URL:

https://www.starrocks.com/zh-CN/download/request-download/4

Server Configuration

We downloaded the latest community release, StarRocks-1.19.1. Because of resource constraints (four virtual machines), FE is deployed as a single instance and BE as three instances.

IP               Host  Spec   Role
192.168.130.178  fe1   4C16G  FE (leader)
192.168.130.183  be1   4C8G   BE
192.168.130.36   be2   4C8G   BE
192.168.130.149  be3   4C8G   BE

Single-Instance FE Deployment

# Edit the FE configuration
cd StarRocks-1.19.1/fe

# Configuration file: conf/fe.conf
# Adjust -Xmx4096m to the FE memory size; 16 GB or more is recommended to avoid GC pressure, since StarRocks keeps all metadata in memory.

# Create the metadata directory
mkdir -p meta
# Start the FE process
bin/start_fe.sh --daemon

# Browse to port 8030 to open the StarRocks WebUI; the user is root and the password is empty

StarRocks UI

http://192.168.130.178:8030/ opens the StarRocks WebUI; log in as root with an empty password.
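For reference, a minimal fe.conf sketch. The keys below exist in fe.conf, but the values shown here are illustrative assumptions and must be adapted to your own hosts:

# conf/fe.conf (illustrative values)
JAVA_OPTS="-Xmx8192m"                  # FE heap size; 16 GB+ recommended to avoid GC pressure
meta_dir = /path/to/StarRocks-1.19.1/fe/meta   # the metadata directory created above
priority_networks = 192.168.130.0/24   # CIDR used to pick the FE bind address
query_port = 9030                      # MySQL-protocol port
http_port = 8030                       # WebUI port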

Accessing the FE with a MySQL Client

Step 1: install a MySQL client (skip this step if one is already installed):

Ubuntu: sudo apt-get install mysql-client

CentOS: sudo yum install mysql-client

wget http://repo.mysql.com/mysql57-community-release-sles12.rpm

rpm -ivh mysql57-community-release-sles12.rpm

# Install mysql
yum install mysql-server

# Start mysql
service mysqld start

Step 2: connect with the mysql client:

mysql -h 127.0.0.1 -P9030 -uroot

Note: the root user has an empty password by default, and the port is the query_port item in fe/conf/fe.conf, which defaults to 9030.

Step 3: check the FE status:

MySQL [(none)]> SHOW PROC '/frontends'\G
*************************** 1. row ***************************
             Name: 192.168.130.178_9010_1636811945380
               IP: 192.168.130.178
         HostName: fe1
      EditLogPort: 9010
         HttpPort: 8030
        QueryPort: 9030
          RpcPort: 9020
             Role: FOLLOWER
         IsMaster: true
        ClusterId: 985620692
             Join: true
            Alive: true
ReplayedJournalId: 43507
    LastHeartbeat: 2021-11-15 14:41:48
         IsHelper: true
           ErrMsg: 
1 row in set (0.05 sec)

Role being FOLLOWER means this FE can take part in leader election; IsMaster being true means this FE is currently the leader.

If the MySQL client cannot connect, check the log/fe.warn.log file to identify the problem. Since this is the initial startup, if anything unexpected happens along the way you can delete and recreate the FE metadata directory and start over.
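A hedged sketch of that reset, assuming the FE was installed and started from StarRocks-1.19.1/fe as above (this wipes all FE metadata, so only do it on a freshly installed node):

cd StarRocks-1.19.1/fe
bin/stop_fe.sh            # stop the FE process
rm -rf meta               # remove the old metadata directory
mkdir -p meta             # recreate an empty metadata directory
bin/start_fe.sh --daemon  # start the FE again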

BE Deployment

Basic BE Configuration

The BE configuration file is StarRocks-1.19.1/be/conf/be.conf. The defaults are enough to start the cluster, and first-time users are advised not to change them; experienced users can consult the system configuration chapter of the manual to tune for production. Only the basic settings are shown here to help explain how the cluster works.

192.168.130.183

# Edit the BE configuration
cd StarRocks-1.19.1/be

# Configuration file: conf/be.conf
# The default BE settings are enough to start the cluster, so no changes are made for now.

# Create the data (storage) directory
mkdir -p storage

# Add the BE node through the mysql client.
# The IP must match the priority_networks setting; the port is heartbeat_service_port, which defaults to 9050.
mysql> ALTER SYSTEM ADD BACKEND "be1:9050";

# Start the BE
bin/start_be.sh --daemon
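For reference, a minimal be.conf sketch; both keys exist in be.conf, but the values here are illustrative assumptions:

# conf/be.conf (illustrative values)
priority_networks = 192.168.130.0/24                       # CIDR used to pick the BE bind address
storage_root_path = /path/to/StarRocks-1.19.1/be/storage   # the storage directory created above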

If adding a BE node fails and the node needs to be removed, use one of the following commands (DECOMMISSION migrates the data off the BE before removing it, while DROPP removes it immediately):

  • alter system decommission backend "be_host:be_heartbeat_service_port";

  • alter system dropp backend "be_host:be_heartbeat_service_port";

# Check the firewall status
[root@be1 be]# systemctl status firewalld
# If it is running, stop it so the network is reachable and the BE can connect to the FE
[root@be1 be]# systemctl stop firewalld

Step 4: check the BE status to confirm the BEs are ready. Add the other two BE nodes with the same steps, then run the following in the mysql client:

MySQL [(none)]> SHOW PROC '/backends' \G;
*************************** 1. row ***************************
            BackendId: 163038
              Cluster: default_cluster
                   IP: 192.168.130.183
             HostName: be1
        HeartbeatPort: 9050
               BePort: 9060
             HttpPort: 8040
             BrpcPort: 8060
        LastStartTime: 2021-11-21 13:56:47
        LastHeartbeat: 2021-11-21 14:58:35
                Alive: true
 SystemDecommissioned: false
ClusterDecommissioned: false
            TabletNum: 5
     DataUsedCapacity: .000 
        AvailCapacity: 33.451 GB
        TotalCapacity: 36.974 GB
              UsedPct: 9.53 %
       MaxDiskUsedPct: 9.53 %
               ErrMsg: 
              Version: 1.19.1-65e87c3
               Status: {"lastSuccessReportTabletsTime":"2021-11-21 14:57:48"}
    DataTotalCapacity: 33.451 GB
          DataUsedPct: 0.00 %
*************************** 2. row ***************************
            BackendId: 163066
              Cluster: default_cluster
                   IP: 192.168.130.36
             HostName: be2
        HeartbeatPort: 9050
               BePort: 9060
             HttpPort: 8040
             BrpcPort: 8060
        LastStartTime: 2021-11-21 14:56:34
        LastHeartbeat: 2021-11-21 14:58:35
                Alive: true
 SystemDecommissioned: false
ClusterDecommissioned: false
            TabletNum: 5
     DataUsedCapacity: .000 
        AvailCapacity: 33.452 GB
        TotalCapacity: 36.974 GB
              UsedPct: 9.53 %
       MaxDiskUsedPct: 9.53 %
               ErrMsg: 
              Version: 1.19.1-65e87c3
               Status: {"lastSuccessReportTabletsTime":"2021-11-21 14:58:35"}
    DataTotalCapacity: 33.452 GB
          DataUsedPct: 0.00 %
*************************** 3. row ***************************
            BackendId: 163072
              Cluster: default_cluster
                   IP: 192.168.130.149
             HostName: be3
        HeartbeatPort: 9050
               BePort: 9060
             HttpPort: 8040
             BrpcPort: 8060
        LastStartTime: 2021-11-21 14:58:15
        LastHeartbeat: 2021-11-21 14:58:35
                Alive: true
 SystemDecommissioned: false
ClusterDecommissioned: false
            TabletNum: 3
     DataUsedCapacity: .000 
        AvailCapacity: 33.521 GB
        TotalCapacity: 36.974 GB
              UsedPct: 9.34 %
       MaxDiskUsedPct: 9.34 %
               ErrMsg: 
              Version: 1.19.1-65e87c3
               Status: {"lastSuccessReportTabletsTime":"2021-11-21 14:58:16"}
    DataTotalCapacity: 33.521 GB
          DataUsedPct: 0.00 %

If Alive is true, the BE has joined the cluster properly. If a BE has not joined the cluster, check the be.WARNING file in its log directory to determine the cause.

At this point the installation is complete.

Parameter Tuning

  • Swappiness

Disable swap to avoid the performance jitter caused when memory is swapped out.

echo 0 | sudo tee /proc/sys/vm/swappiness
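To keep the setting across reboots, a sketch using the standard Linux sysctl mechanism (not StarRocks-specific):

# persist the value and reload kernel parameters
echo 'vm.swappiness=0' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p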
  • Compaction

When using aggregate-model or update-model tables with fast data loading, the following parameters can be changed in be.conf to speed up compaction.

cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2
  • Parallelism

Run the following command from a client to change StarRocks' query parallelism (similar to set max_threads = 8 in ClickHouse). A reasonable value is half the number of CPU cores of the machine.

set global parallel_fragment_exec_instance_num =  8;
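To apply it only to the current session, or to confirm the value, a small sketch:

-- session-level setting (affects only the current connection)
set parallel_fragment_exec_instance_num = 8;
-- verify the current value
show variables like 'parallel_fragment_exec_instance_num';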

Accessing StarRocks with a MySQL Client

Logging In as root

Connect to the query_port (9030) of an FE instance with a MySQL client. StarRocks has a built-in root user whose password is empty by default:

mysql -h fe_host -P9030 -u root

Clean up the environment:

mysql > drop database if exists example_db;

mysql > drop user test;

Creating a New User

Create a regular user with the following command:

mysql > create user 'test_xxx' identified by 'xxx123456';

Creating a Database

In StarRocks only the root account can create databases. Log in as root and create the test_xxx_db database:

mysql > create database test_xxx_db;

Once the database is created, it can be listed with show databases:

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| _statistics_       |
| information_schema |
| test_xxx_db        |
+--------------------+
3 rows in set (0.00 sec)

information_schema exists for MySQL protocol compatibility and its contents may not be entirely accurate, so it is better to obtain information about a specific database by querying that database directly.

Granting Privileges

Once test_xxx_db is created, the root account can grant read/write privileges on it to the test_xxx account; after the grant, log in as test_xxx to operate on the test_xxx_db database:

mysql > grant all on test_xxx_db to test_xxx;

Exit the root session and log in to the StarRocks cluster as test_xxx:

mysql > exit

mysql -h 127.0.0.1 -P9030 -utest_xxx -pxxx123456
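As a quick check that the grant took effect, list the databases from the test_xxx session (a sketch; the exact output depends on your environment):

mysql> show databases;
-- test_xxx_db should now appear in the list for the test_xxx account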

Creating Tables

StarRocks supports two ways of creating tables: single partition and composite partition.

In a composite partition:

  • The first level is the Partition. One dimension column can be chosen as the partition column (currently only integer and date/time columns are supported), with a value range specified for each partition.
  • The second level is the Distribution, i.e. bucketing. One or more dimension columns (or none, meaning all KEY columns) plus a bucket count are used to HASH-distribute the data.

Composite partitioning is recommended in the following scenarios:

  • There is a time dimension or a similar dimension with ordered values: such a column can be used as the partition column, with the partition granularity chosen based on load frequency, per-partition data volume, and so on.
  • Historical data needs to be removed: if only the most recent N days of data must be kept, composite partitioning lets you drop historical partitions. Data can also be deleted by issuing DELETE statements against a specific partition.
  • Data skew needs to be addressed: each partition can have its own bucket count. For example, when partitioning by day and daily data volumes vary widely, the bucket count can be set per partition; choose bucketing columns with high cardinality.

Composite partitioning can also be skipped in favor of a single partition, in which case the data is only HASH-distributed.

The two kinds of table are created as follows:

  1. First switch to the database: mysql > use test_xxx_db;
  2. Create a single-partition table: create a table named table1, hash-bucketed by siteid into 10 buckets. Its schema is:
  • siteid: INT (4 bytes), default value 10
  • citycode: SMALLINT (2 bytes)
  • username: VARCHAR, maximum length 32, default empty string
  • pv: BIGINT (8 bytes), default value 0; this is a metric column, which StarRocks aggregates internally, here with SUM. This table uses the aggregate model; StarRocks also supports the duplicate (detail) model and the update model, see the data model documentation for details.

The CREATE TABLE statement is:

mysql >
CREATE TABLE table1
(
    siteid INT DEFAULT '10',
    citycode SMALLINT,
    username VARCHAR(32) DEFAULT '',
    pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");
  3. Create a composite-partition table

Create a table named table2 with the following schema:

  • event_day: DATE, no default value
  • siteid: INT (4 bytes), default value 10
  • citycode: SMALLINT (2 bytes)
  • username: VARCHAR, maximum length 32, default empty string
  • pv: BIGINT (8 bytes), default value 0; this is a metric column, which StarRocks aggregates internally with SUM

The event_day column is used as the partition column, with 3 partitions: p1, p2, p3

  • p1: range [minimum value, 2017-06-30)
  • p2: range [2017-06-30, 2017-07-31)
  • p3: range [2017-07-31, 2017-08-31)

Each partition is hash-bucketed by siteid into 10 buckets.

The CREATE TABLE statement is:

CREATE TABLE table2
(
event_day DATE,
siteid INT DEFAULT '10',
citycode SMALLINT,
username VARCHAR(32) DEFAULT '',
pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(event_day, siteid, citycode, username)
PARTITION BY RANGE(event_day)
(
PARTITION p1 VALUES LESS THAN ('2017-06-30'),
PARTITION p2 VALUES LESS THAN ('2017-07-31'),
PARTITION p3 VALUES LESS THAN ('2017-08-31')
)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");

After the tables are created, they can be listed in the current database:

mysql> show tables;

+-------------------------+
| Tables_in_example_db    |
+-------------------------+
| table1                  |
| table2                  |
+-------------------------+
2 rows in set (0.01 sec)


mysql> desc table1;

+----------+-------------+------+-------+---------+-------+
| Field    | Type        | Null | Key   | Default | Extra |
+----------+-------------+------+-------+---------+-------+
| siteid   | int(11)     | Yes  | true  | 10      |       |
| citycode | smallint(6) | Yes  | true  | N/A     |       |
| username | varchar(32) | Yes  | true  |         |       |
| pv       | bigint(20)  | Yes  | false | 0       | SUM   |
+----------+-------------+------+-------+---------+-------+
4 rows in set (0.00 sec)


mysql> desc table2;

+-----------+-------------+------+-------+---------+-------+
| Field     | Type        | Null | Key   | Default | Extra |
+-----------+-------------+------+-------+---------+-------+
| event_day | date        | Yes  | true  | N/A     |       |
| siteid    | int(11)     | Yes  | true  | 10      |       |
| citycode  | smallint(6) | Yes  | true  | N/A     |       |
| username  | varchar(32) | Yes  | true  |         |       |
| pv        | bigint(20)  | Yes  | false | 0       | SUM   |
+-----------+-------------+------+-------+---------+-------+
5 rows in set (0.00 sec)

MySQL [(none)]> use test_xxx_db;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MySQL [test_xxx_db]> insert into table1 values(1,3708,'zhaop',0);
Query OK, 1 row affected (0.20 sec)
{'label':'insert_d7aee4d4-52a4-11ec-be98-fa163e7a663b', 'status':'VISIBLE', 'txnId':'2005'}


MySQL [test_xxx_db]> select * from table1;
+--------+----------+----------+------+
| siteid | citycode | username | pv   |
+--------+----------+----------+------+
|      1 |     3708 | zhaop    |    0 |
+--------+----------+----------+------+
1 row in set (0.03 sec)

Loading Data

curl --location-trusted -u test_xxx:xxx123456 -T table1_data -H "label: table1_20211121" \
    -H "column_separator:," \
    http://127.0.0.1:8030/api/test_xxx_db/table1/_stream_load
    
## Error
curl: Can't open 'table1_data'!
curl: try 'curl --help' or 'curl --manual' for more information
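The error only means the local file table1_data does not exist. A hedged sketch that creates a small sample file (rows made up for illustration; columns are siteid,citycode,username,pv) and repeats the Stream Load with a fresh label:

cat > table1_data <<EOF
1,3708,zhaop,10
2,1003,test_user,5
EOF

curl --location-trusted -u test_xxx:xxx123456 -T table1_data -H "label: table1_20211121_2" \
    -H "column_separator:," \
    http://127.0.0.1:8030/api/test_xxx_db/table1/_stream_load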

Problems

  1. Running a SQL query returns an error
MySQL [test_xxx_db]> select * from table1;
ERROR 1064 (HY000): Could not initialize class com.starrocks.rpc.BackendServiceProxy

Server-side log:

2021-11-21 16:03:49,835 WARN (starrocks-mysql-nio-pool-31|379) [StmtExecutor.execute():456] execute Exception, sql select * from table1
java.lang.NoClassDefFoundError: Could not initialize class com.starrocks.rpc.BackendServiceProxy
	at com.starrocks.qe.Coordinator$BackendExecState.execRemoteFragmentAsync(Coordinator.java:1695) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.Coordinator.exec(Coordinator.java:522) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.StmtExecutor.handleQueryStmt(StmtExecutor.java:771) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:371) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:248) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:397) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:633) ~[starrocks-fe.jar:?]
	at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$480(ReadListener.java:54) ~[starrocks-fe.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]

Solution: replace OpenJDK with Oracle JDK 8. Before the change:

[root@fe1 jre-1.8.0-openjdk]# java -version
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)

# After the change
[root@fe1 java]# java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
  2. BE fails to start: java.net.NoRouteToHostException: No route to host (Host unreachable)
# Error message
W1128 23:13:56.493264 24083 utils.cpp:90] Fail to get master client from cache. host= port=0 code=THRIFT_RPC_ERROR
# Check the firewall status
[root@be1 be]# systemctl status firewalld
# If it is running, stop it so the network is reachable and the BE can connect to the FE
[root@be1 be]# systemctl stop firewalld

At this point StarRocks handles normal insert, delete, update, and query operations.

Performance Testing

Four query groups with 13 query statements were executed, covering both single-table (wide-table) and multi-table (join) queries, at concurrency 1; all results are in milliseconds.

Data Preparation

Download and build the ssb-poc toolkit:

mkdir poc
wget https://starrocks-public.oss-cn-zhangjiakou.aliyuncs.com/ssb-poc-0.9.2.zip
unzip ssb-poc-0.9.2.zip
cd ssb-poc
make && make install  
# After a successful build, the output directory is available
[root@fe1 ssb-poc]# cd output/
[root@fe1 output]# ll
total 0
drwxr-xr-x. 2 root root 197 Nov 28 21:01 bin
drwxr-xr-x. 2 root root  37 Nov 28 21:01 conf
drwxr-xr-x. 3 root root  22 Nov 28 21:01 lib
drwxr-xr-x. 4 root root  39 Nov 28 21:01 share

All of the related tools are installed into the output directory.

Generate the Data

bin/gen-ssb.sh 30 data_dir

This generates data at roughly 30 GB scale under the data_dir directory.

Loading the Data

Install Python 3

The benchmark scripts require Python 3, so install it first (see https://www.jianshu.com/p/a916a22de3eb; the detailed steps are not repeated here).

Confirm the Test Directory

[root@fe1 output]# pwd
/root/starrocks/poc/ssb-poc/output

Edit the configuration file conf/doris.conf

[doris]
# for mysql cmd
mysql_host: fe1
mysql_port: 9030
mysql_user: root
mysql_password:
doris_db: ssb

# cluster ports
http_port: 8030
be_heartbeat_port: 9050
broker_port: 8000
...

Run the Table-Creation Script

bin/create_db_table.sh ddl_100

When it finishes, 6 tables exist in the ssb database: lineorder, supplier, dates, customer, part, lineorder_flat

MySQL [ssb]> show tables;
+----------------+
| Tables_in_ssb  |
+----------------+
| customer       |
| dates          |
| lineorder      |
| lineorder_flat |
| part           |
| supplier       |
+----------------+
6 rows in set (0.00 sec)

Data Import

Load the data with Stream Load:

bin/stream_load.sh data_dir

data_dir is the data directory generated earlier; the largest table holds about 180 million rows.

Table            Rows        Description
lineorder        179998372   SSB line-order fact table
supplier         60000       SSB supplier table
dates            2556        SSB date dimension table
customer         900000      SSB customer table
part             1000000     SSB part table
lineorder_flat   54675488    Denormalized wide table
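The row counts can be cross-checked from the mysql client, for example:

mysql> use ssb;
mysql> select count(*) from lineorder;
mysql> select count(*) from lineorder_flat;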

Multi-Table (SSB) Query Test

[root@fe1 output]# time bin/benchmark.sh -p -d ssb
------ dataset: ssb, concurrency: 1 ------
sql\time(ms)\parallel_num	1
q1	3466.0
q2	473.0
q3	464.0
q4	1703.0
q5	818.0
q6	621.0
q7	1807.0
q8	1039.0
q9	591.0
q10	506.0
q11	1865.0
q12	1077.0
q13	945.0
# Second run
------ dataset: ssb, concurrency: 1 ------
sql\time(ms)\parallel_num	1
q1	563.0
q2	466.0
q3	537.0
q4	936.0
q5	850.0
q6	590.0
q7	1597.0
q8	970.0
q9	542.0
q10	540.0
q11	1701.0
q12	1104.0
q13	1008.0

SSB Wide-Table (Single-Table) Query Test

# Generate the wide-table data
[root@fe1 output]# bin/flat_insert.sh
sql: ssb_flat_insert start
sql: ssb_flat_insert. flat insert error, msg: (1064, 'index channel has intoleralbe failure')

# Test wide-table query performance
time bin/benchmark.sh -p -d ssb-flat
[root@fe1 output]# bin/benchmark.sh -p -d ssb-flat
------ dataset: ssb-flat, concurrency: 1 ------
sql\time(ms)\parallel_num	1
q1	6435.0
q2	165.0
q3	74.0
q4	7925.0
q5	5307.0
q6	4514.0
q7	7621.0
q8	6821.0
q9	4740.0
q10	116.0
q11	6568.0
q12	301.0
q13	82.0
[root@fe1 output]# time bin/benchmark.sh -p -d ssb-flat
------ dataset: ssb-flat, concurrency: 1 ------
sql\time(ms)\parallel_num	1
q1	5693.0
q2	98.0
q3	77.0
q4	5811.0
q5	4549.0
q6	4111.0
q7	6819.0
q8	6389.0
q9	4298.0
q10	143.0
q11	6583.0
q12	231.0
q13	75.0

real	0m51.565s
user	0m0.192s
sys	0m0.152s


Conclusion

With about 180 million rows in the lineorder table and 55 million rows in the wide table, the SSB test shows good multi-table join query performance, but the measured results still fall short of the official benchmark report. The main reasons are likely the following:

  1. The tests ran on low-spec virtual machines (the official test used 16 cores, 64 GB RAM, ESSD high-performance cloud disks, and 10 Gbit/s network bandwidth).
  2. No parameter tuning was done.

Appendix: Test SQL

Single-Table Test SQL

--Q1.1 
SELECT sum(lo_extendedprice * lo_discount) AS `revenue` 
FROM lineorder_flat 
WHERE lo_orderdate >= '1993-01-01' and lo_orderdate <= '1993-12-31' AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25; 
 
--Q1.2 
SELECT sum(lo_extendedprice * lo_discount) AS revenue FROM lineorder_flat  
WHERE lo_orderdate >= '1994-01-01' and lo_orderdate <= '1994-01-31' AND lo_discount BETWEEN 4 AND 6 AND lo_quantity BETWEEN 26 AND 35; 
 
--Q1.3 
SELECT sum(lo_extendedprice * lo_discount) AS revenue 
FROM lineorder_flat 
WHERE weekofyear(lo_orderdate) = 6 AND lo_orderdate >= '1994-01-01' and lo_orderdate <= '1994-12-31' 
 AND lo_discount BETWEEN 5 AND 7 AND lo_quantity BETWEEN 26 AND 35; 
 
 
--Q2.1 
SELECT sum(lo_revenue), year(lo_orderdate) AS year,  p_brand 
FROM lineorder_flat 
WHERE p_category = 'MFGR#12' AND s_region = 'AMERICA' 
GROUP BY year,  p_brand 
ORDER BY year, p_brand; 
 
--Q2.2 
SELECT 
sum(lo_revenue), year(lo_orderdate) AS year, p_brand 
FROM lineorder_flat 
WHERE p_brand >= 'MFGR#2221' AND p_brand <= 'MFGR#2228' AND s_region = 'ASIA' 
GROUP BY year,  p_brand 
ORDER BY year, p_brand; 
  
--Q2.3 
SELECT sum(lo_revenue),  year(lo_orderdate) AS year, p_brand 
FROM lineorder_flat 
WHERE p_brand = 'MFGR#2239' AND s_region = 'EUROPE' 
GROUP BY  year,  p_brand 
ORDER BY year, p_brand; 
 
 
--Q3.1 
SELECT c_nation, s_nation,  year(lo_orderdate) AS year, sum(lo_revenue) AS revenue FROM lineorder_flat 
WHERE c_region = 'ASIA' AND s_region = 'ASIA' AND lo_orderdate  >= '1992-01-01' AND lo_orderdate   <= '1997-12-31' 
GROUP BY c_nation, s_nation, year 
ORDER BY  year ASC, revenue DESC; 
 
--Q3.2 
SELECT  c_city, s_city, year(lo_orderdate) AS year, sum(lo_revenue) AS revenue
FROM lineorder_flat 
WHERE c_nation = 'UNITED STATES' AND s_nation = 'UNITED STATES' AND lo_orderdate  >= '1992-01-01' AND lo_orderdate <= '1997-12-31' 
GROUP BY c_city, s_city, year 
ORDER BY year ASC, revenue DESC; 
 
--Q3.3 
SELECT c_city, s_city, year(lo_orderdate) AS year, sum(lo_revenue) AS revenue 
FROM lineorder_flat 
WHERE c_city in ( 'UNITED KI1' ,'UNITED KI5') AND s_city in ( 'UNITED KI1' ,'UNITED KI5') AND lo_orderdate  >= '1992-01-01' AND lo_orderdate <= '1997-12-31' 
GROUP BY c_city, s_city, year 
ORDER BY year ASC, revenue DESC; 
 
--Q3.4 
SELECT c_city, s_city, year(lo_orderdate) AS year, sum(lo_revenue) AS revenue 
FROM lineorder_flat 
WHERE c_city in ('UNITED KI1', 'UNITED KI5') AND s_city in ( 'UNITED KI1',  'UNITED KI5') AND  lo_orderdate  >= '1997-12-01' AND lo_orderdate <= '1997-12-31' 
GROUP BY c_city,  s_city, year 
ORDER BY year ASC, revenue DESC; 
 
 
--Q4.1 
SELECT year(lo_orderdate) AS year, c_nation,  sum(lo_revenue - lo_supplycost) AS profit FROM lineorder_flat 
WHERE c_region = 'AMERICA' AND s_region = 'AMERICA' AND p_mfgr in ( 'MFGR#1' , 'MFGR#2') 
GROUP BY year, c_nation 
ORDER BY year ASC, c_nation ASC; 
 
--Q4.2 
SELECT year(lo_orderdate) AS year, 
    s_nation, p_category, sum(lo_revenue - lo_supplycost) AS profit 
FROM lineorder_flat 
WHERE c_region = 'AMERICA' AND s_region = 'AMERICA' AND lo_orderdate >= '1997-01-01' and lo_orderdate <= '1998-12-31' AND  p_mfgr in ( 'MFGR#1' , 'MFGR#2') 
GROUP BY year, s_nation,  p_category 
ORDER BY  year ASC, s_nation ASC, p_category ASC; 
 
--Q4.3 
SELECT year(lo_orderdate) AS year, s_city, p_brand, 
    sum(lo_revenue - lo_supplycost) AS profit 
FROM lineorder_flat 
WHERE s_nation = 'UNITED STATES' AND lo_orderdate >= '1997-01-01' and lo_orderdate <= '1998-12-31' AND p_category = 'MFGR#14' 
GROUP BY  year,  s_city, p_brand 
ORDER BY year ASC,  s_city ASC,  p_brand ASC; 

Multi-Table Test SQL

--Q1.1 
select sum(lo_revenue) as revenue
from lineorder join dates on lo_orderdate = d_datekey
where d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25;

--Q1.2
select sum(lo_revenue) as revenue
from lineorder
join dates on lo_orderdate = d_datekey
where d_yearmonthnum = 199401
and lo_discount between 4 and 6
and lo_quantity between 26 and 35;

--Q1.3
select sum(lo_revenue) as revenue
from lineorder
join dates on lo_orderdate = d_datekey
where d_weeknuminyear = 6 and d_year = 1994
and lo_discount between 5 and 7
and lo_quantity between 26 and 35;


--Q2.1
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from lineorder
inner join dates on lo_orderdate = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_category = 'MFGR#12' and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;

--Q2.2
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from lineorder
join dates on lo_orderdate = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_brand between 'MFGR#2221' and 'MFGR#2228' and s_region = 'ASIA'
group by d_year, p_brand
order by d_year, p_brand;

--Q2.3
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from lineorder
join dates on lo_orderdate = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_brand = 'MFGR#2239' and s_region = 'EUROPE'
group by d_year, p_brand
order by d_year, p_brand;


--Q3.1
select c_nation, s_nation, d_year, sum(lo_revenue) as lo_revenue
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
where c_region = 'ASIA' and s_region = 'ASIA'and d_year >= 1992 and d_year <= 1997
group by c_nation, s_nation, d_year
order by d_year asc, lo_revenue desc;

--Q3.2
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
where c_nation = 'UNITED STATES' and s_nation = 'UNITED STATES'
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;

--Q3.3
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
where (c_city='UNITED KI1' or c_city='UNITED KI5')
and (s_city='UNITED KI1' or s_city='UNITED KI5')
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;

--Q3.4
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
where (c_city='UNITED KI1' or c_city='UNITED KI5') and (s_city='UNITED KI1' or s_city='UNITED KI5') and d_yearmonth = 'Dec1997'
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;


--Q4.1
select d_year, c_nation, sum(lo_revenue) - sum(lo_supplycost) as profit
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
join part on lo_partkey = p_partkey
where c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by d_year, c_nation
order by d_year, c_nation;

--Q4.2
select d_year, s_nation, p_category, sum(lo_revenue) - sum(lo_supplycost) as profit
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
join part on lo_partkey = p_partkey
where c_region = 'AMERICA'and s_region = 'AMERICA'
and (d_year = 1997 or d_year = 1998)
and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by d_year, s_nation, p_category
order by d_year, s_nation, p_category;

--Q4.3
select d_year, s_city, p_brand, sum(lo_revenue) - sum(lo_supplycost) as profit
from lineorder
join dates on lo_orderdate = d_datekey
join customer on lo_custkey = c_custkey
join supplier on lo_suppkey = s_suppkey
join part on lo_partkey = p_partkey
where c_region = 'AMERICA'and s_nation = 'UNITED STATES'
and (d_year = 1997 or d_year = 1998)
and p_category = 'MFGR#14'
group by d_year, s_city, p_brand
order by d_year, s_city, p_brand;

