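Chapter 1 Building the Data Warehouse: ODS Layer
1) Keep the data in its original form without any modification, so that the layer also serves as a backup.
2) Compress the data with LZO to save disk space. As a rough guide, 100 GB of raw data can compress to under 10 GB, although the actual ratio depends on the data.
3) Create partitioned tables to avoid full table scans later on. Partitioned tables are used heavily in production.
4) Create external tables. In production, apart from temporary tables for personal use, which are created as managed (internal) tables, almost all tables are external. An external table only records a mapping between the table and the underlying data and does not move the data. Dropping the table removes only its metadata, not the data itself, which makes external tables the safer choice and explains why they are so widely used in practice.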
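1.1 User Behavior Data
1.1.1 Creating the Log Table ods_log
1) Create a partitioned table that supports LZO compression
(1) DDL statement
A partitioned log table whose input is LZO-compressed, whose output is stored as plain text, and whose rows can later be parsed as JSON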
hive (gmall)>
drop table if exists ods_log;
CREATE EXTERNAL TABLE ods_log (`line` string)
PARTITIONED BY (`dt` string) -- partition by date
STORED AS -- storage format; data is read with LzoTextInputFormat
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_log' -- location of the data on HDFS
;
For details on LZO compression in Hive, see: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
(2) Partition plan

2) Load the data, using the calendar day as the partition value for each day's data

hive (gmall)>
load data inpath '/origin_data/gmall/log/topic_log/2020-06-14' into table ods_log partition(dt='2020-06-14');
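Note: write all dates in yyyy-MM-dd format, which is the date format Hive supports by default.
3) Create an index for the LZO-compressed files
The input files are LZO-compressed. An LZO file cannot be divided into input splits unless it has an index, so an index must be built for the LZO files.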
[atguigu@hadoop102 bin]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar com.hadoop.compression.lzo.DistributedLzoIndexer /warehouse/gmall/ods/ods_log/dt=2020-06-14
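The indexing command above can be wrapped in a small helper so that the same call works for any table and date. The following is only a sketch: the jar path and warehouse path are taken from this chapter, while the function names index_cmd and run_index and the DRY_RUN switch are hypothetical names introduced for illustration. By default it only prints the command, so it can be tried on a machine without Hadoop.

```shell
#!/bin/bash
# Sketch: build (and optionally run) the LZO indexing command for one partition.
# The jar path and warehouse layout follow the values used in this chapter.
# index_cmd, run_index and DRY_RUN are hypothetical names for illustration.
HADOOP_LZO_JAR=/opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar

index_cmd() {
    local table=$1 dt=$2
    echo "hadoop jar $HADOOP_LZO_JAR com.hadoop.compression.lzo.DistributedLzoIndexer /warehouse/gmall/ods/$table/dt=$dt"
}

run_index() {
    local cmd
    cmd=$(index_cmd "$1" "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"   # dry run: only print the command
    else
        $cmd          # unquoted on purpose: no argument contains spaces
    fi
}

run_index ods_log 2020-06-14
```

Setting DRY_RUN=0 before calling run_index would execute the command instead of printing it. After a successful run, each .lzo file in the partition directory should be accompanied by a matching .index file.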
1.1.2 Single Quotes vs. Double Quotes in the Shell
1) Create a file named test.sh in /home/atguigu/bin
[atguigu@hadoop102 bin]$ vim test.sh
Add the following content to the file
#!/bin/bash
do_date=$1
echo '$do_date'
echo "$do_date"
echo "'$do_date'"
echo '"$do_date"'
echo `date`
2) Check the output
[atguigu@hadoop102 bin]$ test.sh 2020-06-14
$do_date
2020-06-14
'2020-06-14'
"$do_date"
2020年 06月 18日 星期四 21:02:08 CST
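3) Summary:
(1) Single quotes do not expand variables.
(2) Double quotes expand variables.
(3) Single quotes nested inside double quotes: the variable is expanded.
(4) Double quotes nested inside single quotes: the variable is not expanded.
(5) Backticks (`) execute the command they enclose.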
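These quoting rules matter when a script assembles a SQL string for hive -e. The sketch below illustrates this with a made-up query against ods_log; the variable names sql_single and sql_double are hypothetical and exist only for this example.

```shell
#!/bin/bash
# Sketch: how quoting affects a SQL string built in a shell script.
# sql_single and sql_double are hypothetical names for illustration.
do_date=2020-06-14

# Outer single quotes: $do_date is kept literally, so the query would be wrong.
sql_single='select * from ods_log where dt="$do_date"'

# Outer double quotes: $do_date is expanded, and the inner single quotes
# survive as the quotes of the SQL string literal.
sql_double="select * from ods_log where dt='$do_date'"

echo "$sql_single"
echo "$sql_double"

# $(...) is the modern equivalent of backticks and nests more easily.
echo "$(date +%F)"
```

The second form is the one a load script needs: the shell fills in the date, and Hive receives a properly quoted literal.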
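1.1.3 Data Loading Script for the ODS Log Table
1) Write the script
(1) Create the script in the /home/atguigu/bin directory on hadoop102
[atguigu@hadoop102 bin]$ vim hdfs_to_ods_log.sh
Put the following content in the script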
#!/bin/bash
# Define variables so they are easy to change
APP=gmall
# Use the date passed as the first argument; if none is given, use yesterday's date
if [ -n "$1" ] ;then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi
echo "================== Log date: $do_date =================="
sql="
load data inpath '/origin_data/$APP/log/topic_log/$do_date' into table ${APP}.ods_log partition(dt='$do_date');
"
hive -e "$sql"
hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar com.hadoop.compression.lzo.DistributedLzoIndexer /warehouse/$APP/ods/ods_log/dt=$do_date
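(1) Note 1:
[ -n value ] tests whether the value is non-empty
- If the value is non-empty, it returns true
- If the value is empty, it returns false
Note: if the variable is left unquoted and is empty, the test collapses to [ -n ], which always returns true. Always wrap the variable in double quotes (" ") when using [ -n ... ].
(2) Note 2:
For the usage of the date command, see date --help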
(3) Make the script executable
[atguigu@hadoop102 bin]$ chmod +x hdfs_to_ods_log.sh
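The two notes on [ -n ... ] and date can be checked with a short experiment. This is a minimal sketch: the function names check_unquoted and check_quoted are hypothetical, and the date call assumes GNU date (as found on typical Linux hosts).

```shell
#!/bin/bash
# Sketch: behaviour of [ -n ... ] with and without quotes.
# check_unquoted and check_quoted are hypothetical names for illustration.
empty=""

check_unquoted() {
    # With an empty variable this becomes `[ -n ]`, a one-argument test
    # that only checks whether the string "-n" is non-empty, so it succeeds.
    if [ -n $empty ]; then echo "non-empty"; else echo "empty"; fi
}

check_quoted() {
    # Quoted, the empty string is passed as a real argument to -n.
    if [ -n "$empty" ]; then echo "non-empty"; else echo "empty"; fi
}

check_unquoted   # prints: non-empty (misleading)
check_quoted     # prints: empty

# Default to yesterday when no date argument is given (GNU date assumed).
do_date=${1:-$(date -d "-1 day" +%F)}
echo "$do_date"
```

The ${1:-...} form is an alternative to the if/else block in the script and gives the same result: the first argument if present, otherwise yesterday's date.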
2) Use the script
(1) Run the script
[atguigu@hadoop102 module]$ hdfs_to_ods_log.sh 2020-06-14
(2) Check the imported data
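1.2 Business Data
The ODS layer for business data is built the same way as for user behavior data: the raw data is kept as is, with no transformation. Tables are created with the required fields selected from the business database tables according to the requirements analysis, and the raw data imported by Sqoop is then loaded (Load) into those tables.
The partition plan for the ODS business tables is as follows

The data loading approach for the ODS business tables is as follows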

1.2.1 Activity Information Table
DROP TABLE IF EXISTS ods_activity_info;
CREATE EXTERNAL TABLE ods_activity_info(
`id` STRING COMMENT 'ID',
`activity_name` STRING COMMENT 'Activity name',
`activity_type` STRING COMMENT 'Activity type',
`start_time` STRING COMMENT 'Start time',
`end_time` STRING COMMENT 'End time',
`create_time` STRING COMMENT 'Creation time'
) COMMENT 'Activity information table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_activity_info/';
1.2.2 Activity Rule Table
DROP TABLE IF EXISTS ods_activity_rule;
CREATE EXTERNAL TABLE ods_activity_rule(
`id` STRING COMMENT 'ID',
`activity_id` STRING COMMENT 'Activity ID',
`activity_type` STRING COMMENT 'Activity type',
`condition_amount` DECIMAL(16,2) COMMENT 'Threshold amount for the discount',
`condition_num` BIGINT COMMENT 'Threshold item count for the discount',
`benefit_amount` DECIMAL(16,2) COMMENT 'Discount amount',
`benefit_discount` DECIMAL(16,2) COMMENT 'Discount rate',
`benefit_level` STRING COMMENT 'Benefit level'
) COMMENT 'Activity rule table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_activity_rule/';
1.2.3 Level-1 Category Table
DROP TABLE IF EXISTS ods_base_category1;
CREATE EXTERNAL TABLE ods_base_category1(
`id` STRING COMMENT 'ID',
`name` STRING COMMENT 'Name'
) COMMENT 'Level-1 product category table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_category1/';
1.2.4 Level-2 Category Table
DROP TABLE IF EXISTS ods_base_category2;
CREATE EXTERNAL TABLE ods_base_category2(
`id` STRING COMMENT 'ID',
`name` STRING COMMENT 'Name',
`category1_id` STRING COMMENT 'Level-1 category ID'
) COMMENT 'Level-2 product category table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_category2/';
1.2.5 Level-3 Category Table
DROP TABLE IF EXISTS ods_base_category3;
CREATE EXTERNAL TABLE ods_base_category3(
`id` STRING COMMENT 'ID',
`name` STRING COMMENT 'Name',
`category2_id` STRING COMMENT 'Level-2 category ID'
) COMMENT 'Level-3 product category table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_category3/';
1.2.6 Code Dictionary Table
DROP TABLE IF EXISTS ods_base_dic;
CREATE EXTERNAL TABLE ods_base_dic(
`dic_code` STRING COMMENT 'Code',
`dic_name` STRING COMMENT 'Code name',
`parent_code` STRING COMMENT 'Parent code',
`create_time` STRING COMMENT 'Creation date',
`operate_time` STRING COMMENT 'Operation date'
) COMMENT 'Code dictionary table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_dic/';
1.2.7 Province Table
DROP TABLE IF EXISTS ods_base_province;
CREATE EXTERNAL TABLE ods_base_province (
`id` STRING COMMENT 'ID',
`name` STRING COMMENT 'Province name',
`region_id` STRING COMMENT 'Region ID',
`area_code` STRING COMMENT 'Area code',
`iso_code` STRING COMMENT 'ISO-3166 code, used for visualization',
`iso_3166_2` STRING COMMENT 'ISO-3166-2 code, used for visualization'
) COMMENT 'Province table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_province/';
1.2.8 Region Table
DROP TABLE IF EXISTS ods_base_region;
CREATE EXTERNAL TABLE ods_base_region (
`id` STRING COMMENT 'ID',
`region_name` STRING COMMENT 'Region name'
) COMMENT 'Region table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_region/';
1.2.9 Brand Table
DROP TABLE IF EXISTS ods_base_trademark;
CREATE EXTERNAL TABLE ods_base_trademark (
`id` STRING COMMENT 'ID',
`tm_name` STRING COMMENT 'Brand name'
) COMMENT 'Brand table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_base_trademark/';
1.2.10 Shopping Cart Table
DROP TABLE IF EXISTS ods_cart_info;
CREATE EXTERNAL TABLE ods_cart_info(
`id` STRING COMMENT 'ID',
`user_id` STRING COMMENT 'User ID',
`sku_id` STRING COMMENT 'SKU ID',
`cart_price` DECIMAL(16,2) COMMENT 'Price when added to the cart',
`sku_num` BIGINT COMMENT 'Quantity',
`sku_name` STRING COMMENT 'SKU name (redundant)',
`create_time` STRING COMMENT 'Creation time',
`operate_time` STRING COMMENT 'Modification time',
`is_ordered` STRING COMMENT 'Whether an order has been placed',
`order_time` STRING COMMENT 'Order time',
`source_type` STRING COMMENT 'Source type',
`source_id` STRING COMMENT 'Source ID'
) COMMENT 'Cart table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_cart_info/';
1.2.11 Review Table
DROP TABLE IF EXISTS ods_comment_info;
CREATE EXTERNAL TABLE ods_comment_info(
`id` STRING COMMENT 'ID',
`user_id` STRING COMMENT 'User ID',
`sku_id` STRING COMMENT 'Product SKU',
`spu_id` STRING COMMENT 'Product SPU',
`order_id` STRING COMMENT 'Order ID',
`appraise` STRING COMMENT 'Rating',
`create_time` STRING COMMENT 'Review time'
) COMMENT 'Product review table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_comment_info/';
1.2.12 Coupon Information Table
DROP TABLE IF EXISTS ods_coupon_info;
CREATE EXTERNAL TABLE ods_coupon_info(
`id` STRING COMMENT 'Coupon ID',
`coupon_name` STRING COMMENT 'Coupon name',
`coupon_type` STRING COMMENT 'Coupon type: 1 cash coupon, 2 discount coupon, 3 spend-threshold coupon, 4 quantity-threshold discount coupon',
`condition_amount` DECIMAL(16,2) COMMENT 'Threshold amount',
`condition_num` BIGINT COMMENT 'Threshold item count',
`activity_id` STRING COMMENT 'Activity ID',
`benefit_amount` DECIMAL(16,2) COMMENT 'Reduction amount',
`benefit_discount` DECIMAL(16,2) COMMENT 'Discount',
`create_time` STRING COMMENT 'Creation time',
`range_type` STRING COMMENT 'Scope type: 1 product, 2 category, 3 brand',
`limit_num` BIGINT COMMENT 'Maximum number of claims',
`taken_count` BIGINT COMMENT 'Number already claimed',
`start_time` STRING COMMENT 'Claim start time',
`end_time` STRING COMMENT 'Claim end time',
`operate_time` STRING COMMENT 'Modification time',
`expire_time` STRING COMMENT 'Expiration time'
) COMMENT 'Coupon table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_coupon_info/';
1.2.13 Coupon Usage Table
DROP TABLE IF EXISTS ods_coupon_use;
CREATE EXTERNAL TABLE ods_coupon_use(
`id` STRING COMMENT 'ID',
`coupon_id` STRING COMMENT 'Coupon ID',
`user_id` STRING COMMENT 'User ID',
`order_id` STRING COMMENT 'Order ID',
`coupon_status` STRING COMMENT 'Coupon status',
`get_time` STRING COMMENT 'Claim time',
`using_time` STRING COMMENT 'Use time (order placed)',
`used_time` STRING COMMENT 'Use time (payment)',
`expire_time` STRING COMMENT 'Expiration time'
) COMMENT 'Coupon usage table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_coupon_use/';
1.2.14 Favorites Table
DROP TABLE IF EXISTS ods_favor_info;
CREATE EXTERNAL TABLE ods_favor_info(
`id` STRING COMMENT 'ID',
`user_id` STRING COMMENT 'User ID',
`sku_id` STRING COMMENT 'SKU ID',
`spu_id` STRING COMMENT 'SPU ID',
`is_cancel` STRING COMMENT 'Whether cancelled',
`create_time` STRING COMMENT 'Favorite time',
`cancel_time` STRING COMMENT 'Cancellation time'
) COMMENT 'Product favorites table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_favor_info/';
1.2.15 Order Detail Table
DROP TABLE IF EXISTS ods_order_detail;
CREATE EXTERNAL TABLE ods_order_detail(
`id` STRING COMMENT 'ID',
`order_id` STRING COMMENT 'Order ID',
`sku_id` STRING COMMENT 'Product ID',
`sku_name` STRING COMMENT 'Product name',
`order_price` DECIMAL(16,2) COMMENT 'Product price',
`sku_num` BIGINT COMMENT 'Product quantity',
`create_time` STRING COMMENT 'Creation time',
`source_type` STRING COMMENT 'Source type',
`source_id` STRING COMMENT 'Source ID',
`split_final_amount` DECIMAL(16,2) COMMENT 'Apportioned final amount',
`split_activity_amount` DECIMAL(16,2) COMMENT 'Apportioned activity discount',
`split_coupon_amount` DECIMAL(16,2) COMMENT 'Apportioned coupon discount'
) COMMENT 'Order detail table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_detail/';
1.2.16 Order Detail Activity Association Table
DROP TABLE IF EXISTS ods_order_detail_activity;
CREATE EXTERNAL TABLE ods_order_detail_activity(
`id` STRING COMMENT 'ID',
`order_id` STRING COMMENT 'Order ID',
`order_detail_id` STRING COMMENT 'Order detail ID',
`activity_id` STRING COMMENT 'Activity ID',
`activity_rule_id` STRING COMMENT 'Activity rule ID',
`sku_id` BIGINT COMMENT 'Product ID',
`create_time` STRING COMMENT 'Creation time'
) COMMENT 'Order detail activity association table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_detail_activity/';
1.2.17 Order Detail Coupon Association Table
DROP TABLE IF EXISTS ods_order_detail_coupon;
CREATE EXTERNAL TABLE ods_order_detail_coupon(
`id` STRING COMMENT 'ID',
`order_id` STRING COMMENT 'Order ID',
`order_detail_id` STRING COMMENT 'Order detail ID',
`coupon_id` STRING COMMENT 'Coupon ID',
`coupon_use_id` STRING COMMENT 'Coupon usage record ID',
`sku_id` STRING COMMENT 'Product ID',
`create_time` STRING COMMENT 'Creation time'
) COMMENT 'Order detail coupon association table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_detail_coupon/';
1.2.18 Order Table
DROP TABLE IF EXISTS ods_order_info;
CREATE EXTERNAL TABLE ods_order_info (
`id` STRING COMMENT 'Order ID',
`final_amount` DECIMAL(16,2) COMMENT 'Final order amount',
`order_status` STRING COMMENT 'Order status',
`user_id` STRING COMMENT 'User ID',
`payment_way` STRING COMMENT 'Payment method',
`delivery_address` STRING COMMENT 'Delivery address',
`out_trade_no` STRING COMMENT 'Payment transaction number',
`create_time` STRING COMMENT 'Creation time',
`operate_time` STRING COMMENT 'Operation time',
`expire_time` STRING COMMENT 'Expiration time',
`tracking_no` STRING COMMENT 'Tracking number',
`province_id` STRING COMMENT 'Province ID',
`activity_reduce_amount` DECIMAL(16,2) COMMENT 'Activity reduction amount',
`coupon_reduce_amount` DECIMAL(16,2) COMMENT 'Coupon reduction amount',
`original_amount` DECIMAL(16,2) COMMENT 'Original order amount',
`feight_fee` DECIMAL(16,2) COMMENT 'Shipping fee',
`feight_fee_reduce` DECIMAL(16,2) COMMENT 'Shipping fee reduction'
) COMMENT 'Order table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_info/';
1.2.19 Order Refund Table
DROP TABLE IF EXISTS ods_order_refund_info;
CREATE EXTERNAL TABLE ods_order_refund_info(
`id` STRING COMMENT 'ID',
`user_id` STRING COMMENT 'User ID',
`order_id` STRING COMMENT 'Order ID',
`sku_id` STRING COMMENT 'Product ID',
`refund_type` STRING COMMENT 'Refund type',
`refund_num` BIGINT COMMENT 'Number of items refunded',
`refund_amount` DECIMAL(16,2) COMMENT 'Refund amount',
`refund_reason_type` STRING COMMENT 'Refund reason type',
`refund_status` STRING COMMENT 'Refund status', -- The refund status should cover states such as buyer request, seller review, goods received by seller, and refund completed. These are not modeled here, so this table is loaded incrementally
`create_time` STRING COMMENT 'Refund time'
) COMMENT 'Order refund table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_refund_info/';
1.2.20 Order Status Log Table
DROP TABLE IF EXISTS ods_order_status_log;
CREATE EXTERNAL TABLE ods_order_status_log (
`id` STRING COMMENT 'ID',
`order_id` STRING COMMENT 'Order ID',
`order_status` STRING COMMENT 'Order status',
`operate_time` STRING COMMENT 'Modification time'
) COMMENT 'Order status log table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_order_status_log/';
1.2.21 Payment Table
DROP TABLE IF EXISTS ods_payment_info;
CREATE EXTERNAL TABLE ods_payment_info(
`id` STRING COMMENT 'ID',
`out_trade_no` STRING COMMENT 'External trade number',
`order_id` STRING COMMENT 'Order ID',
`user_id` STRING COMMENT 'User ID',
`payment_type` STRING COMMENT 'Payment type',
`trade_no` STRING COMMENT 'Transaction number',
`payment_amount` DECIMAL(16,2) COMMENT 'Payment amount',
`subject` STRING COMMENT 'Transaction subject',
`payment_status` STRING COMMENT 'Payment status',
`create_time` STRING COMMENT 'Creation time',
`callback_time` STRING COMMENT 'Callback time'
) COMMENT 'Payment record table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_payment_info/';
1.2.22 Refund Payment Table
DROP TABLE IF EXISTS ods_refund_payment;
CREATE EXTERNAL TABLE ods_refund_payment(
`id` STRING COMMENT 'ID',
`out_trade_no` STRING COMMENT 'External trade number',
`order_id` STRING COMMENT 'Order ID',
`sku_id` STRING COMMENT 'SKU ID',
`payment_type` STRING COMMENT 'Payment type',
`trade_no` STRING COMMENT 'Transaction number',
`refund_amount` DECIMAL(16,2) COMMENT 'Refund amount',
`subject` STRING COMMENT 'Transaction subject',
`refund_status` STRING COMMENT 'Refund status',
`create_time` STRING COMMENT 'Creation time',
`callback_time` STRING COMMENT 'Callback time'
) COMMENT 'Refund payment record table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_refund_payment/';
1.2.23 Product Platform Attribute Table
DROP TABLE IF EXISTS ods_sku_attr_value;
CREATE EXTERNAL TABLE ods_sku_attr_value(
`id` STRING COMMENT 'ID',
`attr_id` STRING COMMENT 'Platform attribute ID',
`value_id` STRING COMMENT 'Platform attribute value ID',
`sku_id` STRING COMMENT 'Product ID',
`attr_name` STRING COMMENT 'Platform attribute name',
`value_name` STRING COMMENT 'Platform attribute value name'
) COMMENT 'SKU platform attribute table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_sku_attr_value/';
1.2.24 Product (SKU) Table
DROP TABLE IF EXISTS ods_sku_info;
CREATE EXTERNAL TABLE ods_sku_info(
`id` STRING COMMENT 'SKU ID',
`spu_id` STRING COMMENT 'SPU ID',
`price` DECIMAL(16,2) COMMENT 'Price',
`sku_name` STRING COMMENT 'Product name',
`sku_desc` STRING COMMENT 'Product description',
`weight` DECIMAL(16,2) COMMENT 'Weight',
`tm_id` STRING COMMENT 'Brand ID',
`category3_id` STRING COMMENT 'Category ID',
`is_sale` STRING COMMENT 'Whether on sale',
`create_time` STRING COMMENT 'Creation time'
) COMMENT 'SKU product table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_sku_info/';
1.2.25 Product Sale Attribute Table
DROP TABLE IF EXISTS ods_sku_sale_attr_value;
CREATE EXTERNAL TABLE ods_sku_sale_attr_value(
`id` STRING COMMENT 'ID',
`sku_id` STRING COMMENT 'SKU ID',
`spu_id` STRING COMMENT 'SPU ID',
`sale_attr_value_id` STRING COMMENT 'Sale attribute value ID',
`sale_attr_id` STRING COMMENT 'Sale attribute ID',
`sale_attr_name` STRING COMMENT 'Sale attribute name',
`sale_attr_value_name` STRING COMMENT 'Sale attribute value name'
) COMMENT 'SKU sale attribute table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_sku_sale_attr_value/';
1.2.26 Product (SPU) Table
DROP TABLE IF EXISTS ods_spu_info;
CREATE EXTERNAL TABLE ods_spu_info(
`id` STRING COMMENT 'SPU ID',
`spu_name` STRING COMMENT 'SPU name',
`category3_id` STRING COMMENT 'Category ID',
`tm_id` STRING COMMENT 'Brand ID'
) COMMENT 'SPU product table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_spu_info/';
1.2.27 User Table
DROP TABLE IF EXISTS ods_user_info;
CREATE EXTERNAL TABLE ods_user_info(
`id` STRING COMMENT 'User ID',
`login_name` STRING COMMENT 'Login name',
`nick_name` STRING COMMENT 'Nickname',
`name` STRING COMMENT 'Real name',
`phone_num` STRING COMMENT 'Mobile number',
`email` STRING COMMENT 'Email',
`user_level` STRING COMMENT 'User level',
`birthday` STRING COMMENT 'Birthday',
`gender` STRING COMMENT 'Gender',
`create_time` STRING COMMENT 'Creation time',
`operate_time` STRING COMMENT 'Operation time'
) COMMENT 'User table'
PARTITIONED BY (`dt` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/warehouse/gmall/ods/ods_user_info/';
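1.2.28 First-Day Data Loading Script for the ODS Business Tables
1) Write the script
(1) Create the script hdfs_to_ods_db_init.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim hdfs_to_ods_db_init.sh
Put the following content in the script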
#!/bin/bash
APP=gmall
if [ -n "$2" ] ;then
do_date=$2
else
echo "Please pass a date argument"
exit 1
fi
ods_order_info="
load data inpath '/origin_data/$APP/db/order_info/$do_date' OVERWRITE into table ${APP}.ods_order_info partition(dt='$do_date');"
ods_order_detail="
load data inpath '/origin_data/$APP/db/order_detail/$do_date' OVERWRITE into table ${APP}.ods_order_detail partition(dt='$do_date');"
ods_sku_info="
load data inpath '/origin_data/$APP/db/sku_info/$do_date' OVERWRITE into table ${APP}.ods_sku_info partition(dt='$do_date');"
ods_user_info="
load data inpath '/origin_data/$APP/db/user_info/$do_date' OVERWRITE into table ${APP}.ods_user_info partition(dt='$do_date');"
ods_payment_info="
load data inpath '/origin_data/$APP/db/payment_info/$do_date' OVERWRITE into table ${APP}.ods_payment_info partition(dt='$do_date');"
ods_base_category1="
load data inpath '/origin_data/$APP/db/base_category1/$do_date' OVERWRITE into table ${APP}.ods_base_category1 partition(dt='$do_date');"
ods_base_category2="
load data inpath '/origin_data/$APP/db/base_category2/$do_date' OVERWRITE into table ${APP}.ods_base_category2 partition(dt='$do_date');"
ods_base_category3="
load data inpath '/origin_data/$APP/db/base_category3/$do_date' OVERWRITE into table ${APP}.ods_base_category3 partition(dt='$do_date'); "
ods_base_trademark="
load data inpath '/origin_data/$APP/db/base_trademark/$do_date' OVERWRITE into table ${APP}.ods_base_trademark partition(dt='$do_date'); "
ods_activity_info="
load data inpath '/origin_data/$APP/db/activity_info/$do_date' OVERWRITE into table ${APP}.ods_activity_info partition(dt='$do_date'); "
ods_cart_info="
load data inpath '/origin_data/$APP/db/cart_info/$do_date' OVERWRITE into table ${APP}.ods_cart_info partition(dt='$do_date'); "
ods_comment_info="
load data inpath '/origin_data/$APP/db/comment_info/$do_date' OVERWRITE into table ${APP}.ods_comment_info partition(dt='$do_date'); "
ods_coupon_info="
load data inpath '/origin_data/$APP/db/coupon_info/$do_date' OVERWRITE into table ${APP}.ods_coupon_info partition(dt='$do_date'); "
ods_coupon_use="
load data inpath '/origin_data/$APP/db/coupon_use/$do_date' OVERWRITE into table ${APP}.ods_coupon_use partition(dt='$do_date'); "
ods_favor_info="
load data inpath '/origin_data/$APP/db/favor_info/$do_date' OVERWRITE into table ${APP}.ods_favor_info partition(dt='$do_date'); "
ods_order_refund_info="
load data inpath '/origin_data/$APP/db/order_refund_info/$do_date' OVERWRITE into table ${APP}.ods_order_refund_info partition(dt='$do_date'); "
ods_order_status_log="
load data inpath '/origin_data/$APP/db/order_status_log/$do_date' OVERWRITE into table ${APP}.ods_order_status_log partition(dt='$do_date'); "
ods_spu_info="
load data inpath '/origin_data/$APP/db/spu_info/$do_date' OVERWRITE into table ${APP}.ods_spu_info partition(dt='$do_date'); "
ods_activity_rule="
load data inpath '/origin_data/$APP/db/activity_rule/$do_date' OVERWRITE into table ${APP}.ods_activity_rule partition(dt='$do_date');"
ods_base_dic="
load data inpath '/origin_data/$APP/db/base_dic/$do_date' OVERWRITE into table ${APP}.ods_base_dic partition(dt='$do_date'); "
ods_order_detail_activity="
load data inpath '/origin_data/$APP/db/order_detail_activity/$do_date' OVERWRITE into table ${APP}.ods_order_detail_activity partition(dt='$do_date'); "
ods_order_detail_coupon="
load data inpath '/origin_data/$APP/db/order_detail_coupon/$do_date' OVERWRITE into table ${APP}.ods_order_detail_coupon partition(dt='$do_date'); "
ods_refund_payment="
load data inpath '/origin_data/$APP/db/refund_payment/$do_date' OVERWRITE into table ${APP}.ods_refund_payment partition(dt='$do_date'); "
ods_sku_attr_value="
load data inpath '/origin_data/$APP/db/sku_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_attr_value partition(dt='$do_date'); "
ods_sku_sale_attr_value="
load data inpath '/origin_data/$APP/db/sku_sale_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_sale_attr_value partition(dt='$do_date'); "
ods_base_province="
load data inpath '/origin_data/$APP/db/base_province/$do_date' OVERWRITE into table ${APP}.ods_base_province;"
ods_base_region="
load data inpath '/origin_data/$APP/db/base_region/$do_date' OVERWRITE into table ${APP}.ods_base_region;"
case $1 in
"ods_order_info"){
hive -e "$ods_order_info"
};;
"ods_order_detail"){
hive -e "$ods_order_detail"
};;
"ods_sku_info"){
hive -e "$ods_sku_info"
};;
"ods_user_info"){
hive -e "$ods_user_info"
};;
"ods_payment_info"){
hive -e "$ods_payment_info"
};;
"ods_base_category1"){
hive -e "$ods_base_category1"
};;
"ods_base_category2"){
hive -e "$ods_base_category2"
};;
"ods_base_category3"){
hive -e "$ods_base_category3"
};;
"ods_base_trademark"){
hive -e "$ods_base_trademark"
};;
"ods_activity_info"){
hive -e "$ods_activity_info"
};;
"ods_cart_info"){
hive -e "$ods_cart_info"
};;
"ods_comment_info"){
hive -e "$ods_comment_info"
};;
"ods_coupon_info"){
hive -e "$ods_coupon_info"
};;
"ods_coupon_use"){
hive -e "$ods_coupon_use"
};;
"ods_favor_info"){
hive -e "$ods_favor_info"
};;
"ods_order_refund_info"){
hive -e "$ods_order_refund_info"
};;
"ods_order_status_log"){
hive -e "$ods_order_status_log"
};;
"ods_spu_info"){
hive -e "$ods_spu_info"
};;
"ods_activity_rule"){
hive -e "$ods_activity_rule"
};;
"ods_base_dic"){
hive -e "$ods_base_dic"
};;
"ods_order_detail_activity"){
hive -e "$ods_order_detail_activity"
};;
"ods_order_detail_coupon"){
hive -e "$ods_order_detail_coupon"
};;
"ods_refund_payment"){
hive -e "$ods_refund_payment"
};;
"ods_sku_attr_value"){
hive -e "$ods_sku_attr_value"
};;
"ods_sku_sale_attr_value"){
hive -e "$ods_sku_sale_attr_value"
};;
"ods_base_province"){
hive -e "$ods_base_province"
};;
"ods_base_region"){
hive -e "$ods_base_region"
};;
"all"){
hive -e "$ods_order_info$ods_order_detail$ods_sku_info$ods_user_info$ods_payment_info$ods_base_category1$ods_base_category2$ods_base_category3$ods_base_trademark$ods_activity_info$ods_cart_info$ods_comment_info$ods_coupon_info$ods_coupon_use$ods_favor_info$ods_order_refund_info$ods_order_status_log$ods_spu_info$ods_activity_rule$ods_base_dic$ods_order_detail_activity$ods_order_detail_coupon$ods_refund_payment$ods_sku_attr_value$ods_sku_sale_attr_value$ods_base_province$ods_base_region"
};;
esac
(2) Make the script executable
[atguigu@hadoop102 bin]$ chmod +x hdfs_to_ods_db_init.sh
2) Use the script
(1) Run the script
[atguigu@hadoop102 bin]$ hdfs_to_ods_db_init.sh all 2020-06-14
(2) Check that the data was imported successfully
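The script above defines one variable and one case branch per table. Because every statement follows the same pattern, the SQL can also be generated from the table name. The following is a sketch of that idea, not a replacement for the script: build_load_sql and UNPARTITIONED are hypothetical names, and the source path is derived by stripping the ods_ prefix from the table name, which matches the paths used in this section.

```shell
#!/bin/bash
# Sketch: generate the LOAD statements from the table name instead of
# keeping one variable per table. build_load_sql and UNPARTITIONED are
# hypothetical names introduced for illustration.
APP=gmall

# Tables that this chapter creates without a dt partition.
UNPARTITIONED="ods_base_province ods_base_region"

build_load_sql() {
    local table=$1 do_date=$2
    # ods_order_info -> /origin_data/gmall/db/order_info/<date>
    local src="/origin_data/$APP/db/${table#ods_}/$do_date"
    case " $UNPARTITIONED " in
        *" $table "*)
            echo "load data inpath '$src' OVERWRITE into table ${APP}.$table;" ;;
        *)
            echo "load data inpath '$src' OVERWRITE into table ${APP}.$table partition(dt='$do_date');" ;;
    esac
}

# Collect the statements for a couple of tables; the result could be
# passed to hive -e "$sql" in the same way as in the script above.
sql=$(for t in ods_order_info ods_base_province; do
    build_load_sql "$t" 2020-06-14
done)
echo "$sql"
```

The trade-off is readability versus length: the explicit version is easier to scan table by table, while the generated version keeps the table list in one place.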
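1.2.29 Daily Data Loading Script for the ODS Business Tables
1) Write the script
(1) Create the script hdfs_to_ods_db.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim hdfs_to_ods_db.sh
Put the following content in the script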
#!/bin/bash
APP=gmall
# Use the date passed as the second argument; if none is given, use yesterday's date
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
ods_order_info="
load data inpath '/origin_data/$APP/db/order_info/$do_date' OVERWRITE into table ${APP}.ods_order_info partition(dt='$do_date');"
ods_order_detail="
load data inpath '/origin_data/$APP/db/order_detail/$do_date' OVERWRITE into table ${APP}.ods_order_detail partition(dt='$do_date');"
ods_sku_info="
load data inpath '/origin_data/$APP/db/sku_info/$do_date' OVERWRITE into table ${APP}.ods_sku_info partition(dt='$do_date');"
ods_user_info="
load data inpath '/origin_data/$APP/db/user_info/$do_date' OVERWRITE into table ${APP}.ods_user_info partition(dt='$do_date');"
ods_payment_info="
load data inpath '/origin_data/$APP/db/payment_info/$do_date' OVERWRITE into table ${APP}.ods_payment_info partition(dt='$do_date');"
ods_base_category1="
load data inpath '/origin_data/$APP/db/base_category1/$do_date' OVERWRITE into table ${APP}.ods_base_category1 partition(dt='$do_date');"
ods_base_category2="
load data inpath '/origin_data/$APP/db/base_category2/$do_date' OVERWRITE into table ${APP}.ods_base_category2 partition(dt='$do_date');"
ods_base_category3="
load data inpath '/origin_data/$APP/db/base_category3/$do_date' OVERWRITE into table ${APP}.ods_base_category3 partition(dt='$do_date'); "
ods_base_trademark="
load data inpath '/origin_data/$APP/db/base_trademark/$do_date' OVERWRITE into table ${APP}.ods_base_trademark partition(dt='$do_date'); "
ods_activity_info="
load data inpath '/origin_data/$APP/db/activity_info/$do_date' OVERWRITE into table ${APP}.ods_activity_info partition(dt='$do_date'); "
ods_cart_info="
load data inpath '/origin_data/$APP/db/cart_info/$do_date' OVERWRITE into table ${APP}.ods_cart_info partition(dt='$do_date'); "
ods_comment_info="
load data inpath '/origin_data/$APP/db/comment_info/$do_date' OVERWRITE into table ${APP}.ods_comment_info partition(dt='$do_date'); "
ods_coupon_info="
load data inpath '/origin_data/$APP/db/coupon_info/$do_date' OVERWRITE into table ${APP}.ods_coupon_info partition(dt='$do_date'); "
ods_coupon_use="
load data inpath '/origin_data/$APP/db/coupon_use/$do_date' OVERWRITE into table ${APP}.ods_coupon_use partition(dt='$do_date'); "
ods_favor_info="
load data inpath '/origin_data/$APP/db/favor_info/$do_date' OVERWRITE into table ${APP}.ods_favor_info partition(dt='$do_date'); "
ods_order_refund_info="
load data inpath '/origin_data/$APP/db/order_refund_info/$do_date' OVERWRITE into table ${APP}.ods_order_refund_info partition(dt='$do_date'); "
ods_order_status_log="
load data inpath '/origin_data/$APP/db/order_status_log/$do_date' OVERWRITE into table ${APP}.ods_order_status_log partition(dt='$do_date'); "
ods_spu_info="
load data inpath '/origin_data/$APP/db/spu_info/$do_date' OVERWRITE into table ${APP}.ods_spu_info partition(dt='$do_date'); "
ods_activity_rule="
load data inpath '/origin_data/$APP/db/activity_rule/$do_date' OVERWRITE into table ${APP}.ods_activity_rule partition(dt='$do_date');"
ods_base_dic="
load data inpath '/origin_data/$APP/db/base_dic/$do_date' OVERWRITE into table ${APP}.ods_base_dic partition(dt='$do_date'); "
ods_order_detail_activity="
load data inpath '/origin_data/$APP/db/order_detail_activity/$do_date' OVERWRITE into table ${APP}.ods_order_detail_activity partition(dt='$do_date'); "
ods_order_detail_coupon="
load data inpath '/origin_data/$APP/db/order_detail_coupon/$do_date' OVERWRITE into table ${APP}.ods_order_detail_coupon partition(dt='$do_date'); "
ods_refund_payment="
load data inpath '/origin_data/$APP/db/refund_payment/$do_date' OVERWRITE into table ${APP}.ods_refund_payment partition(dt='$do_date'); "
ods_sku_attr_value="
load data inpath '/origin_data/$APP/db/sku_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_attr_value partition(dt='$do_date'); "
ods_sku_sale_attr_value="
load data inpath '/origin_data/$APP/db/sku_sale_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_sale_attr_value partition(dt='$do_date'); "
case $1 in
"ods_order_info"){
hive -e "$ods_order_info"
};;
"ods_order_detail"){
hive -e "$ods_order_detail"
};;
"ods_sku_info"){
hive -e "$ods_sku_info"
};;
"ods_user_info"){
hive -e "$ods_user_info"
};;
"ods_payment_info"){
hive -e "$ods_payment_info"
};;
"ods_base_category1"){
hive -e "$ods_base_category1"
};;
"ods_base_category2"){
hive -e "$ods_base_category2"
};;
"ods_base_category3"){
hive -e "$ods_base_category3"
};;
"ods_base_trademark"){
hive -e "$ods_base_trademark"
};;
"ods_activity_info"){
hive -e "$ods_activity_info"
};;
"ods_cart_info"){
hive -e "$ods_cart_info"
};;
"ods_comment_info"){
hive -e "$ods_comment_info"
};;
"ods_coupon_info"){
hive -e "$ods_coupon_info"
};;
"ods_coupon_use"){
hive -e "$ods_coupon_use"
};;
"ods_favor_info"){
hive -e "$ods_favor_info"
};;
"ods_order_refund_info"){
hive -e "$ods_order_refund_info"
};;
"ods_order_status_log"){
hive -e "$ods_order_status_log"
};;
"ods_spu_info"){
hive -e "$ods_spu_info"
};;
"ods_activity_rule"){
hive -e "$ods_activity_rule"
};;
"ods_base_dic"){
hive -e "$ods_base_dic"
};;
"ods_order_detail_activity"){
hive -e "$ods_order_detail_activity"
};;
"ods_order_detail_coupon"){
hive -e "$ods_order_detail_coupon"
};;
"ods_refund_payment"){
hive -e "$ods_refund_payment"
};;
"ods_sku_attr_value"){
hive -e "$ods_sku_attr_value"
};;
"ods_sku_sale_attr_value"){
hive -e "$ods_sku_sale_attr_value"
};;
"all"){
hive -e "$ods_order_info$ods_order_detail$ods_sku_info$ods_user_info$ods_payment_info$ods_base_category1$ods_base_category2$ods_base_category3$ods_base_trademark$ods_activity_info$ods_cart_info$ods_comment_info$ods_coupon_info$ods_coupon_use$ods_favor_info$ods_order_refund_info$ods_order_status_log$ods_spu_info$ods_activity_rule$ods_base_dic$ods_order_detail_activity$ods_order_detail_coupon$ods_refund_payment$ods_sku_attr_value$ods_sku_sale_attr_value"
};;
esac
(2) Make the script executable
[atguigu@hadoop102 bin]$ chmod +x hdfs_to_ods_db.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ hdfs_to_ods_db.sh all 2020-06-14
(2) Check that the data was loaded successfully
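The script above writes one load statement per table by hand, and the statements shown differ only in the table name. As a rough sketch (not part of the original script), the statement could be generated by a small helper instead. The function name build_load_sql is hypothetical, and the sketch assumes that every table's source directory is the table name with the ods_ prefix removed; that holds for the statements visible above but should be checked for the others.

```shell
#!/bin/bash
# Hypothetical helper: build the "load data" statement for one ODS table.
# Assumption: the HDFS source directory is the table name without "ods_".
build_load_sql() {
    local app=$1 table=$2 do_date=$3
    local src=${table#ods_}   # e.g. ods_spu_info -> spu_info
    echo "load data inpath '/origin_data/$app/db/$src/$do_date' OVERWRITE into table $app.$table partition(dt='$do_date');"
}

# Example: print the statements for two tables without running Hive.
for t in ods_spu_info ods_base_dic; do
    build_load_sql gmall "$t" 2020-06-14
done
```

The printed statements can be concatenated and passed to hive -e in the same way the all branch does.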
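Chapter 2 Data Warehouse Construction: DIM Layer
For business data, building the DIM layer is mainly a matter of dimension degeneration. The ODS layer holds more than twenty business tables that form a fairly complex relational model, so obtaining detailed dimension information usually requires joining several tables. To make queries simpler and to avoid a large number of joins, the relational model is flattened into appropriately degenerated dimensions.
2.1 Product Dimension Table (Full Snapshot)
Source tables:
ods_sku_info product (SKU) table
ods_spu_info product (SPU) table
ods_base_category3 level-3 category table
ods_base_category2 level-2 category table
ods_base_category1 level-1 category table
ods_base_trademark brand table
ods_sku_attr_value SKU platform attribute table
ods_sku_sale_attr_value SKU sale attribute table
1. Table creation statement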
DROP TABLE IF EXISTS dim_sku_info;
CREATE EXTERNAL TABLE dim_sku_info (
`id` STRING COMMENT '商品id',
`price` DECIMAL(16,2) COMMENT '商品價格',
`sku_name` STRING COMMENT '商品名稱',
`sku_desc` STRING COMMENT '商品描述',
`weight` DECIMAL(16,2) COMMENT '重量',
`is_sale` BOOLEAN COMMENT '是否在售',
`spu_id` STRING COMMENT 'spu編號',
`spu_name` STRING COMMENT 'spu名稱',
`category3_id` STRING COMMENT '三級分類id',
`category3_name` STRING COMMENT '三級分類名稱',
`category2_id` STRING COMMENT '二級分類id',
`category2_name` STRING COMMENT '二級分類名稱',
`category1_id` STRING COMMENT '一級分類id',
`category1_name` STRING COMMENT '一級分類名稱',
`tm_id` STRING COMMENT '品牌id',
`tm_name` STRING COMMENT '品牌名稱',
`sku_attr_values` ARRAY<STRUCT<attr_id:STRING,value_id:STRING,attr_name:STRING,value_name:STRING>> COMMENT '平台屬性',
`sku_sale_attr_values` ARRAY<STRUCT<sale_attr_id:STRING,sale_attr_value_id:STRING,sale_attr_name:STRING,sale_attr_value_name:STRING>> COMMENT '銷售屬性',
`create_time` STRING COMMENT '創建時間'
) COMMENT '商品維度表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_sku_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
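The grain is one SKU; the platform attributes and sale attributes are arrays of structs.
2. Partition plan

3. Data loading

1) Hive and LZO index files
(1) Count the rows in two different ways
hive (gmall)> select * from ods_log;
Time taken: 0.706 seconds, Fetched: 2955 row(s)
hive (gmall)> select count(*) from ods_log;
2959
(2) The two queries return different row counts.
The reason is that select * from ods_log does not run a MapReduce job. It reads directly with the DeprecatedLzoTextInputFormat specified in the ods_log table definition, which recognizes lzo.index files as index files.
select count(*) from ods_log runs a MapReduce job and goes through hive.input.format, whose default value is CombineHiveInputFormat. That format first merges the index files as if they were small files and then treats them as ordinary data files, so their contents are counted as rows.
Worse, this also prevents the LZO files from being split.
Solution: change CombineHiveInputFormat to HiveInputFormat
hive (gmall)>
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
2) First-day load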
with
sku as
(
select
id,
price,
sku_name,
sku_desc,
weight,
is_sale,
spu_id,
category3_id,
tm_id,
create_time
from ods_sku_info
where dt='2020-06-14'
),
spu as
(
select
id,
spu_name
from ods_spu_info
where dt='2020-06-14'
),
c3 as
(
select
id,
name,
category2_id
from ods_base_category3
where dt='2020-06-14'
),
c2 as
(
select
id,
name,
category1_id
from ods_base_category2
where dt='2020-06-14'
),
c1 as
(
select
id,
name
from ods_base_category1
where dt='2020-06-14'
),
tm as
(
select
id,
tm_name
from ods_base_trademark
where dt='2020-06-14'
),
attr as
(
select
sku_id,
collect_set(named_struct('attr_id',attr_id,'value_id',value_id,'attr_name',attr_name,'value_name',value_name)) attrs
from ods_sku_attr_value
where dt='2020-06-14'
group by sku_id
),
sale_attr as
(
select
sku_id,
collect_set(named_struct('sale_attr_id',sale_attr_id,'sale_attr_value_id',sale_attr_value_id,'sale_attr_name',sale_attr_name,'sale_attr_value_name',sale_attr_value_name)) sale_attrs
from ods_sku_sale_attr_value
where dt='2020-06-14'
group by sku_id
)
insert overwrite table dim_sku_info partition(dt='2020-06-14')
select
sku.id,
sku.price,
sku.sku_name,
sku.sku_desc,
sku.weight,
sku.is_sale,
sku.spu_id,
spu.spu_name,
sku.category3_id,
c3.name,
c3.category2_id,
c2.name,
c2.category1_id,
c1.name,
sku.tm_id,
tm.tm_name,
attr.attrs,
sale_attr.sale_attrs,
sku.create_time
from sku
left join spu on sku.spu_id=spu.id
left join c3 on sku.category3_id=c3.id
left join c2 on c3.category2_id=c2.id
left join c1 on c2.category1_id=c1.id
left join tm on sku.tm_id=tm.id
left join attr on sku.id=attr.sku_id
left join sale_attr on sku.id=sale_attr.sku_id;
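A left join keeps every row of the left table and attaches the matching rows from the right table; where there is no match, the right-side columns are NULL.
3) Daily load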
with
sku as
(
select
id,
price,
sku_name,
sku_desc,
weight,
is_sale,
spu_id,
category3_id,
tm_id,
create_time
from ods_sku_info
where dt='2020-06-15'
),
spu as
(
select
id,
spu_name
from ods_spu_info
where dt='2020-06-15'
),
c3 as
(
select
id,
name,
category2_id
from ods_base_category3
where dt='2020-06-15'
),
c2 as
(
select
id,
name,
category1_id
from ods_base_category2
where dt='2020-06-15'
),
c1 as
(
select
id,
name
from ods_base_category1
where dt='2020-06-15'
),
tm as
(
select
id,
tm_name
from ods_base_trademark
where dt='2020-06-15'
),
attr as
(
select
sku_id,
collect_set(named_struct('attr_id',attr_id,'value_id',value_id,'attr_name',attr_name,'value_name',value_name)) attrs
from ods_sku_attr_value
where dt='2020-06-15'
group by sku_id
),
sale_attr as
(
select
sku_id,
collect_set(named_struct('sale_attr_id',sale_attr_id,'sale_attr_value_id',sale_attr_value_id,'sale_attr_name',sale_attr_name,'sale_attr_value_name',sale_attr_value_name)) sale_attrs
from ods_sku_sale_attr_value
where dt='2020-06-15'
group by sku_id
)
insert overwrite table dim_sku_info partition(dt='2020-06-15')
select
sku.id,
sku.price,
sku.sku_name,
sku.sku_desc,
sku.weight,
sku.is_sale,
sku.spu_id,
spu.spu_name,
sku.category3_id,
c3.name,
c3.category2_id,
c2.name,
c2.category1_id,
c1.name,
sku.tm_id,
tm.tm_name,
attr.attrs,
sale_attr.sale_attrs,
sku.create_time
from sku
left join spu on sku.spu_id=spu.id
left join c3 on sku.category3_id=c3.id
left join c2 on c3.category2_id=c2.id
left join c1 on c2.category1_id=c1.id
left join tm on sku.tm_id=tm.id
left join attr on sku.id=attr.sku_id
left join sale_attr on sku.id=sale_attr.sku_id;
2.2 Coupon Dimension Table (Full Snapshot)
Sourced directly from ods_coupon_info.
1. Table creation statement
DROP TABLE IF EXISTS dim_coupon_info;
CREATE EXTERNAL TABLE dim_coupon_info(
`id` STRING COMMENT '購物券編號',
`coupon_name` STRING COMMENT '購物券名稱',
`coupon_type` STRING COMMENT '購物券類型 1 現金券 2 折扣券 3 滿減券 4 滿件打折券',
`condition_amount` DECIMAL(16,2) COMMENT '滿額數',
`condition_num` BIGINT COMMENT '滿件數',
`activity_id` STRING COMMENT '活動編號',
`benefit_amount` DECIMAL(16,2) COMMENT '減金額',
`benefit_discount` DECIMAL(16,2) COMMENT '折扣',
`create_time` STRING COMMENT '創建時間',
`range_type` STRING COMMENT '范圍類型 1、商品 2、品類 3、品牌',
`limit_num` BIGINT COMMENT '最多領取次數',
`taken_count` BIGINT COMMENT '已領取次數',
`start_time` STRING COMMENT '可以領取的開始日期',
`end_time` STRING COMMENT '可以領取的結束日期',
`operate_time` STRING COMMENT '修改時間',
`expire_time` STRING COMMENT '過期時間'
) COMMENT '優惠券維度表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_coupon_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
The grain is one coupon campaign.
2. Partition plan

3. Data loading

1) First-day load
insert overwrite table dim_coupon_info partition(dt='2020-06-14')
select
id,
coupon_name,
coupon_type,
condition_amount,
condition_num,
activity_id,
benefit_amount,
benefit_discount,
create_time,
range_type,
limit_num,
taken_count,
start_time,
end_time,
operate_time,
expire_time
from ods_coupon_info
where dt='2020-06-14';
2) Daily load
insert overwrite table dim_coupon_info partition(dt='2020-06-15')
select
id,
coupon_name,
coupon_type,
condition_amount,
condition_num,
activity_id,
benefit_amount,
benefit_discount,
create_time,
range_type,
limit_num,
taken_count,
start_time,
end_time,
operate_time,
expire_time
from ods_coupon_info
where dt='2020-06-15';
2.3 Activity Dimension Table (Full Snapshot)
Source tables:
ods_activity_rule activity rule table
ods_activity_info activity info table
1. Table creation statement
DROP TABLE IF EXISTS dim_activity_rule_info;
CREATE EXTERNAL TABLE dim_activity_rule_info(
`activity_rule_id` STRING COMMENT '活動規則ID',
`activity_id` STRING COMMENT '活動ID',
`activity_name` STRING COMMENT '活動名稱',
`activity_type` STRING COMMENT '活動類型',
`start_time` STRING COMMENT '開始時間',
`end_time` STRING COMMENT '結束時間',
`create_time` STRING COMMENT '創建時間',
`condition_amount` DECIMAL(16,2) COMMENT '滿減金額',
`condition_num` BIGINT COMMENT '滿減件數',
`benefit_amount` DECIMAL(16,2) COMMENT '優惠金額',
`benefit_discount` DECIMAL(16,2) COMMENT '優惠折扣',
`benefit_level` STRING COMMENT '優惠級別'
) COMMENT '活動信息表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_activity_rule_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2. Partition plan

3. Data loading

1) First-day load
insert overwrite table dim_activity_rule_info partition(dt='2020-06-14')
select
ar.id,
ar.activity_id,
ai.activity_name,
ar.activity_type,
ai.start_time,
ai.end_time,
ai.create_time,
ar.condition_amount,
ar.condition_num,
ar.benefit_amount,
ar.benefit_discount,
ar.benefit_level
from
(
select
id,
activity_id,
activity_type,
condition_amount,
condition_num,
benefit_amount,
benefit_discount,
benefit_level
from ods_activity_rule
where dt='2020-06-14'
)ar
left join
(
select
id,
activity_name,
start_time,
end_time,
create_time
from ods_activity_info
where dt='2020-06-14'
)ai
on ar.activity_id=ai.id;
2) Daily load
insert overwrite table dim_activity_rule_info partition(dt='2020-06-15')
select
ar.id,
ar.activity_id,
ai.activity_name,
ar.activity_type,
ai.start_time,
ai.end_time,
ai.create_time,
ar.condition_amount,
ar.condition_num,
ar.benefit_amount,
ar.benefit_discount,
ar.benefit_level
from
(
select
id,
activity_id,
activity_type,
condition_amount,
condition_num,
benefit_amount,
benefit_discount,
benefit_level
from ods_activity_rule
where dt='2020-06-15'
)ar
left join
(
select
id,
activity_name,
start_time,
end_time,
create_time
from ods_activity_info
where dt='2020-06-15'
)ai
on ar.activity_id=ai.id;
2.4 Region Dimension Table (Special)
Source tables:
ods_base_province province table
ods_base_region region table
1. Table creation statement
DROP TABLE IF EXISTS dim_base_province;
CREATE EXTERNAL TABLE dim_base_province (
`id` STRING COMMENT 'id',
`province_name` STRING COMMENT '省市名稱',
`area_code` STRING COMMENT '地區編碼',
`iso_code` STRING COMMENT 'ISO-3166編碼,供可視化使用',
`iso_3166_2` STRING COMMENT 'ISO-3166-2編碼,供可視化使用',
`region_id` STRING COMMENT '地區id',
`region_name` STRING COMMENT '地區名稱'
) COMMENT '地區維度表'
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_base_province/'
TBLPROPERTIES ("parquet.compression"="lzo");
2. Data loading
Region dimension data is relatively stable and rarely changes, so it does not need to be loaded daily.

insert overwrite table dim_base_province
select
bp.id,
bp.name,
bp.area_code,
bp.iso_code,
bp.iso_3166_2,
bp.region_id,
br.region_name
from ods_base_province bp
join ods_base_region br on bp.region_id = br.id;
2.5 Date Dimension Table (Special)
1. Table creation statement
DROP TABLE IF EXISTS dim_date_info;
CREATE EXTERNAL TABLE dim_date_info(
`date_id` STRING COMMENT '日',
`week_id` STRING COMMENT '周ID',
`week_day` STRING COMMENT '周幾',
`day` STRING COMMENT '每月的第幾天',
`month` STRING COMMENT '第幾月',
`quarter` STRING COMMENT '第幾季度',
`year` STRING COMMENT '年',
`is_workday` STRING COMMENT '是否是工作日',
`holiday_id` STRING COMMENT '節假日'
) COMMENT '時間維度表'
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_date_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
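2. Data loading
The date dimension data normally does not come from the business system; it is written by hand. Because dates are entirely predictable, there is no need for a daily import, and a full year of data can usually be imported in one go.
1) Create a temporary table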
DROP TABLE IF EXISTS tmp_dim_date_info;
CREATE EXTERNAL TABLE tmp_dim_date_info (
`date_id` STRING COMMENT '日',
`week_id` STRING COMMENT '周ID',
`week_day` STRING COMMENT '周幾',
`day` STRING COMMENT '每月的第幾天',
`month` STRING COMMENT '第幾月',
`quarter` STRING COMMENT '第幾季度',
`year` STRING COMMENT '年',
`is_workday` STRING COMMENT '是否是工作日',
`holiday_id` STRING COMMENT '節假日'
) COMMENT '時間維度表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/tmp/tmp_dim_date_info/';
2) Upload the data file to the temporary table's path on HDFS, /warehouse/gmall/tmp/tmp_dim_date_info/
hdfs dfs -mkdir -p /warehouse/gmall/tmp/tmp_dim_date_info/
hdfs dfs -put date_info.txt /warehouse/gmall/tmp/tmp_dim_date_info/
3) Run the following statement to load the data into the date dimension table
insert overwrite table dim_date_info select * from tmp_dim_date_info;
4) Check that the data was loaded successfully
select * from dim_date_info;
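The contents of the date_info.txt file uploaded in step 2) are not shown here. As a rough sketch of how such a tab-separated file could be produced, the function below prints one row per day in the column order of tmp_dim_date_info. Everything about the values is an assumption: gen_date_rows is a hypothetical name, week_id is taken to be the ISO week number, Monday to Friday are treated as workdays, and holiday_id is left as \N (the default NULL marker in Hive text files) because holidays cannot be derived from the calendar alone.

```shell
#!/bin/bash
# Hypothetical generator for date_info.txt (tab-separated, one row per day).
# Assumptions: GNU date; week_id is the ISO week number; Monday-Friday count
# as workdays; holiday_id is unknown and written as \N (Hive's text NULL).
gen_date_rows() {
    local start=$1 days=$2
    local i d week wday day month year quarter workday
    for ((i = 0; i < days; i++)); do
        read -r d week wday day month year \
            <<< "$(date -d "$start +$i day" '+%F %V %u %d %m %Y')"
        quarter=$(( (10#$month - 1) / 3 + 1 ))   # 10# avoids octal parsing of 08/09
        if (( wday <= 5 )); then workday=1; else workday=0; fi
        printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t\\N\n' \
            "$d" "$week" "$wday" "$day" "$month" "$quarter" "$year" "$workday"
    done
}

# Example: print the first three days of 2020.
gen_date_rows 2020-01-01 3
```

Redirecting the output of gen_date_rows 2020-01-01 366 to date_info.txt would cover the whole of 2020; the is_workday and holiday_id columns would still need to be corrected by hand for public holidays.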
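2.6 User Dimension Table (Zipper Table)
2.6.1 Zipper table overview
1) What is a zipper table

2) Why build a zipper table

3) How to use a zipper table

4) How a zipper table is built

2.6.2 Building the zipper table
1. Table creation statement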
DROP TABLE IF EXISTS dim_user_info;
CREATE EXTERNAL TABLE dim_user_info(
`id` STRING COMMENT '用戶id',
`login_name` STRING COMMENT '用戶名稱',
`nick_name` STRING COMMENT '用戶昵稱',
`name` STRING COMMENT '用戶姓名',
`phone_num` STRING COMMENT '手機號碼',
`email` STRING COMMENT '郵箱',
`user_level` STRING COMMENT '用戶等級',
`birthday` STRING COMMENT '生日',
`gender` STRING COMMENT '性別',
`create_time` STRING COMMENT '創建時間',
`operate_time` STRING COMMENT '操作時間',
`start_date` STRING COMMENT '開始日期',
`end_date` STRING COMMENT '結束日期'
) COMMENT '用戶表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dim/dim_user_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
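2. Partition plan

3. Data loading

1) First-day load
The first-day load initializes the zipper table: all historical users up to the initialization date are imported into the zipper table in one pass. The first partition of ods_user_info, 2020-06-14, already contains all historical users, so that partition's data is processed and then written to the 9999-99-99 partition of the zipper table.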
insert overwrite table dim_user_info partition(dt='9999-99-99')
select
id,
login_name,
nick_name,
md5(name),
md5(phone_num),
md5(email),
user_level,
birthday,
gender,
create_time,
operate_time,
'2020-06-14',
'9999-99-99'
from ods_user_info
where dt='2020-06-14';
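2) Daily load
(1) Approach

Rows that were modified become expired: their end date is set to the previous day and they are written to the expired partition. Dynamic partitioning is used so that one statement writes both the 9999-99-99 partition and the 2020-06-14 partition. Because the insert uses only a dynamic partition column, hive.exec.dynamic.partition.mode must be set to nonstrict beforehand, as the daily load script in section 2.8 does.
(2) SQL
-- full join the latest full data in dim as of the previous day with the rows added or changed in ods today, producing one wide table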
with
tmp as
(
select
old.id old_id,
old.login_name old_login_name,
old.nick_name old_nick_name,
old.name old_name,
old.phone_num old_phone_num,
old.email old_email,
old.user_level old_user_level,
old.birthday old_birthday,
old.gender old_gender,
old.create_time old_create_time,
old.operate_time old_operate_time,
old.start_date old_start_date,
old.end_date old_end_date,
new.id new_id,
new.login_name new_login_name,
new.nick_name new_nick_name,
new.name new_name,
new.phone_num new_phone_num,
new.email new_email,
new.user_level new_user_level,
new.birthday new_birthday,
new.gender new_gender,
new.create_time new_create_time,
new.operate_time new_operate_time,
new.start_date new_start_date,
new.end_date new_end_date
from
( -- latest full data in dim as of the previous day
select
id,
login_name,
nick_name,
name,
phone_num,
email,
user_level,
birthday,
gender,
create_time,
operate_time,
start_date,
end_date
from dim_user_info
where dt='9999-99-99'
)old
full outer join
(
-- rows added or changed in ods today
select
id,
login_name,
nick_name,
md5(name) name,
md5(phone_num) phone_num,
md5(email) email,
user_level,
birthday,
gender,
create_time,
operate_time,
'2020-06-15' start_date,
'9999-99-99' end_date
from ods_user_info
where dt='2020-06-15'
)new
on old.id=new.id
)
insert overwrite table dim_user_info partition(dt)
-- latest full data as of today
select
nvl(new_id,old_id),
nvl(new_login_name,old_login_name),
nvl(new_nick_name,old_nick_name),
nvl(new_name,old_name),
nvl(new_phone_num,old_phone_num),
nvl(new_email,old_email),
nvl(new_user_level,old_user_level),
nvl(new_birthday,old_birthday),
nvl(new_gender,old_gender),
nvl(new_create_time,old_create_time),
nvl(new_operate_time,old_operate_time),
nvl(new_start_date,old_start_date),
nvl(new_end_date,old_end_date),
nvl(new_end_date,old_end_date) dt
from tmp
union all
-- expired rows
select
old_id,
old_login_name,
old_nick_name,
old_name,
old_phone_num,
old_email,
old_user_level,
old_birthday,
old_gender,
old_create_time,
old_operate_time,
old_start_date,
cast(date_add('2020-06-15',-1) as string),
cast(date_add('2020-06-15',-1) as string) dt
from tmp
where new_id is not null and old_id is not null;
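NVL() substitutes a replacement for a NULL value. For example,
in NVL(string1, replace_with):
when the first argument (string1) is NULL, the second argument (replace_with) is returned;
when the first argument (string1) is not NULL, the first argument (string1) is returned.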
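To make the daily load logic above easier to follow, the sketch below simulates it on two small tab-separated files, outside Hive. It is only an illustration under stated assumptions: zipper_merge is a hypothetical name, the user record is reduced to id and user_level, and GNU date is assumed for the date arithmetic. The first awk rule plays the role of the new side of the full outer join, the next two rules correspond to the expired rows and the unchanged rows, and the END block emits the new open versions.

```shell
#!/bin/bash
# Hypothetical simulation of the daily zipper-table merge on two TSV files.
# old_file: id, user_level, start_date, end_date  (rows of the 9999-99-99 partition)
# new_file: id, user_level                        (users added or changed on do_date)
# Assumption: GNU date is available for the date arithmetic.
zipper_merge() {
    local old_file=$1 new_file=$2 do_date=$3
    local prev_date
    prev_date=$(date -d "$do_date -1 day" +%F)
    awk -F'\t' -v OFS='\t' -v today="$do_date" -v prev="$prev_date" '
        FILENAME == ARGV[1] { level[$1] = $2; next }    # rows added or changed today
        ($1 in level) { print $1, $2, $3, prev; next }  # changed: close the old version
        { print $1, $2, $3, $4 }                        # unchanged: still open
        END { for (id in level) print id, level[id], today, "9999-99-99" }
    ' "$new_file" "$old_file" | LC_ALL=C sort
}

# Example: user 2 changes level and user 3 is new on 2020-06-15.
old=$(mktemp); new=$(mktemp)
printf '1\tL1\t2020-06-14\t9999-99-99\n2\tL1\t2020-06-14\t9999-99-99\n' > "$old"
printf '2\tL2\n3\tL1\n' > "$new"
zipper_merge "$old" "$new" 2020-06-15
rm -f "$old" "$new"
```

In the output, rows whose last column is 9999-99-99 belong to the open partition, and the row for user 2 ending on 2020-06-14 is the expired version that the SQL writes to the 2020-06-14 partition.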
2.7 DIM Layer First-Day Load Script
1) Write the script
(1) Create the script ods_to_dim_db_init.sh in /home/atguigu/bin
[atguigu@hadoop102 bin]$ vim ods_to_dim_db_init.sh
Add the following content to the script
#!/bin/bash
APP=gmall
if [ -n "$2" ] ;then
do_date=$2
else
echo "Please pass in a date argument"
exit 1
fi
dim_user_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_user_info partition(dt='9999-99-99')
select
id,
login_name,
nick_name,
md5(name),
md5(phone_num),
md5(email),
user_level,
birthday,
gender,
create_time,
operate_time,
'$do_date',
'9999-99-99'
from ${APP}.ods_user_info
where dt='$do_date';
"
dim_sku_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
with
sku as
(
select
id,
price,
sku_name,
sku_desc,
weight,
is_sale,
spu_id,
category3_id,
tm_id,
create_time
from ${APP}.ods_sku_info
where dt='$do_date'
),
spu as
(
select
id,
spu_name
from ${APP}.ods_spu_info
where dt='$do_date'
),
c3 as
(
select
id,
name,
category2_id
from ${APP}.ods_base_category3
where dt='$do_date'
),
c2 as
(
select
id,
name,
category1_id
from ${APP}.ods_base_category2
where dt='$do_date'
),
c1 as
(
select
id,
name
from ${APP}.ods_base_category1
where dt='$do_date'
),
tm as
(
select
id,
tm_name
from ${APP}.ods_base_trademark
where dt='$do_date'
),
attr as
(
select
sku_id,
collect_set(named_struct('attr_id',attr_id,'value_id',value_id,'attr_name',attr_name,'value_name',value_name)) attrs
from ${APP}.ods_sku_attr_value
where dt='$do_date'
group by sku_id
),
sale_attr as
(
select
sku_id,
collect_set(named_struct('sale_attr_id',sale_attr_id,'sale_attr_value_id',sale_attr_value_id,'sale_attr_name',sale_attr_name,'sale_attr_value_name',sale_attr_value_name)) sale_attrs
from ${APP}.ods_sku_sale_attr_value
where dt='$do_date'
group by sku_id
)
insert overwrite table ${APP}.dim_sku_info partition(dt='$do_date')
select
sku.id,
sku.price,
sku.sku_name,
sku.sku_desc,
sku.weight,
sku.is_sale,
sku.spu_id,
spu.spu_name,
sku.category3_id,
c3.name,
c3.category2_id,
c2.name,
c2.category1_id,
c1.name,
sku.tm_id,
tm.tm_name,
attr.attrs,
sale_attr.sale_attrs,
sku.create_time
from sku
left join spu on sku.spu_id=spu.id
left join c3 on sku.category3_id=c3.id
left join c2 on c3.category2_id=c2.id
left join c1 on c2.category1_id=c1.id
left join tm on sku.tm_id=tm.id
left join attr on sku.id=attr.sku_id
left join sale_attr on sku.id=sale_attr.sku_id;
"
dim_base_province="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_base_province
select
bp.id,
bp.name,
bp.area_code,
bp.iso_code,
bp.iso_3166_2,
bp.region_id,
br.region_name
from ${APP}.ods_base_province bp
join ${APP}.ods_base_region br on bp.region_id = br.id;
"
dim_coupon_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_coupon_info partition(dt='$do_date')
select
id,
coupon_name,
coupon_type,
condition_amount,
condition_num,
activity_id,
benefit_amount,
benefit_discount,
create_time,
range_type,
limit_num,
taken_count,
start_time,
end_time,
operate_time,
expire_time
from ${APP}.ods_coupon_info
where dt='$do_date';
"
dim_activity_rule_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_activity_rule_info partition(dt='$do_date')
select
ar.id,
ar.activity_id,
ai.activity_name,
ar.activity_type,
ai.start_time,
ai.end_time,
ai.create_time,
ar.condition_amount,
ar.condition_num,
ar.benefit_amount,
ar.benefit_discount,
ar.benefit_level
from
(
select
id,
activity_id,
activity_type,
condition_amount,
condition_num,
benefit_amount,
benefit_discount,
benefit_level
from ${APP}.ods_activity_rule
where dt='$do_date'
)ar
left join
(
select
id,
activity_name,
start_time,
end_time,
create_time
from ${APP}.ods_activity_info
where dt='$do_date'
)ai
on ar.activity_id=ai.id;
"
case $1 in
"dim_user_info"){
hive -e "$dim_user_info"
};;
"dim_sku_info"){
hive -e "$dim_sku_info"
};;
"dim_base_province"){
hive -e "$dim_base_province"
};;
"dim_coupon_info"){
hive -e "$dim_coupon_info"
};;
"dim_activity_rule_info"){
hive -e "$dim_activity_rule_info"
};;
"all"){
hive -e "$dim_user_info$dim_sku_info$dim_coupon_info$dim_activity_rule_info$dim_base_province"
};;
esac
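(2) Make the script executable
[atguigu@hadoop102 bin]$ chmod +x ods_to_dim_db_init.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ ods_to_dim_db_init.sh all 2020-06-14
Note: this script does not load the date dimension table. The date dimension table must be loaded manually; see section 2.5.
(2) Check that the data was loaded successfully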
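The first-day script only checks that a second argument is present, so a mistyped date would flow straight into the partition name. A minimal sketch of a stricter check follows, assuming GNU date; is_valid_date and check_date_arg are hypothetical helpers, not part of the original script.

```shell
#!/bin/bash
# Hypothetical check that a date argument is a real calendar date in
# YYYY-MM-DD form. Assumption: GNU date is available.
is_valid_date() {
    local d=$1
    [[ $d =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] || return 1
    # GNU date rejects impossible dates such as 2020-02-30.
    [ "$(date -d "$d" +%F 2>/dev/null)" = "$d" ]
}

# Example: how a script could reject a bad date argument.
check_date_arg() {
    if ! is_valid_date "$1"; then
        echo "invalid date argument: '$1' (expected YYYY-MM-DD)" >&2
        return 1
    fi
}
```

In the script, a line such as check_date_arg "$2" || exit 1 placed before the SQL strings are built would stop the run before any partition is written.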
2.8 DIM Layer Daily Load Script
1) Write the script
(1) Create the script ods_to_dim_db.sh in /home/atguigu/bin
[atguigu@hadoop102 bin]$ vim ods_to_dim_db.sh
Add the following content to the script
#!/bin/bash
APP=gmall
# Use the date passed as the second argument if present; otherwise default to yesterday
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
dim_user_info="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
with
tmp as
(
select
old.id old_id,
old.login_name old_login_name,
old.nick_name old_nick_name,
old.name old_name,
old.phone_num old_phone_num,
old.email old_email,
old.user_level old_user_level,
old.birthday old_birthday,
old.gender old_gender,
old.create_time old_create_time,
old.operate_time old_operate_time,
old.start_date old_start_date,
old.end_date old_end_date,
new.id new_id,
new.login_name new_login_name,
new.nick_name new_nick_name,
new.name new_name,
new.phone_num new_phone_num,
new.email new_email,
new.user_level new_user_level,
new.birthday new_birthday,
new.gender new_gender,
new.create_time new_create_time,
new.operate_time new_operate_time,
new.start_date new_start_date,
new.end_date new_end_date
from
(
select
id,
login_name,
nick_name,
name,
phone_num,
email,
user_level,
birthday,
gender,
create_time,
operate_time,
start_date,
end_date
from ${APP}.dim_user_info
where dt='9999-99-99'
and start_date<'$do_date'
)old
full outer join
(
select
id,
login_name,
nick_name,
md5(name) name,
md5(phone_num) phone_num,
md5(email) email,
user_level,
birthday,
gender,
create_time,
operate_time,
'$do_date' start_date,
'9999-99-99' end_date
from ${APP}.ods_user_info
where dt='$do_date'
)new
on old.id=new.id
)
insert overwrite table ${APP}.dim_user_info partition(dt)
select
nvl(new_id,old_id),
nvl(new_login_name,old_login_name),
nvl(new_nick_name,old_nick_name),
nvl(new_name,old_name),
nvl(new_phone_num,old_phone_num),
nvl(new_email,old_email),
nvl(new_user_level,old_user_level),
nvl(new_birthday,old_birthday),
nvl(new_gender,old_gender),
nvl(new_create_time,old_create_time),
nvl(new_operate_time,old_operate_time),
nvl(new_start_date,old_start_date),
nvl(new_end_date,old_end_date),
nvl(new_end_date,old_end_date) dt
from tmp
union all
select
old_id,
old_login_name,
old_nick_name,
old_name,
old_phone_num,
old_email,
old_user_level,
old_birthday,
old_gender,
old_create_time,
old_operate_time,
old_start_date,
cast(date_add('$do_date',-1) as string),
cast(date_add('$do_date',-1) as string) dt
from tmp
where new_id is not null and old_id is not null;
"
dim_sku_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
with
sku as
(
select
id,
price,
sku_name,
sku_desc,
weight,
is_sale,
spu_id,
category3_id,
tm_id,
create_time
from ${APP}.ods_sku_info
where dt='$do_date'
),
spu as
(
select
id,
spu_name
from ${APP}.ods_spu_info
where dt='$do_date'
),
c3 as
(
select
id,
name,
category2_id
from ${APP}.ods_base_category3
where dt='$do_date'
),
c2 as
(
select
id,
name,
category1_id
from ${APP}.ods_base_category2
where dt='$do_date'
),
c1 as
(
select
id,
name
from ${APP}.ods_base_category1
where dt='$do_date'
),
tm as
(
select
id,
tm_name
from ${APP}.ods_base_trademark
where dt='$do_date'
),
attr as
(
select
sku_id,
collect_set(named_struct('attr_id',attr_id,'value_id',value_id,'attr_name',attr_name,'value_name',value_name)) attrs
from ${APP}.ods_sku_attr_value
where dt='$do_date'
group by sku_id
),
sale_attr as
(
select
sku_id,
collect_set(named_struct('sale_attr_id',sale_attr_id,'sale_attr_value_id',sale_attr_value_id,'sale_attr_name',sale_attr_name,'sale_attr_value_name',sale_attr_value_name)) sale_attrs
from ${APP}.ods_sku_sale_attr_value
where dt='$do_date'
group by sku_id
)
insert overwrite table ${APP}.dim_sku_info partition(dt='$do_date')
select
sku.id,
sku.price,
sku.sku_name,
sku.sku_desc,
sku.weight,
sku.is_sale,
sku.spu_id,
spu.spu_name,
sku.category3_id,
c3.name,
c3.category2_id,
c2.name,
c2.category1_id,
c1.name,
sku.tm_id,
tm.tm_name,
attr.attrs,
sale_attr.sale_attrs,
sku.create_time
from sku
left join spu on sku.spu_id=spu.id
left join c3 on sku.category3_id=c3.id
left join c2 on c3.category2_id=c2.id
left join c1 on c2.category1_id=c1.id
left join tm on sku.tm_id=tm.id
left join attr on sku.id=attr.sku_id
left join sale_attr on sku.id=sale_attr.sku_id;
"
dim_base_province="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_base_province
select
bp.id,
bp.name,
bp.area_code,
bp.iso_code,
bp.iso_3166_2,
bp.region_id,
br.region_name
from ${APP}.ods_base_province bp
join ${APP}.ods_base_region br on bp.region_id = br.id;
"
dim_coupon_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_coupon_info partition(dt='$do_date')
select
id,
coupon_name,
coupon_type,
condition_amount,
condition_num,
activity_id,
benefit_amount,
benefit_discount,
create_time,
range_type,
limit_num,
taken_count,
start_time,
end_time,
operate_time,
expire_time
from ${APP}.ods_coupon_info
where dt='$do_date';
"
dim_activity_rule_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dim_activity_rule_info partition(dt='$do_date')
select
ar.id,
ar.activity_id,
ai.activity_name,
ar.activity_type,
ai.start_time,
ai.end_time,
ai.create_time,
ar.condition_amount,
ar.condition_num,
ar.benefit_amount,
ar.benefit_discount,
ar.benefit_level
from
(
select
id,
activity_id,
activity_type,
condition_amount,
condition_num,
benefit_amount,
benefit_discount,
benefit_level
from ${APP}.ods_activity_rule
where dt='$do_date'
)ar
left join
(
select
id,
activity_name,
start_time,
end_time,
create_time
from ${APP}.ods_activity_info
where dt='$do_date'
)ai
on ar.activity_id=ai.id;
"
case $1 in
"dim_user_info"){
hive -e "$dim_user_info"
};;
"dim_sku_info"){
hive -e "$dim_sku_info"
};;
"dim_base_province"){
hive -e "$dim_base_province"
};;
"dim_coupon_info"){
hive -e "$dim_coupon_info"
};;
"dim_activity_rule_info"){
hive -e "$dim_activity_rule_info"
};;
"all"){
hive -e "$dim_user_info$dim_sku_info$dim_coupon_info$dim_activity_rule_info"
};;
esac
(2) Make the script executable
[atguigu@hadoop102 bin]$ chmod +x ods_to_dim_db.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ ods_to_dim_db.sh all 2020-06-14
(2) Check that the data was loaded successfully
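Each daily run of the zipper table load reads the 9999-99-99 partition left by the previous run, so if several days have been missed they presumably need to be replayed one at a time in date order. A minimal sketch of such a backfill loop follows, assuming GNU date; date_range is a hypothetical helper, and the loop only prints the commands rather than running them.

```shell
#!/bin/bash
# Hypothetical helper: print every date from start to end (inclusive), in order.
# Assumption: GNU date; both arguments are valid YYYY-MM-DD dates.
date_range() {
    local cur=$1 end=$2
    # YYYY-MM-DD strings sort chronologically, so string comparison suffices.
    while [[ $cur < $end || $cur == "$end" ]]; do
        echo "$cur"
        cur=$(date -d "$cur +1 day" +%F)
    done
}

# Example: print (not run) the backfill commands for three days.
for d in $(date_range 2020-06-15 2020-06-17); do
    echo "ods_to_dim_db.sh all $d"
done
```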
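Chapter 3 Data Warehouse Construction: DWD Layer
1) Parse the user behavior data.
2) Remodel the business data using a dimensional model.
Data is stored in columnar Parquet format with LZO compression to reduce storage space.
Drawbacks of Parquet
- No row-level update or delete and no full ACID support in Hive; data is added by writing new files, as the insert overwrite statements in this chapter do
When to use Parquet
- Suited to tables with a very large number of columns that are never updated and are queried for only a subset of columns.
3.1 DWD Layer (User Behavior Logs)
3.1.1 Log parsing approach
1) Review of the log structure
(1) Page tracking log

(2) Startup log

2) Log parsing approach
The log data is in JSON format. Hive has a built-in function for parsing JSON strings, which extracts the value of each field in the string. The fields of the startup log table can be determined from the fields in the sample data.

3.1.2 Using the get_json_object function
1) Data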
[{"name":"大郎","sex":"男","age":"25"},{"name":"西門慶","sex":"男","age":"47"}]
2) Extract the first JSON object
hive (gmall)>
select get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西門慶","sex":"男","age":"47"}]','$[0]');
Result:
{"name":"大郎","sex":"男","age":"25"}
3) Extract the value of the age field of the first JSON object
hive (gmall)>
SELECT get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西門慶","sex":"男","age":"47"}]',"$[0].age");
Result: 25
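3.1.3 Startup log table
Parsing approach for startup logs: each row of the startup log table corresponds to one startup record, which should contain the common information and the startup information from the log. First keep only the log lines that contain a start field, then parse each field with get_json_object.

1) Table creation statement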
DROP TABLE IF EXISTS dwd_start_log;
CREATE EXTERNAL TABLE dwd_start_log(
`area_code` STRING COMMENT '地區編碼',
`brand` STRING COMMENT '手機品牌',
`channel` STRING COMMENT '渠道',
`is_new` STRING COMMENT '是否首次啟動',
`model` STRING COMMENT '手機型號',
`mid_id` STRING COMMENT '設備id',
`os` STRING COMMENT '操作系統',
`user_id` STRING COMMENT '會員id',
`version_code` STRING COMMENT 'app版本號',
`entry` STRING COMMENT 'icon手機圖標 notice 通知 install 安裝后啟動',
`loading_time` BIGINT COMMENT '啟動加載時間',
`open_ad_id` STRING COMMENT '廣告頁ID ',
`open_ad_ms` BIGINT COMMENT '廣告總共播放時間',
`open_ad_skip_ms` BIGINT COMMENT '用戶跳過廣告時點',
`ts` BIGINT COMMENT '時間'
) COMMENT '啟動日志表'
PARTITIONED BY (`dt` STRING) -- partition by date
STORED AS PARQUET -- columnar Parquet storage
LOCATION '/warehouse/gmall/dwd/dwd_start_log' -- storage location on HDFS
TBLPROPERTIES('parquet.compression'='lzo') -- LZO compression
;
2) Load the data

insert overwrite table dwd_start_log partition(dt='2020-06-14')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.start.entry'),
get_json_object(line,'$.start.loading_time'),
get_json_object(line,'$.start.open_ad_id'),
get_json_object(line,'$.start.open_ad_ms'),
get_json_object(line,'$.start.open_ad_skip_ms'),
get_json_object(line,'$.ts')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.start') is not null;
3) View the data
select * from dwd_start_log where dt='2020-06-14' limit 2;
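The select list above repeats one get_json_object call per field, and the same pattern returns for the other log tables. As a small sketch, the list could be generated from the JSON paths rather than typed out. json_select_list is a hypothetical helper and only prints text; it does not call Hive.

```shell
#!/bin/bash
# Hypothetical helper: turn JSON paths into a get_json_object select list.
# Each argument is a path relative to the root, e.g. common.ar or start.entry.
json_select_list() {
    local sep="" path
    for path in "$@"; do
        # \$ keeps the leading $ of the JSON path literal inside double quotes.
        printf "%sget_json_object(line,'\$.%s')" "$sep" "$path"
        sep=$',\n'
    done
    printf '\n'
}

# Example: a few columns of the startup log select list.
json_select_list common.ar common.ba start.entry ts
```

The escaping matters once such SQL is embedded in a double-quoted shell string, because the shell may otherwise try to expand text that begins with $.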

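3.1.4 Page log table
Parsing approach for page logs: each row of the page log table corresponds to one page view record, which should contain the common information and the page information from the log. First keep only the log lines that contain a page field, then parse each field with get_json_object.

1) Table creation statement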
DROP TABLE IF EXISTS dwd_page_log;
CREATE EXTERNAL TABLE dwd_page_log(
`area_code` STRING COMMENT '地區編碼',
`brand` STRING COMMENT '手機品牌',
`channel` STRING COMMENT '渠道',
`is_new` STRING COMMENT '是否首次啟動',
`model` STRING COMMENT '手機型號',
`mid_id` STRING COMMENT '設備id',
`os` STRING COMMENT '操作系統',
`user_id` STRING COMMENT '會員id',
`version_code` STRING COMMENT 'app版本號',
`during_time` BIGINT COMMENT '持續時間毫秒',
`page_item` STRING COMMENT '目標id ',
`page_item_type` STRING COMMENT '目標類型',
`last_page_id` STRING COMMENT '上頁類型',
`page_id` STRING COMMENT '頁面ID ',
`source_type` STRING COMMENT '來源類型',
`ts` bigint
) COMMENT '頁面日志表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_page_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Load the data
insert overwrite table dwd_page_log partition(dt='2020-06-14')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.ts')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.page') is not null;
3) View the data
select * from dwd_page_log where dt='2020-06-14' limit 2;

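3.1.5 Action log table
Parsing approach for action logs: each row of the action log table corresponds to one user action, which should contain the common information, the page information, and the action information. First keep only the log lines that contain an action field, then use a UDTF to "explode" the action array (similar in effect to the explode function), and finally parse each field with get_json_object.

1) Table creation statement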
DROP TABLE IF EXISTS dwd_action_log;
CREATE EXTERNAL TABLE dwd_action_log(
`area_code` STRING COMMENT '地區編碼',
`brand` STRING COMMENT '手機品牌',
`channel` STRING COMMENT '渠道',
`is_new` STRING COMMENT '是否首次啟動',
`model` STRING COMMENT '手機型號',
`mid_id` STRING COMMENT '設備id',
`os` STRING COMMENT '操作系統',
`user_id` STRING COMMENT '會員id',
`version_code` STRING COMMENT 'app版本號',
`during_time` BIGINT COMMENT '持續時間毫秒',
`page_item` STRING COMMENT '目標id ',
`page_item_type` STRING COMMENT '目標類型',
`last_page_id` STRING COMMENT '上頁類型',
`page_id` STRING COMMENT '頁面id ',
`source_type` STRING COMMENT '來源類型',
`action_id` STRING COMMENT '動作id',
`item` STRING COMMENT '目標id ',
`item_type` STRING COMMENT '目標類型',
`ts` BIGINT COMMENT '時間'
) COMMENT '動作日志表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_action_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Create the UDTF: design

3) Create the UDTF: code
(1) Create a Maven project: hivefunction
(2) Create the package: com.atguigu.hive.udtf
(3) Add the following dependency
<dependencies>
<!-- Add the Hive dependency -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>3.1.2</version>
</dependency>
</dependencies>
(4) Write the code
package com.atguigu.gmall.hive.udtf;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.json.JSONArray;

import java.util.ArrayList;
import java.util.List;

public class ExplodeJSONArray extends GenericUDTF {

    @Override
    public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {
        // 1 Validate the number of arguments
        if (argOIs.length != 1) {
            throw new UDFArgumentException("explode_json_array takes exactly one argument");
        }
        // 2 The argument must be a string
        // Check that the argument is a primitive type
        if (argOIs[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
            throw new UDFArgumentException("explode_json_array only accepts a primitive-type argument");
        }
        // Cast the argument's object inspector to a primitive object inspector
        PrimitiveObjectInspector argumentOI = (PrimitiveObjectInspector) argOIs[0];
        // Check that the argument is of type STRING
        if (argumentOI.getPrimitiveCategory() != PrimitiveObjectInspector.PrimitiveCategory.STRING) {
            throw new UDFArgumentException("explode_json_array only accepts a string argument");
        }
        // 3 Define the name and type of the output column
        List<String> fieldNames = new ArrayList<String>();
        List<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
        fieldNames.add("items");
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    }

    @Override
    public void process(Object[] objects) throws HiveException {
        // 1 Emit nothing for a NULL argument instead of failing with a NullPointerException
        if (objects[0] == null) {
            return;
        }
        // 2 Read the incoming value
        String jsonArray = objects[0].toString();
        // 3 Parse the string into a JSON array
        JSONArray actions = new JSONArray(jsonArray);
        // 4 Each iteration takes one JSON element out of the array and forwards it as a row
        for (int i = 0; i < actions.length(); i++) {
            String[] result = new String[1];
            result[0] = actions.getString(i);
            forward(result);
        }
    }

    @Override
    public void close() throws HiveException {
        // Nothing to release
    }
}
4) Create the function
(1) Package the project
(2) Upload hivefunction-1.0-SNAPSHOT.jar to /opt/module on hadoop102, then upload the jar to the /user/hive/jars path on HDFS
[atguigu@hadoop102 module]$ hadoop fs -mkdir -p /user/hive/jars
[atguigu@hadoop102 module]$ hadoop fs -put hivefunction-1.0-SNAPSHOT.jar /user/hive/jars
(3) Create a permanent function and bind it to the compiled Java class
create function explode_json_array as 'com.atguigu.gmall.hive.udtf.ExplodeJSONArray' using jar 'hdfs://hadoop102:8020/user/hive/jars/hivefunction-1.0-SNAPSHOT.jar';
(4) Note: what if the custom function is changed and the jar is rebuilt? Replace the old jar at the HDFS path, then restart the Hive client.
5) Load the data
insert overwrite table dwd_action_log partition(dt='2020-06-14')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(action,'$.action_id'),
get_json_object(action,'$.item'),
get_json_object(action,'$.item_type'),
get_json_object(action,'$.ts')
from ods_log lateral view explode_json_array(get_json_object(line,'$.actions')) tmp as action
where dt='2020-06-14'
and get_json_object(line,'$.actions') is not null;
6) Check the data
select * from dwd_action_log where dt='2020-06-14' limit 2;

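3.1.6 Display log table
How the display log is parsed: each row of the display log table corresponds to one display record, and a display record should carry the common information, the page information, and the display information. First filter out the log lines that contain the displays field, then use a UDTF to "explode" the displays array (similar to what the explode function does), and finally use get_json_object to parse each field.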
1) Table creation statement
DROP TABLE IF EXISTS dwd_display_log;
CREATE EXTERNAL TABLE dwd_display_log(
`area_code` STRING COMMENT 'Area code',
`brand` STRING COMMENT 'Phone brand',
`channel` STRING COMMENT 'Channel',
`is_new` STRING COMMENT 'Whether this is the first launch',
`model` STRING COMMENT 'Phone model',
`mid_id` STRING COMMENT 'Device id',
`os` STRING COMMENT 'Operating system',
`user_id` STRING COMMENT 'Member id',
`version_code` STRING COMMENT 'App version number',
`during_time` BIGINT COMMENT 'Duration in milliseconds',
`page_item` STRING COMMENT 'Target id',
`page_item_type` STRING COMMENT 'Target type',
`last_page_id` STRING COMMENT 'Previous page id',
`page_id` STRING COMMENT 'Page id',
`source_type` STRING COMMENT 'Source type',
`ts` BIGINT COMMENT 'Timestamp',
`display_type` STRING COMMENT 'Display type',
`item` STRING COMMENT 'Displayed object id',
`item_type` STRING COMMENT 'Displayed object type',
`order` BIGINT COMMENT 'Display order',
`pos_id` BIGINT COMMENT 'Display position'
) COMMENT 'Display log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_display_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Load the data
insert overwrite table dwd_display_log partition(dt='2020-06-14')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.ts'),
get_json_object(display,'$.display_type'),
get_json_object(display,'$.item'),
get_json_object(display,'$.item_type'),
get_json_object(display,'$.order'),
get_json_object(display,'$.pos_id')
from ods_log lateral view explode_json_array(get_json_object(line,'$.displays')) tmp as display
where dt='2020-06-14'
and get_json_object(line,'$.displays') is not null;
3) Check the data
select * from dwd_display_log where dt='2020-06-14' limit 2;

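3.1.7 Error log table
How the error log is parsed: each row of the error log table corresponds to one error record. To make errors easier to locate, an error record should carry the matching common information, page information, display information, action information, start information, and error information. First filter out the log lines that contain the err field, then use get_json_object to parse all fields.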

1) Table creation statement
DROP TABLE IF EXISTS dwd_error_log;
CREATE EXTERNAL TABLE dwd_error_log(
`area_code` STRING COMMENT 'Area code',
`brand` STRING COMMENT 'Phone brand',
`channel` STRING COMMENT 'Channel',
`is_new` STRING COMMENT 'Whether this is the first launch',
`model` STRING COMMENT 'Phone model',
`mid_id` STRING COMMENT 'Device id',
`os` STRING COMMENT 'Operating system',
`user_id` STRING COMMENT 'Member id',
`version_code` STRING COMMENT 'App version number',
`page_item` STRING COMMENT 'Target id',
`page_item_type` STRING COMMENT 'Target type',
`last_page_id` STRING COMMENT 'Previous page id',
`page_id` STRING COMMENT 'Page id',
`source_type` STRING COMMENT 'Source type',
`entry` STRING COMMENT 'Entry: icon (app icon), notice (notification), install (launch after install)',
`loading_time` STRING COMMENT 'Launch loading time',
`open_ad_id` STRING COMMENT 'Ad page id',
`open_ad_ms` STRING COMMENT 'Total ad play time',
`open_ad_skip_ms` STRING COMMENT 'Time at which the user skipped the ad',
`actions` STRING COMMENT 'Actions',
`displays` STRING COMMENT 'Displays',
`ts` STRING COMMENT 'Timestamp',
`error_code` STRING COMMENT 'Error code',
`msg` STRING COMMENT 'Error message'
) COMMENT 'Error log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_error_log'
TBLPROPERTIES('parquet.compression'='lzo');
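Note: the actions array and the displays array are not processed here. To analyze how an error relates to a single action or display, first use the explode_json_array function to "explode" the array, then use get_json_object to extract the specific fields.
2) Load the data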
insert overwrite table dwd_error_log partition(dt='2020-06-14')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.start.entry'),
get_json_object(line,'$.start.loading_time'),
get_json_object(line,'$.start.open_ad_id'),
get_json_object(line,'$.start.open_ad_ms'),
get_json_object(line,'$.start.open_ad_skip_ms'),
get_json_object(line,'$.actions'),
get_json_object(line,'$.displays'),
get_json_object(line,'$.ts'),
get_json_object(line,'$.err.error_code'),
get_json_object(line,'$.err.msg')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.err') is not null;
3) Check the data
select * from dwd_error_log where dt='2020-06-14' limit 2;

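3.1.8 DWD-layer user behavior data load script
1) Write the script
(1) Create the script in the /home/atguigu/bin directory on hadoop102
[atguigu@hadoop102 bin]$ vim ods_to_dwd_log.sh
Write the following content into the script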
#!/bin/bash
APP=gmall
# Use the date passed as the second argument if there is one; otherwise use the day before today
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
dwd_start_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_start_log partition(dt='$do_date')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.start.entry'),
get_json_object(line,'$.start.loading_time'),
get_json_object(line,'$.start.open_ad_id'),
get_json_object(line,'$.start.open_ad_ms'),
get_json_object(line,'$.start.open_ad_skip_ms'),
get_json_object(line,'$.ts')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.start') is not null;"
dwd_page_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_page_log partition(dt='$do_date')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.ts')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.page') is not null;"
dwd_action_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_action_log partition(dt='$do_date')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(action,'$.action_id'),
get_json_object(action,'$.item'),
get_json_object(action,'$.item_type'),
get_json_object(action,'$.ts')
from ${APP}.ods_log lateral view ${APP}.explode_json_array(get_json_object(line,'$.actions')) tmp as action
where dt='$do_date'
and get_json_object(line,'$.actions') is not null;"
dwd_display_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_display_log partition(dt='$do_date')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.during_time'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.ts'),
get_json_object(display,'$.display_type'),
get_json_object(display,'$.item'),
get_json_object(display,'$.item_type'),
get_json_object(display,'$.order'),
get_json_object(display,'$.pos_id')
from ${APP}.ods_log lateral view ${APP}.explode_json_array(get_json_object(line,'$.displays')) tmp as display
where dt='$do_date'
and get_json_object(line,'$.displays') is not null;"
dwd_error_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_error_log partition(dt='$do_date')
select
get_json_object(line,'$.common.ar'),
get_json_object(line,'$.common.ba'),
get_json_object(line,'$.common.ch'),
get_json_object(line,'$.common.is_new'),
get_json_object(line,'$.common.md'),
get_json_object(line,'$.common.mid'),
get_json_object(line,'$.common.os'),
get_json_object(line,'$.common.uid'),
get_json_object(line,'$.common.vc'),
get_json_object(line,'$.page.item'),
get_json_object(line,'$.page.item_type'),
get_json_object(line,'$.page.last_page_id'),
get_json_object(line,'$.page.page_id'),
get_json_object(line,'$.page.source_type'),
get_json_object(line,'$.start.entry'),
get_json_object(line,'$.start.loading_time'),
get_json_object(line,'$.start.open_ad_id'),
get_json_object(line,'$.start.open_ad_ms'),
get_json_object(line,'$.start.open_ad_skip_ms'),
get_json_object(line,'$.actions'),
get_json_object(line,'$.displays'),
get_json_object(line,'$.ts'),
get_json_object(line,'$.err.error_code'),
get_json_object(line,'$.err.msg')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.err') is not null;"
case $1 in
dwd_start_log )
hive -e "$dwd_start_log"
;;
dwd_page_log )
hive -e "$dwd_page_log"
;;
dwd_action_log )
hive -e "$dwd_action_log"
;;
dwd_display_log )
hive -e "$dwd_display_log"
;;
dwd_error_log )
hive -e "$dwd_error_log"
;;
all )
hive -e "$dwd_start_log$dwd_page_log$dwd_action_log$dwd_display_log$dwd_error_log"
;;
esac
(2) Grant execute permission to the script
[atguigu@hadoop102 bin]$ chmod +x ods_to_dwd_log.sh
2) Use the script
(1) Run the script
[atguigu@hadoop102 module]$ ods_to_dwd_log.sh all 2020-06-14
(2) Query the import result
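One way to check the import result is to count the rows that landed in each DWD log table for the loaded date. The following is a minimal sketch that only builds and prints the statements. The helper name build_check_sql and the column aliases are assumptions made for illustration; the table names are the five tables loaded by ods_to_dwd_log.sh.

```shell
#!/bin/bash
# Sketch: build one row-count query per DWD log table for a given date.
# build_check_sql is a hypothetical helper; the table names come from this chapter.
APP=gmall
DWD_LOG_TABLES="dwd_start_log dwd_page_log dwd_action_log dwd_display_log dwd_error_log"

build_check_sql() {
    local do_date=$1 sql="" t
    for t in $DWD_LOG_TABLES; do
        # One statement per table, each on its own line
        sql+="select '$t' as table_name, count(*) as row_cnt from ${APP}.$t where dt='$do_date';"$'\n'
    done
    printf '%s' "$sql"
}

# Print the statements for the date that was just loaded
build_check_sql 2020-06-14
```

To actually run the checks, the printed text can be passed to the Hive client in the same way the load script does, for example hive -e "$(build_check_sql 2020-06-14)". A count of zero for a table usually means the matching ODS partition was empty or the date argument was wrong.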
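3.2 DWD layer (business data)
The fact tables in the DWD layer are built differently depending on the characteristics of each table.
3.2.1 Comment fact table (transactional fact table)
The comment fact table relates only to the time, user, and product dimensions. The product comment table in the ODS layer already carries all the associated fields, so no join with other tables is needed.
Source table:
ods_comment_info comment table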
1) Table creation statement
DROP TABLE IF EXISTS dwd_comment_info;
CREATE EXTERNAL TABLE dwd_comment_info(
`id` STRING COMMENT 'Id',
`user_id` STRING COMMENT 'User id',
`sku_id` STRING COMMENT 'Product sku',
`spu_id` STRING COMMENT 'Product spu',
`order_id` STRING COMMENT 'Order id',
`appraise` STRING COMMENT 'Rating (good, neutral, bad, default)',
`create_time` STRING COMMENT 'Comment time'
) COMMENT 'Comment fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_comment_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_comment_info partition (dt)
select
id,
user_id,
sku_id,
spu_id,
order_id,
appraise,
create_time,
date_format(create_time,'yyyy-MM-dd')
from ods_comment_info
where dt='2020-06-14';
(2) Daily load
insert overwrite table dwd_comment_info partition(dt='2020-06-15')
select
id,
user_id,
sku_id,
spu_id,
order_id,
appraise,
create_time
from ods_comment_info where dt='2020-06-15';
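The two load statements above differ only in how the target partition is chosen. The sketch below generates both variants from one helper, in the style of the load scripts in this chapter. It assumes that the first ODS partition holds the full history while later partitions hold only that day's new rows, and that the session may need hive.exec.dynamic.partition.mode set to nonstrict before a fully dynamic partition insert. The helper name build_comment_sql and the mode names first and daily are hypothetical.

```shell
#!/bin/bash
# Sketch: choose between the first-day and the daily load of dwd_comment_info.
# build_comment_sql and the first/daily mode names are hypothetical.
APP=gmall
COLS="id, user_id, sku_id, spu_id, order_id, appraise, create_time"

build_comment_sql() {
    local mode=$1 do_date=$2
    case $mode in
    first)
        # Dynamic partitioning: every row goes to the partition of its own create_time
        echo "set hive.exec.dynamic.partition.mode=nonstrict;"
        echo "insert overwrite table ${APP}.dwd_comment_info partition(dt)"
        echo "select $COLS, date_format(create_time,'yyyy-MM-dd')"
        ;;
    daily)
        # Static partitioning: all rows of the day go to the partition of that day
        echo "insert overwrite table ${APP}.dwd_comment_info partition(dt='$do_date')"
        echo "select $COLS"
        ;;
    *)
        echo "usage: build_comment_sql first|daily yyyy-MM-dd" >&2
        return 1
        ;;
    esac
    echo "from ${APP}.ods_comment_info where dt='$do_date';"
}

# Print the daily statement for 2020-06-15
build_comment_sql daily 2020-06-15
```

Keeping both variants in one function makes the difference explicit: the first-day load appends an extra expression as the last select column to feed the dynamic dt partition, while the daily load names the partition directly.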
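3.2.2 Order detail fact table (transactional fact table)
Source tables:
ods_order_detail order detail table
ods_order_info order table
ods_order_detail_activity order detail and activity association table
ods_order_detail_coupon order detail and coupon association table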
1) Table creation statement
DROP TABLE IF EXISTS dwd_order_detail;
CREATE EXTERNAL TABLE dwd_order_detail (
`id` STRING COMMENT 'Order detail id',
`order_id` STRING COMMENT 'Order id',
`user_id` STRING COMMENT 'User id',
`sku_id` STRING COMMENT 'Sku product id',
`province_id` STRING COMMENT 'Province id',
`activity_id` STRING COMMENT 'Activity id',
`activity_rule_id` STRING COMMENT 'Activity rule id',
`coupon_id` STRING COMMENT 'Coupon id',
`create_time` STRING COMMENT 'Creation time',
`source_type` STRING COMMENT 'Source type',
`source_id` STRING COMMENT 'Source id',
`sku_num` BIGINT COMMENT 'Product quantity',
`original_amount` DECIMAL(16,2) COMMENT 'Original price',
`split_activity_amount` DECIMAL(16,2) COMMENT 'Apportioned activity discount',
`split_coupon_amount` DECIMAL(16,2) COMMENT 'Apportioned coupon discount',
`split_final_amount` DECIMAL(16,2) COMMENT 'Apportioned final price'
) COMMENT 'Order detail fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_order_detail/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_order_detail partition(dt)
select
od.id,
od.order_id,
oi.user_id,
od.sku_id,
oi.province_id,
oda.activity_id,
oda.activity_rule_id,
odc.coupon_id,
od.create_time,
od.source_type,
od.source_id,
od.sku_num,
od.order_price*od.sku_num,
od.split_activity_amount,
od.split_coupon_amount,
od.split_final_amount,
date_format(od.create_time,'yyyy-MM-dd')
from
(
select
*
from ods_order_detail
where dt='2020-06-14'
)od
left join
(
select
id,
user_id,
province_id
from ods_order_info
where dt='2020-06-14'
)oi
on od.order_id=oi.id
left join
(
select
order_detail_id,
activity_id,
activity_rule_id
from ods_order_detail_activity
where dt='2020-06-14'
)oda
on od.id=oda.order_detail_id
left join
(
select
order_detail_id,
coupon_id
from ods_order_detail_coupon
where dt='2020-06-14'
)odc
on od.id=odc.order_detail_id;
(2) Daily load
insert overwrite table dwd_order_detail partition(dt='2020-06-15')
select
od.id,
od.order_id,
oi.user_id,
od.sku_id,
oi.province_id,
oda.activity_id,
oda.activity_rule_id,
odc.coupon_id,
od.create_time,
od.source_type,
od.source_id,
od.sku_num,
od.order_price*od.sku_num,
od.split_activity_amount,
od.split_coupon_amount,
od.split_final_amount
from
(
select
*
from ods_order_detail
where dt='2020-06-15'
)od
left join
(
select
id,
user_id,
province_id
from ods_order_info
where dt='2020-06-15'
)oi
on od.order_id=oi.id
left join
(
select
order_detail_id,
activity_id,
activity_rule_id
from ods_order_detail_activity
where dt='2020-06-15'
)oda
on od.id=oda.order_detail_id
left join
(
select
order_detail_id,
coupon_id
from ods_order_detail_coupon
where dt='2020-06-15'
)odc
on od.id=odc.order_detail_id;
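3.2.3 Order refund fact table (transactional fact table)
The order refund fact table relates to four dimensions: time, region, user, and product.
Source tables:
ods_order_refund_info order refund table
ods_order_info order table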
1) Table creation statement
DROP TABLE IF EXISTS dwd_order_refund_info;
CREATE EXTERNAL TABLE dwd_order_refund_info(
`id` STRING COMMENT 'Id',
`user_id` STRING COMMENT 'User id',
`order_id` STRING COMMENT 'Order id',
`sku_id` STRING COMMENT 'Product id',
`province_id` STRING COMMENT 'Region id',
`refund_type` STRING COMMENT 'Refund type',
`refund_num` BIGINT COMMENT 'Number of items refunded',
`refund_amount` DECIMAL(16,2) COMMENT 'Refund amount',
`refund_reason_type` STRING COMMENT 'Refund reason type',
`create_time` STRING COMMENT 'Refund request time'
) COMMENT 'Order refund fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_order_refund_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_order_refund_info partition(dt)
select
ri.id,
ri.user_id,
ri.order_id,
ri.sku_id,
oi.province_id,
ri.refund_type,
ri.refund_num,
ri.refund_amount,
ri.refund_reason_type,
ri.create_time,
date_format(ri.create_time,'yyyy-MM-dd')
from
(
select * from ods_order_refund_info where dt='2020-06-14'
)ri
left join
(
select id,province_id from ods_order_info where dt='2020-06-14'
)oi
on ri.order_id=oi.id;
(2) Daily load
insert overwrite table dwd_order_refund_info partition(dt='2020-06-15')
select
ri.id,
ri.user_id,
ri.order_id,
ri.sku_id,
oi.province_id,
ri.refund_type,
ri.refund_num,
ri.refund_amount,
ri.refund_reason_type,
ri.create_time
from
(
select * from ods_order_refund_info where dt='2020-06-15'
)ri
left join
(
select id,province_id from ods_order_info where dt='2020-06-15'
)oi
on ri.order_id=oi.id;
4) Query the load result
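3.2.4 Add-to-cart fact table (periodic snapshot fact table, daily snapshot)
Because the data in the shopping cart changes frequently, a daily incremental sync strategy is not suitable for importing it. Instead, we take a snapshot every day and import the full data set. The drawback is the large volume of stored data. A periodic snapshot fact table is mainly about timeliness, and data kept for too long is of little value, so older data can be deleted on a schedule to free up storage space.
Source table:
ods_cart_info shopping cart table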
1) Table creation statement
DROP TABLE IF EXISTS dwd_cart_info;
CREATE EXTERNAL TABLE dwd_cart_info(
`id` STRING COMMENT 'Id',
`user_id` STRING COMMENT 'User id',
`sku_id` STRING COMMENT 'Product id',
`source_type` STRING COMMENT 'Source type',
`source_id` STRING COMMENT 'Source id',
`cart_price` DECIMAL(16,2) COMMENT 'Price when added to the cart',
`is_ordered` STRING COMMENT 'Whether an order has been placed',
`create_time` STRING COMMENT 'Creation time',
`operate_time` STRING COMMENT 'Modification time',
`order_time` STRING COMMENT 'Order time',
`sku_num` BIGINT COMMENT 'Quantity added to the cart'
) COMMENT 'Add-to-cart fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_cart_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_cart_info partition(dt='2020-06-14')
select
id,
user_id,
sku_id,
source_type,
source_id,
cart_price,
is_ordered,
create_time,
operate_time,
order_time,
sku_num
from ods_cart_info
where dt='2020-06-14';
(2) Daily load
insert overwrite table dwd_cart_info partition(dt='2020-06-15')
select
id,
user_id,
sku_id,
source_type,
source_id,
cart_price,
is_ordered,
create_time,
operate_time,
order_time,
sku_num
from ods_cart_info
where dt='2020-06-15';
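The periodic cleanup of old snapshots mentioned at the start of this section could look roughly like the sketch below. It only prints the commands and executes nothing. It assumes the table is an external table as declared above, in which case dropping a partition normally removes only the metadata and the files under the partition directory have to be deleted separately. The helper name build_cleanup_cmds is hypothetical, and the retention period is a project decision that this document does not specify.

```shell
#!/bin/bash
# Sketch: print the cleanup commands for one expired daily-snapshot partition.
# build_cleanup_cmds is a hypothetical helper; nothing is executed here.
APP=gmall

build_cleanup_cmds() {
    local table=$1 dt=$2
    # Dropping a partition of an EXTERNAL table usually removes only the metadata ...
    echo "hive -e \"alter table ${APP}.$table drop if exists partition(dt='$dt');\""
    # ... so the files under the partition directory are removed separately
    echo "hadoop fs -rm -r /warehouse/${APP}/dwd/$table/dt=$dt"
}

# With GNU date the expired day could be computed, e.g. expired=$(date -d "30 days ago" +%F)
build_cleanup_cmds dwd_cart_info 2020-05-16
```

Printing the commands first makes it easy to review what would be deleted before the output is piped to a shell or wired into a scheduler.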
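3.2.5 Favorites fact table (periodic snapshot fact table, daily snapshot)
The favorites fact table uses the same sync strategy as the add-to-cart fact table.
Source table:
ods_favor_info favorites table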
1) Table creation statement
DROP TABLE IF EXISTS dwd_favor_info;
CREATE EXTERNAL TABLE dwd_favor_info(
`id` STRING COMMENT 'Id',
`user_id` STRING COMMENT 'User id',
`sku_id` STRING COMMENT 'Sku id',
`spu_id` STRING COMMENT 'Spu id',
`is_cancel` STRING COMMENT 'Whether cancelled',
`create_time` STRING COMMENT 'Favorite time',
`cancel_time` STRING COMMENT 'Cancel time'
) COMMENT 'Favorites fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_favor_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_favor_info partition(dt='2020-06-14')
select
id,
user_id,
sku_id,
spu_id,
is_cancel,
create_time,
cancel_time
from ods_favor_info
where dt='2020-06-14';
(2) Daily load
insert overwrite table dwd_favor_info partition(dt='2020-06-15')
select
id,
user_id,
sku_id,
spu_id,
is_cancel,
create_time,
cancel_time
from ods_favor_info
where dt='2020-06-15';
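3.2.6 Coupon usage fact table (accumulating snapshot fact table)
A coupon has a life cycle: claim the coupon -> place an order with it -> pay with it. The coupon usage fact table therefore fits the accumulating snapshot pattern: one row records the three milestones of a coupon, namely when it was claimed, when it was used in an order, and when it was used in a payment.
Source table:
ods_coupon_use coupon usage table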
1) Table creation statement
DROP TABLE IF EXISTS dwd_coupon_use;
CREATE EXTERNAL TABLE dwd_coupon_use(
`id` STRING COMMENT 'Id',
`coupon_id` STRING COMMENT 'Coupon id',
`user_id` STRING COMMENT 'User id',
`order_id` STRING COMMENT 'Order id',
`coupon_status` STRING COMMENT 'Coupon status',
`get_time` STRING COMMENT 'Claim time',
`using_time` STRING COMMENT 'Use time (order placed)',
`used_time` STRING COMMENT 'Use time (payment)',
`expire_time` STRING COMMENT 'Expiry time'
) COMMENT 'Coupon usage fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_coupon_use/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_coupon_use partition(dt)
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time,
coalesce(date_format(used_time,'yyyy-MM-dd'),date_format(expire_time,'yyyy-MM-dd'),'9999-99-99')
from ods_coupon_use
where dt='2020-06-14';
(2) Daily load
a. Load logic
b. Load statement
insert overwrite table dwd_coupon_use partition(dt)
select
-- Use the new data when it exists; otherwise keep the old data
nvl(new.id,old.id),
nvl(new.coupon_id,old.coupon_id),
nvl(new.user_id,old.user_id),
nvl(new.order_id,old.order_id),
nvl(new.coupon_status,old.coupon_status),
nvl(new.get_time,old.get_time),
nvl(new.using_time,old.using_time),
nvl(new.used_time,old.used_time),
nvl(new.expire_time,old.expire_time),
coalesce(date_format(nvl(new.used_time,old.used_time),'yyyy-MM-dd'),date_format(nvl(new.expire_time,old.expire_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time
from dwd_coupon_use
where dt='9999-99-99'
)old
full outer join
(
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time
from ods_coupon_use
where dt='2020-06-15'
)new
on old.id=new.id;
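The last select expression in both load statements decides which partition a coupon row ends up in. As a rough illustration of that coalesce rule, here is the same decision written as a small shell function. An empty argument stands for NULL, and the function name coupon_partition is hypothetical.

```shell
#!/bin/bash
# Sketch: the partition rule of dwd_coupon_use written as a shell function.
# An empty argument stands for NULL; coupon_partition is a hypothetical name.
coupon_partition() {
    local used_time=$1 expire_time=$2
    if [ -n "$used_time" ]; then
        echo "${used_time%% *}"      # closed: date part of the payment time
    elif [ -n "$expire_time" ]; then
        echo "${expire_time%% *}"    # closed: date part of the expiry time
    else
        echo "9999-99-99"            # still open: stays in the open partition
    fi
}

coupon_partition "2020-06-15 10:21:33" ""
coupon_partition "" ""
```

Rows whose life cycle has not ended stay in the 9999-99-99 partition, which is why the daily load reads only that partition as the old side of the full outer join and then redistributes the merged rows with a dynamic partition insert.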
3.2.7 Payment fact table (accumulating snapshot fact table)
Source tables:
ods_payment_info payment table
ods_order_info order table
1) Table creation statement
DROP TABLE IF EXISTS dwd_payment_info;
CREATE EXTERNAL TABLE dwd_payment_info (
`id` STRING COMMENT 'Id',
`order_id` STRING COMMENT 'Order id',
`user_id` STRING COMMENT 'User id',
`province_id` STRING COMMENT 'Region id',
`trade_no` STRING COMMENT 'Transaction number',
`out_trade_no` STRING COMMENT 'External transaction number',
`payment_type` STRING COMMENT 'Payment type',
`payment_amount` DECIMAL(16,2) COMMENT 'Payment amount',
`payment_status` STRING COMMENT 'Payment status',
`create_time` STRING COMMENT 'Creation time', -- time the third-party payment API was called
`callback_time` STRING COMMENT 'Completion time' -- payment completion time, i.e. the time of the payment success callback
) COMMENT 'Payment fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_payment_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_payment_info partition(dt)
select
pi.id,
pi.order_id,
pi.user_id,
oi.province_id,
pi.trade_no,
pi.out_trade_no,
pi.payment_type,
pi.payment_amount,
pi.payment_status,
pi.create_time,
pi.callback_time,
nvl(date_format(pi.callback_time,'yyyy-MM-dd'),'9999-99-99')
from
(
select * from ods_payment_info where dt='2020-06-14'
)pi
left join
(
select id,province_id from ods_order_info where dt='2020-06-14'
)oi
on pi.order_id=oi.id;
(2) Daily load
insert overwrite table dwd_payment_info partition(dt)
select
nvl(new.id,old.id),
nvl(new.order_id,old.order_id),
nvl(new.user_id,old.user_id),
nvl(new.province_id,old.province_id),
nvl(new.trade_no,old.trade_no),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.payment_type,old.payment_type),
nvl(new.payment_amount,old.payment_amount),
nvl(new.payment_status,old.payment_status),
nvl(new.create_time,old.create_time),
nvl(new.callback_time,old.callback_time),
nvl(date_format(nvl(new.callback_time,old.callback_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select id,
order_id,
user_id,
province_id,
trade_no,
out_trade_no,
payment_type,
payment_amount,
payment_status,
create_time,
callback_time
from dwd_payment_info
where dt = '9999-99-99'
)old
full outer join
(
select
pi.id,
pi.out_trade_no,
pi.order_id,
pi.user_id,
oi.province_id,
pi.payment_type,
pi.trade_no,
pi.payment_amount,
pi.payment_status,
pi.create_time,
pi.callback_time
from
(
select * from ods_payment_info where dt='2020-06-15'
)pi
left join
(
select id,province_id from ods_order_info where dt='2020-06-15'
)oi
on pi.order_id=oi.id
)new
on old.id=new.id;
3.2.8 Refund payment fact table (accumulating snapshot fact table)
1) Table creation statement
DROP TABLE IF EXISTS dwd_refund_payment;
CREATE EXTERNAL TABLE dwd_refund_payment (
`id` STRING COMMENT 'Id',
`user_id` STRING COMMENT 'User id',
`order_id` STRING COMMENT 'Order id',
`sku_id` STRING COMMENT 'Sku id',
`province_id` STRING COMMENT 'Region id',
`trade_no` STRING COMMENT 'Transaction number',
`out_trade_no` STRING COMMENT 'External transaction number',
`payment_type` STRING COMMENT 'Payment type',
`refund_amount` DECIMAL(16,2) COMMENT 'Refund amount',
`refund_status` STRING COMMENT 'Refund status',
`create_time` STRING COMMENT 'Creation time', -- time the third-party payment API was called
`callback_time` STRING COMMENT 'Callback time' -- time of the payment API callback, i.e. the time the refund succeeded
) COMMENT 'Refund payment fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_refund_payment/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
insert overwrite table dwd_refund_payment partition(dt)
select
rp.id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time,
nvl(date_format(callback_time,'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
out_trade_no,
order_id,
sku_id,
payment_type,
trade_no,
refund_amount,
refund_status,
create_time,
callback_time
from ods_refund_payment
where dt='2020-06-14'
)rp
left join
(
select
id,
user_id,
province_id
from ods_order_info
where dt='2020-06-14'
)oi
on rp.order_id=oi.id;
(2) Daily load
insert overwrite table dwd_refund_payment partition(dt)
select
nvl(new.id,old.id),
nvl(new.user_id,old.user_id),
nvl(new.order_id,old.order_id),
nvl(new.sku_id,old.sku_id),
nvl(new.province_id,old.province_id),
nvl(new.trade_no,old.trade_no),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.payment_type,old.payment_type),
nvl(new.refund_amount,old.refund_amount),
nvl(new.refund_status,old.refund_status),
nvl(new.create_time,old.create_time),
nvl(new.callback_time,old.callback_time),
nvl(date_format(nvl(new.callback_time,old.callback_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time
from dwd_refund_payment
where dt='9999-99-99'
)old
full outer join
(
select
rp.id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time
from
(
select
id,
out_trade_no,
order_id,
sku_id,
payment_type,
trade_no,
refund_amount,
refund_status,
create_time,
callback_time
from ods_refund_payment
where dt='2020-06-15'
)rp
left join
(
select
id,
user_id,
province_id
from ods_order_info
where dt='2020-06-15'
)oi
on rp.order_id=oi.id
)new
on old.id=new.id;
4) Query the load result
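3.2.9 Order fact table (accumulating snapshot fact table)
Source tables:
ods_order_info order table
ods_order_status_log order status log table
Functions used during data loading
(1) concat(): concatenates strings. If any one of the arguments is NULL, the result is NULL.
(2) concat_ws(): also concatenates strings, with a separator that must be given as the first argument. NULL arguments are skipped, so the result is NULL only when the separator itself is NULL.
(3) str_to_map()
- Syntax
STR_TO_MAP(VARCHAR text, VARCHAR listDelimiter, VARCHAR keyValueDelimiter)
- Description
Splits text into key-value pairs with listDelimiter, then splits each pair into key and value with keyValueDelimiter, and returns the result as a MAP. In Hive the default listDelimiter is a comma (,) and the default keyValueDelimiter is a colon (:), so the equals sign used here must be passed explicitly.
- Example
str_to_map('1001=2020-06-14,1002=2020-06-14', ',' , '=')
- Output
{"1001":"2020-06-14","1002":"2020-06-14"}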
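To make the two-step splitting concrete, the sketch below emulates str_to_map followed by a key lookup in plain bash. It is only an illustration of the semantics, not how Hive implements the function. It assumes single-character delimiters, and the function name map_lookup is hypothetical.

```shell
#!/bin/bash
# Sketch: emulate str_to_map(text, listDelimiter, keyValueDelimiter)[key] in bash.
# Assumes single-character delimiters; map_lookup is a hypothetical name.
map_lookup() {
    local text=$1 list_delim=$2 kv_delim=$3 key=$4
    local pair k v
    local -a pairs
    # Step 1: split the text into key-value pairs on the list delimiter
    IFS="$list_delim" read -r -a pairs <<< "$text"
    for pair in "${pairs[@]}"; do
        # Step 2: split each pair on the first key-value delimiter
        k=${pair%%"$kv_delim"*}
        v=${pair#*"$kv_delim"}
        if [ "$k" = "$key" ]; then
            echo "$v"
            return 0
        fi
    done
    # Key not found: Hive would return NULL here
    return 1
}

map_lookup '1001=2020-06-14,1002=2020-06-14' ',' '=' 1001
```

The load statements later in this section use the same lookup pattern on the map, for example ts['1002'] to read the time at which an order reached status 1002.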
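1) Table creation statement
The order fact table relates to four dimensions: time, user, region, and activity.
An order has a life cycle from creation to completion: created -> paid -> cancelled -> finished -> refunding -> refund finished.
Because the order table in the ODS layer only has a creation time and an operation time, it cannot express every milestone, so the order status log table must be joined in.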
DROP TABLE IF EXISTS dwd_order_info;
CREATE EXTERNAL TABLE dwd_order_info(
`id` STRING COMMENT 'Id',
`order_status` STRING COMMENT 'Order status',
`user_id` STRING COMMENT 'User id',
`province_id` STRING COMMENT 'Region id',
`payment_way` STRING COMMENT 'Payment method',
`delivery_address` STRING COMMENT 'Delivery address',
`out_trade_no` STRING COMMENT 'External transaction number',
`tracking_no` STRING COMMENT 'Tracking number',
`create_time` STRING COMMENT 'Creation time (unpaid status)',
`payment_time` STRING COMMENT 'Payment time (paid status)',
`cancel_time` STRING COMMENT 'Cancel time (cancelled status)',
`finish_time` STRING COMMENT 'Finish time (finished status)',
`refund_time` STRING COMMENT 'Refund time (refunding status)',
`refund_finish_time` STRING COMMENT 'Refund finish time (refund finished status)',
`expire_time` STRING COMMENT 'Expiry time',
`feight_fee` DECIMAL(16,2) COMMENT 'Freight fee',
`feight_fee_reduce` DECIMAL(16,2) COMMENT 'Freight fee reduction',
`activity_reduce_amount` DECIMAL(16,2) COMMENT 'Activity reduction',
`coupon_reduce_amount` DECIMAL(16,2) COMMENT 'Coupon reduction',
`original_amount` DECIMAL(16,2) COMMENT 'Original order price',
`final_amount` DECIMAL(16,2) COMMENT 'Final order price'
) COMMENT 'Order fact table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_order_info/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Partition planning

3) Data loading

(1) First-day load
Convert the multiple rows that the order status log table holds for one order into a single map
str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))),',','=') ts
| Order id | order_status=operate_time |
|---|---|
| 3210 | 1001=2020-03-10 00:00:00.0 |
| 3210 | 1002=2020-03-10 00:00:00.0 |
| 3210 | 1005=2020-03-10 00:00:00.0 |
becomes:
{"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1005":"2020-03-10 00:00:00.0"}
insert overwrite table dwd_order_info partition(dt)
select
oi.id,
oi.order_status,
oi.user_id,
oi.province_id,
oi.payment_way,
oi.delivery_address,
oi.out_trade_no,
oi.tracking_no,
oi.create_time,
times.ts['1002'] payment_time,
times.ts['1003'] cancel_time,
times.ts['1004'] finish_time,
times.ts['1005'] refund_time,
times.ts['1006'] refund_finish_time,
oi.expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount,
case
when times.ts['1003'] is not null then date_format(times.ts['1003'],'yyyy-MM-dd')
when times.ts['1004'] is not null and date_add(date_format(times.ts['1004'],'yyyy-MM-dd'),7)<='2020-06-14' and times.ts['1005'] is null then date_add(date_format(times.ts['1004'],'yyyy-MM-dd'),7)
when times.ts['1006'] is not null then date_format(times.ts['1006'],'yyyy-MM-dd')
when oi.expire_time is not null then date_format(oi.expire_time,'yyyy-MM-dd')
else '9999-99-99'
end
from
(
select
*
from ods_order_info
where dt='2020-06-14'
)oi
left join
(
select
order_id,
str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))),',','=') ts
from ods_order_status_log
where dt='2020-06-14'
group by order_id
)times
on oi.id=times.order_id;
(2) Daily load
insert overwrite table dwd_order_info partition(dt)
select
nvl(new.id,old.id),
nvl(new.order_status,old.order_status),
nvl(new.user_id,old.user_id),
nvl(new.province_id,old.province_id),
nvl(new.payment_way,old.payment_way),
nvl(new.delivery_address,old.delivery_address),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.tracking_no,old.tracking_no),
nvl(new.create_time,old.create_time),
nvl(new.payment_time,old.payment_time),
nvl(new.cancel_time,old.cancel_time),
nvl(new.finish_time,old.finish_time),
nvl(new.refund_time,old.refund_time),
nvl(new.refund_finish_time,old.refund_finish_time),
nvl(new.expire_time,old.expire_time),
nvl(new.feight_fee,old.feight_fee),
nvl(new.feight_fee_reduce,old.feight_fee_reduce),
nvl(new.activity_reduce_amount,old.activity_reduce_amount),
nvl(new.coupon_reduce_amount,old.coupon_reduce_amount),
nvl(new.original_amount,old.original_amount),
nvl(new.final_amount,old.final_amount),
case
when new.cancel_time is not null then date_format(new.cancel_time,'yyyy-MM-dd')
when new.finish_time is not null and date_add(date_format(new.finish_time,'yyyy-MM-dd'),7)='2020-06-15' and new.refund_time is null then '2020-06-15'
when new.refund_finish_time is not null then date_format(new.refund_finish_time,'yyyy-MM-dd')
when new.expire_time is not null then date_format(new.expire_time,'yyyy-MM-dd')
else '9999-99-99'
end
from
(
select
id,
order_status,
user_id,
province_id,
payment_way,
delivery_address,
out_trade_no,
tracking_no,
create_time,
payment_time,
cancel_time,
finish_time,
refund_time,
refund_finish_time,
expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount
from dwd_order_info
where dt='9999-99-99'
)old
full outer join
(
select
oi.id,
oi.order_status,
oi.user_id,
oi.province_id,
oi.payment_way,
oi.delivery_address,
oi.out_trade_no,
oi.tracking_no,
oi.create_time,
times.ts['1002'] payment_time,
times.ts['1003'] cancel_time,
times.ts['1004'] finish_time,
times.ts['1005'] refund_time,
times.ts['1006'] refund_finish_time,
oi.expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount
from
(
select
*
from ods_order_info
where dt='2020-06-15'
)oi
left join
(
select
order_id,
str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))),',','=') ts
from ods_order_status_log
where dt='2020-06-15'
group by order_id
)times
on oi.id=times.order_id
)new
on old.id=new.id;
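The CASE expression in the daily load decides which partition each order lands in. The following shell sketch restates that decision for a single order so it can be checked by hand. It is an illustration only: the function name `order_partition` is hypothetical, an empty string stands in for SQL NULL, and GNU `date` is assumed.

```shell
#!/bin/bash
# Hypothetical helper mirroring the CASE expression of the daily load.
# Arguments: do_date cancel_time finish_time refund_time refund_finish_time expire_time
# An empty string stands in for SQL NULL (an assumption of this sketch).
order_partition() {
    local do_date=$1 cancel=$2 finish=$3 refund=$4 refund_finish=$5 expire=$6
    if [ -n "$cancel" ]; then
        echo "${cancel%% *}"            # cancelled: partition by cancel date
    elif [ -n "$finish" ] && [ -z "$refund" ] &&
         [ "$(date -u -d "${finish%% *} +7 day" +%F)" = "$do_date" ]; then
        echo "$do_date"                 # finished 7 days ago with no refund
    elif [ -n "$refund_finish" ]; then
        echo "${refund_finish%% *}"     # refund completed: partition by that date
    elif [ -n "$expire" ]; then
        echo "${expire%% *}"            # expired: partition by expiration date
    else
        echo "9999-99-99"               # still open: stays in the open partition
    fi
}

order_partition 2020-06-15 "" "2020-06-08 10:00:00" "" "" ""
order_partition 2020-06-15 "" "2020-06-14 10:00:00" "" "" ""
```

The first call closes the order into the `do_date` partition; the second leaves it in `9999-99-99` because the seven-day window has not yet elapsed.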
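3.2.10 DWD Layer Business Data First-Day Load Script
1) Write the script
(1) Create the script ods_to_dwd_db_init.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim ods_to_dwd_db_init.sh
Add the following content to the script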
#!/bin/bash
APP=gmall
if [ -n "$2" ] ;then
do_date=$2
else
echo "Please pass in a date argument"
exit 1
fi
dwd_order_info="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_order_info partition(dt)
select
oi.id,
oi.order_status,
oi.user_id,
oi.province_id,
oi.payment_way,
oi.delivery_address,
oi.out_trade_no,
oi.tracking_no,
oi.create_time,
times.ts['1002'] payment_time,
times.ts['1003'] cancel_time,
times.ts['1004'] finish_time,
times.ts['1005'] refund_time,
times.ts['1006'] refund_finish_time,
oi.expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount,
case
when times.ts['1003'] is not null then date_format(times.ts['1003'],'yyyy-MM-dd')
when times.ts['1004'] is not null and date_add(date_format(times.ts['1004'],'yyyy-MM-dd'),7)<='$do_date' and times.ts['1005'] is null then date_add(date_format(times.ts['1004'],'yyyy-MM-dd'),7)
when times.ts['1006'] is not null then date_format(times.ts['1006'],'yyyy-MM-dd')
when oi.expire_time is not null then date_format(oi.expire_time,'yyyy-MM-dd')
else '9999-99-99'
end
from
(
select
*
from ${APP}.ods_order_info
where dt='$do_date'
)oi
left join
(
select
order_id,
str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))),',','=') ts
from ${APP}.ods_order_status_log
where dt='$do_date'
group by order_id
)times
on oi.id=times.order_id;"
dwd_order_detail="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_order_detail partition(dt)
select
od.id,
od.order_id,
oi.user_id,
od.sku_id,
oi.province_id,
oda.activity_id,
oda.activity_rule_id,
odc.coupon_id,
od.create_time,
od.source_type,
od.source_id,
od.sku_num,
od.order_price*od.sku_num,
od.split_activity_amount,
od.split_coupon_amount,
od.split_final_amount,
date_format(create_time,'yyyy-MM-dd')
from
(
select
*
from ${APP}.ods_order_detail
where dt='$do_date'
)od
left join
(
select
id,
user_id,
province_id
from ${APP}.ods_order_info
where dt='$do_date'
)oi
on od.order_id=oi.id
left join
(
select
order_detail_id,
activity_id,
activity_rule_id
from ${APP}.ods_order_detail_activity
where dt='$do_date'
)oda
on od.id=oda.order_detail_id
left join
(
select
order_detail_id,
coupon_id
from ${APP}.ods_order_detail_coupon
where dt='$do_date'
)odc
on od.id=odc.order_detail_id;"
dwd_payment_info="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_payment_info partition(dt)
select
pi.id,
pi.order_id,
pi.user_id,
oi.province_id,
pi.trade_no,
pi.out_trade_no,
pi.payment_type,
pi.payment_amount,
pi.payment_status,
pi.create_time,
pi.callback_time,
nvl(date_format(pi.callback_time,'yyyy-MM-dd'),'9999-99-99')
from
(
select * from ${APP}.ods_payment_info where dt='$do_date'
)pi
left join
(
select id,province_id from ${APP}.ods_order_info where dt='$do_date'
)oi
on pi.order_id=oi.id;"
dwd_cart_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_cart_info partition(dt='$do_date')
select
id,
user_id,
sku_id,
source_type,
source_id,
cart_price,
is_ordered,
create_time,
operate_time,
order_time,
sku_num
from ${APP}.ods_cart_info
where dt='$do_date';"
dwd_comment_info="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_comment_info partition(dt)
select
id,
user_id,
sku_id,
spu_id,
order_id,
appraise,
create_time,
date_format(create_time,'yyyy-MM-dd')
from ${APP}.ods_comment_info
where dt='$do_date';
"
dwd_favor_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_favor_info partition(dt='$do_date')
select
id,
user_id,
sku_id,
spu_id,
is_cancel,
create_time,
cancel_time
from ${APP}.ods_favor_info
where dt='$do_date';"
dwd_coupon_use="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_coupon_use partition(dt)
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time,
coalesce(date_format(used_time,'yyyy-MM-dd'),date_format(expire_time,'yyyy-MM-dd'),'9999-99-99')
from ${APP}.ods_coupon_use
where dt='$do_date';"
dwd_order_refund_info="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_order_refund_info partition(dt)
select
ri.id,
ri.user_id,
ri.order_id,
ri.sku_id,
oi.province_id,
ri.refund_type,
ri.refund_num,
ri.refund_amount,
ri.refund_reason_type,
ri.create_time,
date_format(ri.create_time,'yyyy-MM-dd')
from
(
select * from ${APP}.ods_order_refund_info where dt='$do_date'
)ri
left join
(
select id,province_id from ${APP}.ods_order_info where dt='$do_date'
)oi
on ri.order_id=oi.id;"
dwd_refund_payment="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_refund_payment partition(dt)
select
rp.id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time,
nvl(date_format(callback_time,'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
out_trade_no,
order_id,
sku_id,
payment_type,
trade_no,
refund_amount,
refund_status,
create_time,
callback_time
from ${APP}.ods_refund_payment
where dt='$do_date'
)rp
left join
(
select
id,
user_id,
province_id
from ${APP}.ods_order_info
where dt='$do_date'
)oi
on rp.order_id=oi.id;"
case $1 in
dwd_order_info )
hive -e "$dwd_order_info"
;;
dwd_order_detail )
hive -e "$dwd_order_detail"
;;
dwd_payment_info )
hive -e "$dwd_payment_info"
;;
dwd_cart_info )
hive -e "$dwd_cart_info"
;;
dwd_comment_info )
hive -e "$dwd_comment_info"
;;
dwd_favor_info )
hive -e "$dwd_favor_info"
;;
dwd_coupon_use )
hive -e "$dwd_coupon_use"
;;
dwd_order_refund_info )
hive -e "$dwd_order_refund_info"
;;
dwd_refund_payment )
hive -e "$dwd_refund_payment"
;;
all )
hive -e "$dwd_order_info$dwd_order_detail$dwd_payment_info$dwd_cart_info$dwd_comment_info$dwd_favor_info$dwd_coupon_use$dwd_order_refund_info$dwd_refund_payment"
;;
esac
(2) Add execute permission
[atguigu@hadoop102 bin]$ chmod +x ods_to_dwd_db_init.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ ods_to_dwd_db_init.sh all 2020-06-14
(2) Check that the data was loaded successfully
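The first-day script only checks that a date argument is present. A stricter check could be sketched as follows; this is an optional addition rather than part of the original script, the function name `is_valid_date` is hypothetical, and GNU `date` is assumed.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only for a real calendar date in YYYY-MM-DD form.
is_valid_date() {
    # Reject anything that is not four digits, two digits, two digits.
    [[ $1 =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] || return 1
    # Round-trip through GNU date to reject dates such as 2020-02-30.
    [ "$(date -u -d "$1" +%F 2>/dev/null)" = "$1" ]
}

is_valid_date 2020-06-14 && echo "2020-06-14 accepted"
is_valid_date 2020-6-14 || echo "2020-6-14 rejected"
```

Calling such a check before `do_date=$2` would stop a mistyped date from being substituted into every `where dt='$do_date'` clause.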
3.2.11 DWD Layer Business Data Daily Load Script
1) Write the script
(1) Create the script ods_to_dwd_db.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim ods_to_dwd_db.sh
Add the following content to the script
#!/bin/bash
APP=gmall
# Use the date passed as the second argument; if none is given, use yesterday
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
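# Suppose that, for an accumulating snapshot fact table, every record in the
# 9999-99-99 partition completes on the same day. The dynamic-partition insert
# then writes nothing to 9999-99-99, so that partition is not overwritten and
# its old rows become duplicates. This function uses the last modification time
# of the 9999-99-99 partition to judge whether it was overwritten and, if not,
# clears it manually.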
clear_data(){
current_date=`date +%F`
current_date_timestamp=`date -d "$current_date" +%s`
last_modified_date=`hadoop fs -ls /warehouse/gmall/dwd/$1 | grep '9999-99-99' | awk '{print $6}'`
last_modified_date_timestamp=`date -d "$last_modified_date" +%s`
if [[ $last_modified_date_timestamp -lt $current_date_timestamp ]]; then
echo "clear table $1 partition(dt=9999-99-99)"
hadoop fs -rm -r -f /warehouse/gmall/dwd/$1/dt=9999-99-99/*
fi
}
dwd_order_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table ${APP}.dwd_order_info partition(dt)
select
nvl(new.id,old.id),
nvl(new.order_status,old.order_status),
nvl(new.user_id,old.user_id),
nvl(new.province_id,old.province_id),
nvl(new.payment_way,old.payment_way),
nvl(new.delivery_address,old.delivery_address),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.tracking_no,old.tracking_no),
nvl(new.create_time,old.create_time),
nvl(new.payment_time,old.payment_time),
nvl(new.cancel_time,old.cancel_time),
nvl(new.finish_time,old.finish_time),
nvl(new.refund_time,old.refund_time),
nvl(new.refund_finish_time,old.refund_finish_time),
nvl(new.expire_time,old.expire_time),
nvl(new.feight_fee,old.feight_fee),
nvl(new.feight_fee_reduce,old.feight_fee_reduce),
nvl(new.activity_reduce_amount,old.activity_reduce_amount),
nvl(new.coupon_reduce_amount,old.coupon_reduce_amount),
nvl(new.original_amount,old.original_amount),
nvl(new.final_amount,old.final_amount),
case
when new.cancel_time is not null then date_format(new.cancel_time,'yyyy-MM-dd')
when new.finish_time is not null and date_add(date_format(new.finish_time,'yyyy-MM-dd'),7)='$do_date' and new.refund_time is null then '$do_date'
when new.refund_finish_time is not null then date_format(new.refund_finish_time,'yyyy-MM-dd')
when new.expire_time is not null then date_format(new.expire_time,'yyyy-MM-dd')
else '9999-99-99'
end
from
(
select
id,
order_status,
user_id,
province_id,
payment_way,
delivery_address,
out_trade_no,
tracking_no,
create_time,
payment_time,
cancel_time,
finish_time,
refund_time,
refund_finish_time,
expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount
from ${APP}.dwd_order_info
where dt='9999-99-99'
)old
full outer join
(
select
oi.id,
oi.order_status,
oi.user_id,
oi.province_id,
oi.payment_way,
oi.delivery_address,
oi.out_trade_no,
oi.tracking_no,
oi.create_time,
times.ts['1002'] payment_time,
times.ts['1003'] cancel_time,
times.ts['1004'] finish_time,
times.ts['1005'] refund_time,
times.ts['1006'] refund_finish_time,
oi.expire_time,
feight_fee,
feight_fee_reduce,
activity_reduce_amount,
coupon_reduce_amount,
original_amount,
final_amount
from
(
select
*
from ${APP}.ods_order_info
where dt='$do_date'
)oi
left join
(
select
order_id,
str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))),',','=') ts
from ${APP}.ods_order_status_log
where dt='$do_date'
group by order_id
)times
on oi.id=times.order_id
)new
on old.id=new.id;"
dwd_order_detail="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_order_detail partition(dt='$do_date')
select
od.id,
od.order_id,
oi.user_id,
od.sku_id,
oi.province_id,
oda.activity_id,
oda.activity_rule_id,
odc.coupon_id,
od.create_time,
od.source_type,
od.source_id,
od.sku_num,
od.order_price*od.sku_num,
od.split_activity_amount,
od.split_coupon_amount,
od.split_final_amount
from
(
select
*
from ${APP}.ods_order_detail
where dt='$do_date'
)od
left join
(
select
id,
user_id,
province_id
from ${APP}.ods_order_info
where dt='$do_date'
)oi
on od.order_id=oi.id
left join
(
select
order_detail_id,
activity_id,
activity_rule_id
from ${APP}.ods_order_detail_activity
where dt='$do_date'
)oda
on od.id=oda.order_detail_id
left join
(
select
order_detail_id,
coupon_id
from ${APP}.ods_order_detail_coupon
where dt='$do_date'
)odc
on od.id=odc.order_detail_id;"
dwd_payment_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table ${APP}.dwd_payment_info partition(dt)
select
nvl(new.id,old.id),
nvl(new.order_id,old.order_id),
nvl(new.user_id,old.user_id),
nvl(new.province_id,old.province_id),
nvl(new.trade_no,old.trade_no),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.payment_type,old.payment_type),
nvl(new.payment_amount,old.payment_amount),
nvl(new.payment_status,old.payment_status),
nvl(new.create_time,old.create_time),
nvl(new.callback_time,old.callback_time),
nvl(date_format(nvl(new.callback_time,old.callback_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select id,
order_id,
user_id,
province_id,
trade_no,
out_trade_no,
payment_type,
payment_amount,
payment_status,
create_time,
callback_time
from ${APP}.dwd_payment_info
where dt = '9999-99-99'
)old
full outer join
(
select
pi.id,
pi.out_trade_no,
pi.order_id,
pi.user_id,
oi.province_id,
pi.payment_type,
pi.trade_no,
pi.payment_amount,
pi.payment_status,
pi.create_time,
pi.callback_time
from
(
select * from ${APP}.ods_payment_info where dt='$do_date'
)pi
left join
(
select id,province_id from ${APP}.ods_order_info where dt='$do_date'
)oi
on pi.order_id=oi.id
)new
on old.id=new.id;"
dwd_cart_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_cart_info partition(dt='$do_date')
select
id,
user_id,
sku_id,
source_type,
source_id,
cart_price,
is_ordered,
create_time,
operate_time,
order_time,
sku_num
from ${APP}.ods_cart_info
where dt='$do_date';"
dwd_comment_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_comment_info partition(dt='$do_date')
select
id,
user_id,
sku_id,
spu_id,
order_id,
appraise,
create_time
from ${APP}.ods_comment_info where dt='$do_date';"
dwd_favor_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_favor_info partition(dt='$do_date')
select
id,
user_id,
sku_id,
spu_id,
is_cancel,
create_time,
cancel_time
from ${APP}.ods_favor_info
where dt='$do_date';"
dwd_coupon_use="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table ${APP}.dwd_coupon_use partition(dt)
select
nvl(new.id,old.id),
nvl(new.coupon_id,old.coupon_id),
nvl(new.user_id,old.user_id),
nvl(new.order_id,old.order_id),
nvl(new.coupon_status,old.coupon_status),
nvl(new.get_time,old.get_time),
nvl(new.using_time,old.using_time),
nvl(new.used_time,old.used_time),
nvl(new.expire_time,old.expire_time),
coalesce(date_format(nvl(new.used_time,old.used_time),'yyyy-MM-dd'),date_format(nvl(new.expire_time,old.expire_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time
from ${APP}.dwd_coupon_use
where dt='9999-99-99'
)old
full outer join
(
select
id,
coupon_id,
user_id,
order_id,
coupon_status,
get_time,
using_time,
used_time,
expire_time
from ${APP}.ods_coupon_use
where dt='$do_date'
)new
on old.id=new.id;"
dwd_order_refund_info="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_order_refund_info partition(dt='$do_date')
select
ri.id,
ri.user_id,
ri.order_id,
ri.sku_id,
oi.province_id,
ri.refund_type,
ri.refund_num,
ri.refund_amount,
ri.refund_reason_type,
ri.create_time
from
(
select * from ${APP}.ods_order_refund_info where dt='$do_date'
)ri
left join
(
select id,province_id from ${APP}.ods_order_info where dt='$do_date'
)oi
on ri.order_id=oi.id;"
dwd_refund_payment="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table ${APP}.dwd_refund_payment partition(dt)
select
nvl(new.id,old.id),
nvl(new.user_id,old.user_id),
nvl(new.order_id,old.order_id),
nvl(new.sku_id,old.sku_id),
nvl(new.province_id,old.province_id),
nvl(new.trade_no,old.trade_no),
nvl(new.out_trade_no,old.out_trade_no),
nvl(new.payment_type,old.payment_type),
nvl(new.refund_amount,old.refund_amount),
nvl(new.refund_status,old.refund_status),
nvl(new.create_time,old.create_time),
nvl(new.callback_time,old.callback_time),
nvl(date_format(nvl(new.callback_time,old.callback_time),'yyyy-MM-dd'),'9999-99-99')
from
(
select
id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time
from ${APP}.dwd_refund_payment
where dt='9999-99-99'
)old
full outer join
(
select
rp.id,
user_id,
order_id,
sku_id,
province_id,
trade_no,
out_trade_no,
payment_type,
refund_amount,
refund_status,
create_time,
callback_time
from
(
select
id,
out_trade_no,
order_id,
sku_id,
payment_type,
trade_no,
refund_amount,
refund_status,
create_time,
callback_time
from ${APP}.ods_refund_payment
where dt='$do_date'
)rp
left join
(
select
id,
user_id,
province_id
from ${APP}.ods_order_info
where dt='$do_date'
)oi
on rp.order_id=oi.id
)new
on old.id=new.id;"
case $1 in
dwd_order_info )
hive -e "$dwd_order_info"
clear_data dwd_order_info
;;
dwd_order_detail )
hive -e "$dwd_order_detail"
;;
dwd_payment_info )
hive -e "$dwd_payment_info"
clear_data dwd_payment_info
;;
dwd_cart_info )
hive -e "$dwd_cart_info"
;;
dwd_comment_info )
hive -e "$dwd_comment_info"
;;
dwd_favor_info )
hive -e "$dwd_favor_info"
;;
dwd_coupon_use )
hive -e "$dwd_coupon_use"
clear_data dwd_coupon_use
;;
dwd_order_refund_info )
hive -e "$dwd_order_refund_info"
;;
dwd_refund_payment )
hive -e "$dwd_refund_payment"
clear_data dwd_refund_payment
;;
all )
hive -e "$dwd_order_info$dwd_order_detail$dwd_payment_info$dwd_cart_info$dwd_comment_info$dwd_favor_info$dwd_coupon_use$dwd_order_refund_info$dwd_refund_payment"
clear_data dwd_order_info
clear_data dwd_payment_info
clear_data dwd_coupon_use
clear_data dwd_refund_payment
;;
esac
(2) Add execute permission to the script
[atguigu@hadoop102 bin]$ chmod +x ods_to_dwd_db.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ ods_to_dwd_db.sh all 2020-06-15
(2) Check that the data was loaded successfully
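The staleness test inside clear_data can be tried without touching HDFS. The sketch below isolates just the date comparison; the function name `partition_is_stale` and its two arguments are assumptions for illustration, whereas the real script reads the modification date from `hadoop fs -ls`.

```shell
#!/bin/bash
# Hypothetical dry-run helper: decides whether the 9999-99-99 partition is stale.
# $1 = last modification date of the partition directory (YYYY-MM-DD)
# $2 = date on which the load ran (YYYY-MM-DD)
partition_is_stale() {
    local modified_ts run_ts
    modified_ts=$(date -u -d "$1" +%s)
    run_ts=$(date -u -d "$2" +%s)
    # Stale means the partition was last written before the day of this run.
    [ "$modified_ts" -lt "$run_ts" ]
}

if partition_is_stale 2020-06-14 2020-06-15; then
    echo "would clear dt=9999-99-99"
fi
```

A partition rewritten on the day of the run is left alone; only one that the insert did not touch is reported for clearing.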
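Chapter 4 Data Warehouse Construction: DWS Layer
The DWS layer uses wide tables to build shared metric data. It summarizes and aggregates data from the perspective of each subject, yielding the daily data for each subject.
4.1 System Functions
4.1.1 The nvl function
1) Basic syntax
NVL(expression1, expression2)
If expression1 is null, NVL returns the value of expression2; otherwise it returns the value of expression1.
The purpose of the function is to convert a null value into an actual value. The expressions may be numeric, string, or date values, but expression1 and expression2 should be of the same data type.
2) Examples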
hive (gmall)> select nvl(1,0);
1
hive (gmall)> select nvl(null,"hello");
hello
4.1.2 Date functions
1) date_format (formats a date according to a pattern)
hive (gmall)> select date_format('2020-06-14','yyyy-MM');
2020-06
2) date_add (adds or subtracts days)
hive (gmall)> select date_add('2020-06-14',-1);
2020-06-13
hive (gmall)> select date_add('2020-06-14',1);
2020-06-15
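3) next_day
- Get the first Monday after the given day
hive (gmall)> select next_day('2020-06-14','MO');
2020-06-15
Note: the day names from Monday to Sunday are Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, and Sunday; next_day accepts the two-letter abbreviation (as above), the three-letter abbreviation, or the full name.
- Get the Monday of the current week
hive (gmall)> select date_add(next_day('2020-06-14','MO'),-7);
2020-06-08
4) last_day (returns the last day of the month)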
hive (gmall)> select last_day('2020-06-14');
2020-06-30
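Because the load scripts compute `do_date` in the shell, it can help to have shell counterparts of these Hive functions. The sketch below assumes GNU `date` (as used elsewhere in the scripts); the function names are hypothetical and simply echo the Hive names.

```shell
#!/bin/bash
# Shell counterparts of the Hive date functions above (GNU date assumed).
# All function names are hypothetical.

# date_add DATE N: DATE shifted by N days (N may be negative).
date_add() { date -u -d "$1 $(printf '%+d' "$2") day" +%F; }

# next_monday DATE: first Monday strictly after DATE, like next_day(DATE,'MO').
next_monday() {
    local dow
    dow=$(date -u -d "$1" +%u)      # 1 = Monday ... 7 = Sunday
    date_add "$1" $((8 - dow))
}

# week_monday DATE: Monday of the week containing DATE,
# like date_add(next_day(DATE,'MO'),-7).
week_monday() {
    local dow
    dow=$(date -u -d "$1" +%u)
    date_add "$1" $((1 - dow))
}

# last_day DATE: last day of DATE's month.
last_day() { date -u -d "$(date -u -d "$1" +%Y-%m-01) +1 month -1 day" +%F; }

date_add 2020-06-14 -1
next_monday 2020-06-14
week_monday 2020-06-14
last_day 2020-06-14
```

The four calls reproduce the Hive results shown above for 2020-06-14, which falls on a Sunday.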
4.1.3 Defining complex data types
1) Defining a map
map<string,string>
2) Defining an array
array<string>
3) Defining a struct
struct<id:int,name:string,age:int>
4) Nesting a struct inside an array
array<struct<id:int,name:string,age:int>>
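4.2 DWS Layer

Loading data into the DWS layer tables

4.2.1 Daily device behavior
To support later requirements such as daily active devices, weekly active devices, and daily new devices, we aggregate the page log table of the user-behavior DWD layer (dwd_page_log) by device id to obtain the DWS layer device behavior table.
1) Table creation statement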
DROP TABLE IF EXISTS dws_visitor_action_daycount;
CREATE EXTERNAL TABLE dws_visitor_action_daycount
(
`mid_id` STRING COMMENT 'Device id',
`brand` STRING COMMENT 'Device brand',
`model` STRING COMMENT 'Device model',
`is_new` STRING COMMENT 'Whether this is the first visit',
`channel` ARRAY<STRING> COMMENT 'Channel',
`os` ARRAY<STRING> COMMENT 'Operating system',
`area_code` ARRAY<STRING> COMMENT 'Area ID',
`version_code` ARRAY<STRING> COMMENT 'App version',
`visit_count` BIGINT COMMENT 'Visit count',
`page_stats` ARRAY<STRUCT<page_id:STRING,page_count:BIGINT,during_time:BIGINT>> COMMENT 'Page visit statistics'
) COMMENT 'Daily device behavior table'
PARTITIONED BY(`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_visitor_action_daycount'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
insert overwrite table dws_visitor_action_daycount partition(dt='2020-06-14')
select
t1.mid_id,
t1.brand,
t1.model,
t1.is_new,
t1.channel,
t1.os,
t1.area_code,
t1.version_code,
t1.visit_count,
t3.page_stats
from
(
select
mid_id,
brand,
model,
if(array_contains(collect_set(is_new),'0'),'0','1') is_new, -- in dwd_page_log, the is_new values of one device within one day may be all 1, all 0, or a mix of 0 and 1 (uninstall and reinstall), hence this handling
collect_set(channel) channel,
collect_set(os) os,
collect_set(area_code) area_code,
collect_set(version_code) version_code,
sum(if(last_page_id is null,1,0)) visit_count
from dwd_page_log
where dt='2020-06-14'
and last_page_id is null
group by mid_id,model,brand
)t1
join
(
select
mid_id,
brand,
model,
collect_set(named_struct('page_id',page_id,'page_count',page_count,'during_time',during_time)) page_stats
from
(
select
mid_id,
brand,
model,
page_id,
count(*) page_count,
sum(during_time) during_time
from dwd_page_log
where dt='2020-06-14'
group by mid_id,model,brand,page_id
)t2
group by mid_id,model,brand
)t3
on t1.mid_id=t3.mid_id and t1.brand=t3.brand and t1.model=t3.model;
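3) Query the loaded result
4.2.2 Daily user behavior
The daily user behavior table is centered on (registered) users. It records each user's actions together with the measures of those actions.
Source tables:
dwd_page_log page log table
dwd_action_log action log table
dwd_order_info order fact table
dwd_payment_info payment fact table
dwd_order_refund_info order refund fact table
dwd_refund_payment refund payment fact table
dwd_coupon_use coupon use fact table
dwd_comment_info comment fact table
dwd_order_detail order detail fact table
1) Table creation statement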
DROP TABLE IF EXISTS dws_user_action_daycount;
CREATE EXTERNAL TABLE dws_user_action_daycount
(
`user_id` STRING COMMENT 'User id',
`login_count` BIGINT COMMENT 'Login count',
`cart_count` BIGINT COMMENT 'Add-to-cart count',
`favor_count` BIGINT COMMENT 'Favorite count',
`order_count` BIGINT COMMENT 'Order count',
`order_activity_count` BIGINT COMMENT 'Count of orders that used an activity',
`order_activity_reduce_amount` DECIMAL(16,2) COMMENT 'Order reduction amount (activity)',
`order_coupon_count` BIGINT COMMENT 'Count of orders that used a coupon',
`order_coupon_reduce_amount` DECIMAL(16,2) COMMENT 'Order reduction amount (coupon)',
`order_original_amount` DECIMAL(16,2) COMMENT 'Original order amount',
`order_final_amount` DECIMAL(16,2) COMMENT 'Final order amount',
`payment_count` BIGINT COMMENT 'Payment count',
`payment_amount` DECIMAL(16,2) COMMENT 'Payment amount',
`refund_order_count` BIGINT COMMENT 'Order refund count',
`refund_order_num` BIGINT COMMENT 'Number of items in order refunds',
`refund_order_amount` DECIMAL(16,2) COMMENT 'Order refund amount',
`refund_payment_count` BIGINT COMMENT 'Refund payment count',
`refund_payment_num` BIGINT COMMENT 'Number of items in refund payments',
`refund_payment_amount` DECIMAL(16,2) COMMENT 'Refund payment amount',
`coupon_get_count` BIGINT COMMENT 'Coupon claim count',
`coupon_using_count` BIGINT COMMENT 'Coupon use (order placed) count',
`coupon_used_count` BIGINT COMMENT 'Coupon use (payment made) count',
`appraise_good_count` BIGINT COMMENT 'Positive review count',
`appraise_mid_count` BIGINT COMMENT 'Neutral review count',
`appraise_bad_count` BIGINT COMMENT 'Negative review count',
`appraise_default_count` BIGINT COMMENT 'Default review count',
`order_detail_stats` array<struct<sku_id:string,sku_num:bigint,order_count:bigint,activity_reduce_amount:decimal(16,2),coupon_reduce_amount:decimal(16,2),original_amount:decimal(16,2),final_amount:decimal(16,2)>> COMMENT 'Order detail statistics'
) COMMENT 'Daily user behavior'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_user_action_daycount/'
TBLPROPERTIES ("parquet.compression"="lzo");
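Note: the login count cannot be aggregated from dwd_start_log, because start logs exist only on mobile clients and not on the PC side. dwd_page_log is used instead.
2) Data loading
(1) First-day load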
with
tmp_login as
(
select
dt,
user_id,
count(*) login_count
from dwd_page_log
where user_id is not null
and last_page_id is null
group by dt,user_id
),
tmp_cf as
(
select
dt,
user_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from dwd_action_log
where user_id is not null
and action_id in ('cart_add','favor_add')
group by dt,user_id
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
count(*) order_count,
sum(if(activity_reduce_amount>0,1,0)) order_activity_count,
sum(if(coupon_reduce_amount>0,1,0)) order_coupon_count,
sum(activity_reduce_amount) order_activity_reduce_amount,
sum(coupon_reduce_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from dwd_order_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from dwd_payment_info
where callback_time is not null
group by date_format(callback_time,'yyyy-MM-dd'),user_id
),
tmp_ri as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
rp.user_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(rp.refund_amount) refund_payment_amount
from
(
select
user_id,
order_id,
sku_id,
refund_amount,
callback_time
from dwd_refund_payment
where callback_time is not null
)rp
left join
(
select
user_id,
order_id,
sku_id,
refund_num
from dwd_order_refund_info
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by date_format(callback_time,'yyyy-MM-dd'),rp.user_id
),
tmp_coupon as
(
select
coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt) dt,
coalesce(coupon_get.user_id,coupon_using.user_id,coupon_used.user_id) user_id,
nvl(coupon_get_count,0) coupon_get_count,
nvl(coupon_using_count,0) coupon_using_count,
nvl(coupon_used_count,0) coupon_used_count
from
(
select
date_format(get_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_get_count
from dwd_coupon_use
where get_time is not null
group by user_id,date_format(get_time,'yyyy-MM-dd')
)coupon_get
full outer join
(
select
date_format(using_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_using_count
from dwd_coupon_use
where using_time is not null
group by user_id,date_format(using_time,'yyyy-MM-dd')
)coupon_using
on coupon_get.dt=coupon_using.dt
and coupon_get.user_id=coupon_using.user_id
full outer join
(
select
date_format(used_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_used_count
from dwd_coupon_use
where used_time is not null
group by user_id,date_format(used_time,'yyyy-MM-dd')
)coupon_used
on nvl(coupon_get.dt,coupon_using.dt)=coupon_used.dt
and nvl(coupon_get.user_id,coupon_using.user_id)=coupon_used.user_id
),
tmp_comment as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from dwd_comment_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_od as
(
select
dt,
user_id,
collect_set(named_struct('sku_id',sku_id,'sku_num',sku_num,'order_count',order_count,'activity_reduce_amount',activity_reduce_amount,'coupon_reduce_amount',coupon_reduce_amount,'original_amount',original_amount,'final_amount',final_amount)) order_detail_stats
from
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
cast(sum(split_activity_amount) as decimal(16,2)) activity_reduce_amount,
cast(sum(split_coupon_amount) as decimal(16,2)) coupon_reduce_amount,
cast(sum(original_amount) as decimal(16,2)) original_amount,
cast(sum(split_final_amount) as decimal(16,2)) final_amount
from dwd_order_detail
group by date_format(create_time,'yyyy-MM-dd'),user_id,sku_id
)t1
group by dt,user_id
)
insert overwrite table dws_user_action_daycount partition(dt)
select coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id,tmp_od.user_id),
nvl(login_count,0),
nvl(cart_count,0),
nvl(favor_count,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0),
order_detail_stats,
coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt,tmp_coupon.dt,tmp_od.dt)
from tmp_login
full outer join tmp_cf on tmp_login.user_id=tmp_cf.user_id and tmp_login.dt=tmp_cf.dt
full outer join tmp_order on coalesce(tmp_login.user_id,tmp_cf.user_id)=tmp_order.user_id
and coalesce(tmp_login.dt,tmp_cf.dt)=tmp_order.dt
full outer join tmp_pay on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id)=tmp_pay.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt)=tmp_pay.dt
full outer join tmp_ri on coalesce(tmp_login.user_id,tmp_cf.user_id, tmp_order.user_id,tmp_pay.user_id)=tmp_ri.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt)=tmp_ri.dt
full outer join tmp_rp on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id)=tmp_rp.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt)=tmp_rp.dt
full outer join tmp_comment
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id)=tmp_comment.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt)=tmp_comment.dt
full outer join tmp_coupon
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id)=tmp_coupon.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt)=tmp_coupon.dt
full outer join tmp_od
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id)=tmp_od.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt,tmp_coupon.dt)=tmp_od.dt;
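COALESCE(expression_1, expression_2, ..., expression_n) evaluates its arguments in order and returns the first non-null value. If every argument is null, it returns null. It is needed here because most expressions that contain a null evaluate to null, and after a chain of full outer joins the user_id and dt of any one side may be null.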
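The "first non-null wins" rule can be illustrated with a small shell analogue. This is a sketch only: the function name `coalesce` is hypothetical here, and an empty string stands in for SQL NULL.

```shell
#!/bin/bash
# Hypothetical shell analogue of COALESCE: prints the first non-empty argument.
# An empty string stands in for SQL NULL; nothing is printed if all are empty.
coalesce() {
    local arg
    for arg in "$@"; do
        if [ -n "$arg" ]; then
            echo "$arg"
            return 0
        fi
    done
}

# Only the third and fourth "tables" have a dt for this user; the third wins.
coalesce "" "" "2020-06-14" "2020-06-15"
```

In the load statement the same idea picks the user_id and dt from whichever joined subquery actually produced a row.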
(2) Daily load
with
tmp_login as
(
select
user_id,
count(*) login_count
from dwd_page_log
where dt='2020-06-15'
and user_id is not null
and last_page_id is null
group by user_id
),
tmp_cf as
(
select
user_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from dwd_action_log
where dt='2020-06-15'
and user_id is not null
and action_id in ('cart_add','favor_add')
group by user_id
),
tmp_order as
(
select
user_id,
count(*) order_count,
sum(if(activity_reduce_amount>0,1,0)) order_activity_count,
sum(if(coupon_reduce_amount>0,1,0)) order_coupon_count,
sum(activity_reduce_amount) order_activity_reduce_amount,
sum(coupon_reduce_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from dwd_order_info
where (dt='2020-06-15'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='2020-06-15'
group by user_id
),
tmp_pay as
(
select
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from dwd_payment_info
where dt='2020-06-15'
group by user_id
),
tmp_ri as
(
select
user_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
where dt='2020-06-15'
group by user_id
),
tmp_rp as
(
select
rp.user_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(rp.refund_amount) refund_payment_amount
from
(
select
user_id,
order_id,
sku_id,
refund_amount
from dwd_refund_payment
where dt='2020-06-15'
)rp
left join
(
select
user_id,
order_id,
sku_id,
refund_num
from dwd_order_refund_info
where dt>=date_add('2020-06-15',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.user_id
),
tmp_coupon as
(
select
user_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='2020-06-15',1,0)) coupon_get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='2020-06-15',1,0)) coupon_using_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='2020-06-15',1,0)) coupon_used_count
from dwd_coupon_use
where (dt='2020-06-15' or dt='9999-99-99')
and (date_format(get_time, 'yyyy-MM-dd') = '2020-06-15'
or date_format(using_time,'yyyy-MM-dd')='2020-06-15'
or date_format(used_time,'yyyy-MM-dd')='2020-06-15')
group by user_id
),
tmp_comment as
(
select
user_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from dwd_comment_info
where dt='2020-06-15'
group by user_id
),
tmp_od as
(
select
user_id,
collect_set(named_struct('sku_id',sku_id,'sku_num',sku_num,'order_count',order_count,'activity_reduce_amount',activity_reduce_amount,'coupon_reduce_amount',coupon_reduce_amount,'original_amount',original_amount,'final_amount',final_amount)) order_detail_stats
from
(
select
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
cast(sum(split_activity_amount) as decimal(16,2)) activity_reduce_amount,
cast(sum(split_coupon_amount) as decimal(16,2)) coupon_reduce_amount,
cast(sum(original_amount) as decimal(16,2)) original_amount,
cast(sum(split_final_amount) as decimal(16,2)) final_amount
from dwd_order_detail
where dt='2020-06-15'
group by user_id,sku_id
)t1
group by user_id
)
insert overwrite table dws_user_action_daycount partition(dt='2020-06-15')
select
coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id,tmp_od.user_id),
nvl(login_count,0),
nvl(cart_count,0),
nvl(favor_count,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0),
order_detail_stats
from tmp_login
full outer join tmp_cf on tmp_login.user_id=tmp_cf.user_id
full outer join tmp_order on coalesce(tmp_login.user_id,tmp_cf.user_id)=tmp_order.user_id
full outer join tmp_pay on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id)=tmp_pay.user_id
full outer join tmp_ri on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id)=tmp_ri.user_id
full outer join tmp_rp on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id)=tmp_rp.user_id
full outer join tmp_comment on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id)=tmp_comment.user_id
full outer join tmp_coupon on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id)=tmp_coupon.user_id
full outer join tmp_od on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id)=tmp_od.user_id;
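The tmp_order filter in the daily load above wraps the two dt conditions in parentheses. This matters because AND binds more tightly than OR. The sketch below uses made-up (dt, create_date) pairs, assuming that the dt partition of dwd_order_info holds completed orders by completion date and that unfinished orders sit in 9999-99-99. It counts matching rows with and without the parentheses.

```sql
-- Hypothetical rows, not real warehouse data.
with t as
(
    select '2020-06-15' dt, '2020-06-15' create_date
    union all
    select '2020-06-15' dt, '2020-06-14' create_date -- created earlier, completed on 06-15
    union all
    select '9999-99-99' dt, '2020-06-15' create_date
    union all
    select '9999-99-99' dt, '2020-06-10' create_date
)
select
    sum(if((dt='2020-06-15' or dt='9999-99-99') and create_date='2020-06-15',1,0)) with_parentheses,
    sum(if(dt='2020-06-15' or dt='9999-99-99' and create_date='2020-06-15',1,0)) without_parentheses
from t;
```

With parentheses the count is 2 (only orders created on 2020-06-15). Without them it is 3, because every row in the 2020-06-15 partition passes regardless of its creation date.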
3) Query the load result
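The source gives no query for this step. A minimal check, assuming the daily load for 2020-06-15 has just been run, could look like this:

```sql
-- Sketch: peek at the partition that was just written.
select *
from dws_user_action_daycount
where dt='2020-06-15'
limit 10;
```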
4.2.3 Daily product behavior
The daily product behavior table is centered on the product (SKU): it derives the product-related measures from the fact tables that carry the product dimension.
Source tables:
dwd_order_detail order detail fact table
dwd_payment_info payment fact table
dwd_order_refund_info refund order fact table
dwd_refund_payment refund payment fact table
dwd_action_log action log table
dwd_comment_info review fact table
1) Table creation statement
DROP TABLE IF EXISTS dws_sku_action_daycount;
CREATE EXTERNAL TABLE dws_sku_action_daycount
(
`sku_id` STRING COMMENT 'sku_id',
`order_count` BIGINT COMMENT 'times ordered',
`order_num` BIGINT COMMENT 'units ordered',
`order_activity_count` BIGINT COMMENT 'times ordered with an activity',
`order_coupon_count` BIGINT COMMENT 'times ordered with a coupon',
`order_activity_reduce_amount` DECIMAL(16,2) COMMENT 'discount amount (activity)',
`order_coupon_reduce_amount` DECIMAL(16,2) COMMENT 'discount amount (coupon)',
`order_original_amount` DECIMAL(16,2) COMMENT 'original amount ordered',
`order_final_amount` DECIMAL(16,2) COMMENT 'final amount ordered',
`payment_count` BIGINT COMMENT 'times paid',
`payment_num` BIGINT COMMENT 'units paid',
`payment_amount` DECIMAL(16,2) COMMENT 'amount paid',
`refund_order_count` BIGINT COMMENT 'refund order count',
`refund_order_num` BIGINT COMMENT 'units in refund orders',
`refund_order_amount` DECIMAL(16,2) COMMENT 'refund order amount',
`refund_payment_count` BIGINT COMMENT 'refund payment count',
`refund_payment_num` BIGINT COMMENT 'units refunded',
`refund_payment_amount` DECIMAL(16,2) COMMENT 'refund payment amount',
`cart_count` BIGINT COMMENT 'times added to cart',
`favor_count` BIGINT COMMENT 'times favorited',
`appraise_good_count` BIGINT COMMENT 'positive reviews',
`appraise_mid_count` BIGINT COMMENT 'neutral reviews',
`appraise_bad_count` BIGINT COMMENT 'negative reviews',
`appraise_default_count` BIGINT COMMENT 'default reviews'
) COMMENT 'daily product behavior'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_sku_action_daycount/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
with
-- Order statistics: how each SKU was ordered on each day
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(if(split_activity_amount>0,1,0)) order_activity_count,
sum(if(split_coupon_amount>0,1,0)) order_coupon_count,
sum(split_activity_amount) order_activity_reduce_amount,
sum(split_coupon_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
group by date_format(create_time,'yyyy-MM-dd'),sku_id
),
-- Payment statistics
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(split_final_amount) payment_amount
from dwd_order_detail od
join
(
select
order_id,
callback_time
from dwd_payment_info
where callback_time is not null
)pi on pi.order_id=od.order_id
group by date_format(callback_time,'yyyy-MM-dd'),sku_id
),
tmp_ri as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),sku_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
rp.sku_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(refund_amount) refund_payment_amount
from
(
select
order_id,
sku_id,
refund_amount,
callback_time
from dwd_refund_payment
)rp
left join
(
select
order_id,
sku_id,
refund_num
from dwd_order_refund_info
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by date_format(callback_time,'yyyy-MM-dd'),rp.sku_id
),
tmp_cf as
(
select
dt,
item sku_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from dwd_action_log
where action_id in ('cart_add','favor_add')
group by dt,item
),
tmp_comment as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from dwd_comment_info
group by date_format(create_time,'yyyy-MM-dd'),sku_id
)
insert overwrite table dws_sku_action_daycount partition(dt)
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_activity_count),
sum(order_coupon_count),
sum(order_activity_reduce_amount),
sum(order_coupon_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_num),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_num),
sum(refund_payment_amount),
sum(cart_count),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count),
dt
from
(
select
dt,
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_num,
payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_pay
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_ri
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_rp
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
cart_count,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cf
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_comment
)t1
group by dt,sku_id;
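The load above merges its sources differently from the user table: each source is padded with zero columns, the branches are combined with union all, and a single group by adds them up. A reduced sketch of that pattern, using two made-up sources (t_order and t_pay are illustrative names with invented values):

```sql
-- Hypothetical sample data; only two measures to keep the sketch short.
with
t_order as
(
    select 'sku1' sku_id, 2 order_count
    union all
    select 'sku2' sku_id, 1 order_count
),
t_pay as
(
    select 'sku2' sku_id, 1 payment_count
    union all
    select 'sku3' sku_id, 4 payment_count
)
select
    sku_id,
    sum(order_count) order_count,
    sum(payment_count) payment_count
from
(
    select sku_id, order_count, 0 payment_count from t_order -- pad the measure this source lacks
    union all
    select sku_id, 0 order_count, payment_count from t_pay
)t1
group by sku_id;
```

Each SKU ends up on one row whether it appears in one source or both. Compared with a chain of full outer joins, adding another source only adds one more union all branch, and the join condition does not grow a longer coalesce list.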
(2) Daily load
with
tmp_order as
(
select
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(if(split_activity_amount>0,1,0)) order_activity_count,
sum(if(split_coupon_amount>0,1,0)) order_coupon_count,
sum(split_activity_amount) order_activity_reduce_amount,
sum(split_coupon_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
where dt='2020-06-15'
group by sku_id
),
tmp_pay as
(
select
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(split_final_amount) payment_amount
from dwd_order_detail
where (dt='2020-06-15'
or dt=date_add('2020-06-15',-1))
and order_id in
(
select order_id from dwd_payment_info where dt='2020-06-15'
)
group by sku_id
),
tmp_ri as
(
select
sku_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
where dt='2020-06-15'
group by sku_id
),
tmp_rp as
(
select
rp.sku_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(refund_amount) refund_payment_amount
from
(
select
order_id,
sku_id,
refund_amount
from dwd_refund_payment
where dt='2020-06-15'
)rp
left join
(
select
order_id,
sku_id,
refund_num
from dwd_order_refund_info
where dt>=date_add('2020-06-15',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.sku_id
),
tmp_cf as
(
select
item sku_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from dwd_action_log
where dt='2020-06-15'
and action_id in ('cart_add','favor_add')
group by item
),
tmp_comment as
(
select
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from dwd_comment_info
where dt='2020-06-15'
group by sku_id
)
insert overwrite table dws_sku_action_daycount partition(dt='2020-06-15')
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_activity_count),
sum(order_coupon_count),
sum(order_activity_reduce_amount),
sum(order_coupon_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_num),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_num),
sum(refund_payment_amount),
sum(cart_count),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count)
from
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_num,
payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_pay
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_ri
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_rp
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
cart_count,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cf
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_comment
)t1
group by sku_id;
3) Query the load result
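No query is given in the source for this step. As one possible check, assuming the 2020-06-15 partition has been loaded, the SKUs with the largest final order amount can be listed:

```sql
-- Sketch: top 10 SKUs of the day by final order amount.
select
    sku_id,
    order_count,
    order_num,
    order_final_amount
from dws_sku_action_daycount
where dt='2020-06-15'
order by order_final_amount desc
limit 10;
```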
4.2.4 Daily coupon statistics
Source tables:
dwd_coupon_use coupon usage fact table
dwd_order_detail order detail fact table
dwd_payment_info payment fact table
1) Table creation statement
DROP TABLE IF EXISTS dws_coupon_info_daycount;
CREATE EXTERNAL TABLE dws_coupon_info_daycount(
`coupon_id` STRING COMMENT 'coupon ID',
`get_count` BIGINT COMMENT 'times claimed',
`order_count` BIGINT COMMENT 'times used (order placed)',
`order_reduce_amount` DECIMAL(16,2) COMMENT 'discount amount on orders using the coupon',
`order_original_amount` DECIMAL(16,2) COMMENT 'original amount of orders using the coupon',
`order_final_amount` DECIMAL(16,2) COMMENT 'final amount of orders using the coupon',
`payment_count` BIGINT COMMENT 'times used (paid)',
`payment_reduce_amount` DECIMAL(16,2) COMMENT 'discount amount on payments using the coupon',
`payment_amount` DECIMAL(16,2) COMMENT 'total amount paid using the coupon',
`expire_count` BIGINT COMMENT 'times expired'
) COMMENT 'daily coupon statistics'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_coupon_info_daycount/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
with
tmp_cu as
(
select
coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt,coupon_exprie.dt) dt,
coalesce(coupon_get.coupon_id,coupon_using.coupon_id,coupon_used.coupon_id,coupon_exprie.coupon_id) coupon_id,
nvl(get_count,0) get_count,
nvl(order_count,0) order_count,
nvl(payment_count,0) payment_count,
nvl(expire_count,0) expire_count
from
(
select
date_format(get_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) get_count
from dwd_coupon_use
group by date_format(get_time,'yyyy-MM-dd'),coupon_id
)coupon_get
full outer join
(
select
date_format(using_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) order_count
from dwd_coupon_use
where using_time is not null
group by date_format(using_time,'yyyy-MM-dd'),coupon_id
)coupon_using
on coupon_get.dt=coupon_using.dt
and coupon_get.coupon_id=coupon_using.coupon_id
full outer join
(
select
date_format(used_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) payment_count
from dwd_coupon_use
where used_time is not null
group by date_format(used_time,'yyyy-MM-dd'),coupon_id
)coupon_used
on nvl(coupon_get.dt,coupon_using.dt)=coupon_used.dt
and nvl(coupon_get.coupon_id,coupon_using.coupon_id)=coupon_used.coupon_id
full outer join
(
select
date_format(expire_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) expire_count
from dwd_coupon_use
where expire_time is not null
group by date_format(expire_time,'yyyy-MM-dd'),coupon_id
)coupon_exprie
on coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt)=coupon_exprie.dt
and coalesce(coupon_get.coupon_id,coupon_using.coupon_id,coupon_used.coupon_id)=coupon_exprie.coupon_id
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
coupon_id,
sum(split_coupon_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
where coupon_id is not null
group by date_format(create_time,'yyyy-MM-dd'),coupon_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
coupon_id,
sum(split_coupon_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from
(
select
order_id,
coupon_id,
split_coupon_amount,
split_final_amount
from dwd_order_detail
where coupon_id is not null
)od
join
(
select
order_id,
callback_time
from dwd_payment_info
)pi
on od.order_id=pi.order_id
group by date_format(callback_time,'yyyy-MM-dd'),coupon_id
)
insert overwrite table dws_coupon_info_daycount partition(dt)
select
coupon_id,
sum(get_count),
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
sum(expire_count),
dt
from
(
select
dt,
coupon_id,
get_count,
order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
0 payment_reduce_amount,
0 payment_amount,
expire_count
from tmp_cu
union all
select
dt,
coupon_id,
0 get_count,
0 order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount,
0 expire_count
from tmp_order
union all
select
dt,
coupon_id,
0 get_count,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
payment_reduce_amount,
payment_amount,
0 expire_count
from tmp_pay
)t1
group by dt,coupon_id;
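The first-day loads in this chapter write to partition(dt) without a fixed value, so Hive takes the partition from the last selected column (dt). Where dynamic partitioning runs in strict mode, a statement with no static partition key is rejected. The session settings below are the usual way to allow it. Whether they are needed depends on your Hive version and configuration, which the source does not state at this point.

```sql
-- Assumed session settings; skip them if your environment already applies them.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
```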
(2) Daily load
with
tmp_cu as
(
select
coupon_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='2020-06-15',1,0)) get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='2020-06-15',1,0)) order_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='2020-06-15',1,0)) payment_count,
sum(if(date_format(expire_time,'yyyy-MM-dd')='2020-06-15',1,0)) expire_count
from dwd_coupon_use
where dt='9999-99-99'
or dt='2020-06-15'
group by coupon_id
),
tmp_order as
(
select
coupon_id,
sum(split_coupon_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
where dt='2020-06-15'
and coupon_id is not null
group by coupon_id
),
tmp_pay as
(
select
coupon_id,
sum(split_coupon_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from dwd_order_detail
where (dt='2020-06-15'
or dt=date_add('2020-06-15',-1))
and coupon_id is not null
and order_id in
(
select order_id from dwd_payment_info where dt='2020-06-15'
)
group by coupon_id
)
insert overwrite table dws_coupon_info_daycount partition(dt='2020-06-15')
select
coupon_id,
sum(get_count),
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
sum(expire_count)
from
(
select
coupon_id,
get_count,
order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
0 payment_reduce_amount,
0 payment_amount,
expire_count
from tmp_cu
union all
select
coupon_id,
0 get_count,
0 order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount,
0 expire_count
from tmp_order
union all
select
coupon_id,
0 get_count,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
payment_reduce_amount,
payment_amount,
0 expire_count
from tmp_pay
)t1
group by coupon_id;
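In the daily load above, tmp_cu counts several events in one pass over dwd_coupon_use by testing a different timestamp in each sum(if(...)). A reduced sketch of that conditional aggregation on made-up rows (the timestamps and coupon IDs are invented):

```sql
-- Hypothetical rows: one row per claimed coupon, using_time is null until the coupon is used.
with t as
(
    select 'c1' coupon_id, '2020-06-15 10:00:00' get_time, cast(null as string) using_time
    union all
    select 'c1' coupon_id, '2020-06-14 09:00:00' get_time, '2020-06-15 12:30:00' using_time
    union all
    select 'c2' coupon_id, '2020-06-15 08:00:00' get_time, cast(null as string) using_time
)
select
    coupon_id,
    sum(if(date_format(get_time,'yyyy-MM-dd')='2020-06-15',1,0)) get_count,
    sum(if(date_format(using_time,'yyyy-MM-dd')='2020-06-15',1,0)) order_count
from t
group by coupon_id;
```

A null timestamp makes the comparison null, which if() treats as false, so such rows contribute 0. Here c1 is claimed once and used once on 2020-06-15, while c2 is claimed once and not used.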
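3) Query the load result
4.2.5 Daily activity statistics
Centered on the activity.
Source tables:
dwd_order_detail order detail fact table
dwd_payment_info payment fact table
1) Table creation statement
DROP TABLE IF EXISTS dws_activity_info_daycount;
CREATE EXTERNAL TABLE dws_activity_info_daycount(
`activity_rule_id` STRING COMMENT 'activity rule ID',
`activity_id` STRING COMMENT 'activity ID',
`order_count` BIGINT COMMENT 'number of orders under this activity rule',
`order_reduce_amount` DECIMAL(16,2) COMMENT 'order discount amount under this activity rule',
`order_original_amount` DECIMAL(16,2) COMMENT 'original order amount under this activity rule',
`order_final_amount` DECIMAL(16,2) COMMENT 'final order amount under this activity rule',
`payment_count` BIGINT COMMENT 'number of payments under this activity rule',
`payment_reduce_amount` DECIMAL(16,2) COMMENT 'payment discount amount under this activity rule',
`payment_amount` DECIMAL(16,2) COMMENT 'payment amount under this activity rule'
) COMMENT 'daily activity statistics'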
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_activity_info_daycount/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
with
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
activity_rule_id,
activity_id,
count(*) order_count,
sum(split_activity_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
where activity_id is not null
group by date_format(create_time,'yyyy-MM-dd'),activity_rule_id,activity_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
activity_rule_id,
activity_id,
count(*) payment_count,
sum(split_activity_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from
(
select
activity_rule_id,
activity_id,
order_id,
split_activity_amount,
split_final_amount
from dwd_order_detail
where activity_id is not null
)od
join
(
select
order_id,
callback_time
from dwd_payment_info
)pi
on od.order_id=pi.order_id
group by date_format(callback_time,'yyyy-MM-dd'),activity_rule_id,activity_id
)
insert overwrite table dws_activity_info_daycount partition(dt)
select
activity_rule_id,
activity_id,
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
dt
from
(
select
dt,
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount
from tmp_order
union all
select
dt,
activity_rule_id,
activity_id,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from tmp_pay
)t1
group by dt,activity_rule_id,activity_id;
(2) Daily load
with
tmp_order as
(
select
activity_rule_id,
activity_id,
count(*) order_count,
sum(split_activity_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from dwd_order_detail
where dt='2020-06-15'
and activity_id is not null
group by activity_rule_id,activity_id
),
tmp_pay as
(
select
activity_rule_id,
activity_id,
count(*) payment_count,
sum(split_activity_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from dwd_order_detail
where (dt='2020-06-15'
or dt=date_add('2020-06-15',-1))
and activity_id is not null
and order_id in
(
select order_id from dwd_payment_info where dt='2020-06-15'
)
group by activity_rule_id,activity_id
)
insert overwrite table dws_activity_info_daycount partition(dt='2020-06-15')
select
activity_rule_id,
activity_id,
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount)
from
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount
from tmp_order
union all
select
activity_rule_id,
activity_id,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from tmp_pay
)t1
group by activity_rule_id,activity_id;
3) Query the load result
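The source leaves this step without a query. One simple check, assuming both the first-day and daily loads have run, is to count the rows written per partition:

```sql
-- Sketch: row count per dt partition of the activity table.
select
    dt,
    count(*) row_count
from dws_activity_info_daycount
group by dt
order by dt;
```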
4.2.6 Daily region statistics
1) Table creation statement
DROP TABLE IF EXISTS dws_area_stats_daycount;
CREATE EXTERNAL TABLE dws_area_stats_daycount(
`province_id` STRING COMMENT 'region ID',
`visit_count` BIGINT COMMENT 'visit count',
`login_count` BIGINT COMMENT 'login count',
`visitor_count` BIGINT COMMENT 'number of visitors',
`user_count` BIGINT COMMENT 'number of users',
`order_count` BIGINT COMMENT 'order count',
`order_original_amount` DECIMAL(16,2) COMMENT 'original order amount',
`order_final_amount` DECIMAL(16,2) COMMENT 'final order amount',
`payment_count` BIGINT COMMENT 'payment count',
`payment_amount` DECIMAL(16,2) COMMENT 'payment amount',
`refund_order_count` BIGINT COMMENT 'refund order count',
`refund_order_amount` DECIMAL(16,2) COMMENT 'refund order amount',
`refund_payment_count` BIGINT COMMENT 'refund payment count',
`refund_payment_amount` DECIMAL(16,2) COMMENT 'refund payment amount'
) COMMENT 'daily region statistics table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dws/dws_area_stats_daycount/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
with
tmp_vu as
(
select
dt,
id province_id,
visit_count,
login_count,
visitor_count,
user_count
from
(
select
dt,
area_code,
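count(*) visit_count, -- number of visits by all visitors
count(user_id) login_count, -- number of visits by logged-in users; equivalent to sum(if(user_id is not null,1,0))
count(distinct(mid_id)) visitor_count, -- number of visitors
count(distinct(user_id)) user_count -- number of users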
from dwd_page_log
where last_page_id is null
group by dt,area_code
)tmp
left join dim_base_province area
on tmp.area_code=area.area_code
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
province_id,
count(*) order_count,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from dwd_order_info
group by date_format(create_time,'yyyy-MM-dd'),province_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
province_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from dwd_payment_info
group by date_format(callback_time,'yyyy-MM-dd'),province_id
),
tmp_ro as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
province_id,
count(*) refund_order_count,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),province_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
province_id,
count(*) refund_payment_count,
sum(refund_amount) refund_payment_amount
from dwd_refund_payment
group by date_format(callback_time,'yyyy-MM-dd'),province_id
)
insert overwrite table dws_area_stats_daycount partition(dt)
select
province_id,
sum(visit_count),
sum(login_count),
sum(visitor_count),
sum(user_count),
sum(order_count),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_amount),
dt
from
(
select
dt,
province_id,
visit_count,
login_count,
visitor_count,
user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_vu
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
order_count,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_order
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_pay
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
refund_order_count,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_ro
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
refund_payment_count,
refund_payment_amount
from tmp_rp
)t1
group by dt,province_id;
(2) Daily load
with
tmp_vu as
(
select
id province_id,
visit_count,
login_count,
visitor_count,
user_count
from
(
select
area_code,
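count(*) visit_count,--number of visits by all visitors
count(user_id) login_count,--number of visits by logged-in users, equivalent to sum(if(user_id is not null,1,0))
count(distinct(mid_id)) visitor_count,--number of distinct visitors
count(distinct(user_id)) user_count--number of distinct logged-in users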
from dwd_page_log
where dt='2020-06-15'
and last_page_id is null
group by area_code
)tmp
left join dim_base_province area
on tmp.area_code=area.area_code
),
tmp_order as
(
select
province_id,
count(*) order_count,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from dwd_order_info
where (dt='2020-06-15'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='2020-06-15'
group by province_id
),
tmp_pay as
(
select
province_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from dwd_payment_info
where dt='2020-06-15'
group by province_id
),
tmp_ro as
(
select
province_id,
count(*) refund_order_count,
sum(refund_amount) refund_order_amount
from dwd_order_refund_info
where dt='2020-06-15'
group by province_id
),
tmp_rp as
(
select
province_id,
count(*) refund_payment_count,
sum(refund_amount) refund_payment_amount
from dwd_refund_payment
where dt='2020-06-15'
group by province_id
)
insert overwrite table dws_area_stats_daycount partition(dt='2020-06-15')
select
province_id,
sum(visit_count),
sum(login_count),
sum(visitor_count),
sum(user_count),
sum(order_count),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_amount)
from
(
select
province_id,
visit_count,
login_count,
visitor_count,
user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_vu
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
order_count,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_order
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_pay
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
refund_order_count,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_ro
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
refund_payment_count,
refund_payment_amount
from tmp_rp
)t1
group by province_id;
3) Query the load result
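The step above does not list a query, so here is a minimal sketch of one way to spot-check the partition that was just written. The helper name build_area_check_sql and the limit of 10 rows are assumptions made for illustration; only the table name and the dt partition column come from the statements above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original tutorial): builds a spot-check
# query for one partition of dws_area_stats_daycount.
build_area_check_sql() {
    local app="$1"      # Hive database name, e.g. gmall
    local do_date="$2"  # partition date in yyyy-MM-dd form
    echo "select * from ${app}.dws_area_stats_daycount where dt='${do_date}' limit 10;"
}

# Print the statement. To actually run it, hand it to the Hive CLI instead:
#   hive -e "$(build_area_check_sql gmall 2020-06-15)"
build_area_check_sql gmall 2020-06-15
```

Printing the statement first makes it easy to review before running it against the cluster.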
4.3 DWS Layer First-Day Data Load Script
1) Write the script
(1) Create the script dwd_to_dws_init.sh in the /home/atguigu/bin directory
#!/bin/bash
APP=gmall
if [ -n "$2" ] ;then
do_date=$2
else
echo "Please pass in a date parameter"
exit 1
fi
dws_visitor_action_daycount="
insert overwrite table ${APP}.dws_visitor_action_daycount partition(dt='$do_date')
select
t1.mid_id,
t1.brand,
t1.model,
t1.is_new,
t1.channel,
t1.os,
t1.area_code,
t1.version_code,
t1.visit_count,
t3.page_stats
from
(
select
mid_id,
brand,
model,
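if(array_contains(collect_set(is_new),'0'),'0','1') is_new,--in ods_page_log, the is_new values for one device within a single day may be all 1, all 0, or a mix of both (uninstall and reinstall), so the device counts as not new if any value is 0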
collect_set(channel) channel,
collect_set(os) os,
collect_set(area_code) area_code,
collect_set(version_code) version_code,
sum(if(last_page_id is null,1,0)) visit_count
from ${APP}.dwd_page_log
where dt='$do_date'
and last_page_id is null
group by mid_id,model,brand
)t1
join
(
select
mid_id,
brand,
model,
collect_set(named_struct('page_id',page_id,'page_count',page_count,'during_time',during_time)) page_stats
from
(
select
mid_id,
brand,
model,
page_id,
count(*) page_count,
sum(during_time) during_time
from ${APP}.dwd_page_log
where dt='$do_date'
group by mid_id,model,brand,page_id
)t2
group by mid_id,model,brand
)t3
on t1.mid_id=t3.mid_id
and t1.brand=t3.brand
and t1.model=t3.model;
"
dws_area_stats_daycount="
set hive.exec.dynamic.partition.mode=nonstrict;
with
tmp_vu as
(
select
dt,
id province_id,
visit_count,
login_count,
visitor_count,
user_count
from
(
select
dt,
area_code,
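count(*) visit_count,--number of visits by all visitors
count(user_id) login_count,--number of visits by logged-in users, equivalent to sum(if(user_id is not null,1,0))
count(distinct(mid_id)) visitor_count,--number of distinct visitors
count(distinct(user_id)) user_count--number of distinct logged-in users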
from ${APP}.dwd_page_log
where last_page_id is null
group by dt,area_code
)tmp
left join ${APP}.dim_base_province area
on tmp.area_code=area.area_code
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
province_id,
count(*) order_count,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
group by date_format(create_time,'yyyy-MM-dd'),province_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
province_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
group by date_format(callback_time,'yyyy-MM-dd'),province_id
),
tmp_ro as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
province_id,
count(*) refund_order_count,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),province_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
province_id,
count(*) refund_payment_count,
sum(refund_amount) refund_payment_amount
from ${APP}.dwd_refund_payment
group by date_format(callback_time,'yyyy-MM-dd'),province_id
)
insert overwrite table ${APP}.dws_area_stats_daycount partition(dt)
select
province_id,
sum(visit_count),
sum(login_count),
sum(visitor_count),
sum(user_count),
sum(order_count),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_amount),
dt
from
(
select
dt,
province_id,
visit_count,
login_count,
visitor_count,
user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_vu
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
order_count,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_order
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_pay
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
refund_order_count,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_ro
union all
select
dt,
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
refund_payment_count,
refund_payment_amount
from tmp_rp
)t1
group by dt,province_id;
"
dws_user_action_daycount="
set hive.exec.dynamic.partition.mode=nonstrict;
with
tmp_login as
(
select
dt,
user_id,
count(*) login_count
from ${APP}.dwd_page_log
where user_id is not null
and last_page_id is null
group by dt,user_id
),
tmp_cf as
(
select
dt,
user_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where user_id is not null
and action_id in ('cart_add','favor_add')
group by dt,user_id
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
count(*) order_count,
sum(if(activity_reduce_amount>0,1,0)) order_activity_count,
sum(if(coupon_reduce_amount>0,1,0)) order_coupon_count,
sum(activity_reduce_amount) order_activity_reduce_amount,
sum(coupon_reduce_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
group by date_format(callback_time,'yyyy-MM-dd'),user_id
),
tmp_ri as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
rp.user_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(rp.refund_amount) refund_payment_amount
from
(
select
user_id,
order_id,
sku_id,
refund_amount,
callback_time
from ${APP}.dwd_refund_payment
)rp
left join
(
select
user_id,
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by date_format(callback_time,'yyyy-MM-dd'),rp.user_id
),
tmp_coupon as
(
select
coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt) dt,
coalesce(coupon_get.user_id,coupon_using.user_id,coupon_used.user_id) user_id,
nvl(coupon_get_count,0) coupon_get_count,
nvl(coupon_using_count,0) coupon_using_count,
nvl(coupon_used_count,0) coupon_used_count
from
(
select
date_format(get_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_get_count
from ${APP}.dwd_coupon_use
where get_time is not null
group by user_id,date_format(get_time,'yyyy-MM-dd')
)coupon_get
full outer join
(
select
date_format(using_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_using_count
from ${APP}.dwd_coupon_use
where using_time is not null
group by user_id,date_format(using_time,'yyyy-MM-dd')
)coupon_using
on coupon_get.dt=coupon_using.dt
and coupon_get.user_id=coupon_using.user_id
full outer join
(
select
date_format(used_time,'yyyy-MM-dd') dt,
user_id,
count(*) coupon_used_count
from ${APP}.dwd_coupon_use
where used_time is not null
group by user_id,date_format(used_time,'yyyy-MM-dd')
)coupon_used
on nvl(coupon_get.dt,coupon_using.dt)=coupon_used.dt
and nvl(coupon_get.user_id,coupon_using.user_id)=coupon_used.user_id
),
tmp_comment as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
group by date_format(create_time,'yyyy-MM-dd'),user_id
),
tmp_od as
(
select
dt,
user_id,
collect_set(named_struct('sku_id',sku_id,'sku_num',sku_num,'order_count',order_count,'activity_reduce_amount',activity_reduce_amount,'coupon_reduce_amount',coupon_reduce_amount,'original_amount',original_amount,'final_amount',final_amount)) order_detail_stats
from
(
select
date_format(create_time,'yyyy-MM-dd') dt,
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
cast(sum(split_activity_amount) as decimal(16,2)) activity_reduce_amount,
cast(sum(split_coupon_amount) as decimal(16,2)) coupon_reduce_amount,
cast(sum(original_amount) as decimal(16,2)) original_amount,
cast(sum(split_final_amount) as decimal(16,2)) final_amount
from ${APP}.dwd_order_detail
group by date_format(create_time,'yyyy-MM-dd'),user_id,sku_id
)t1
group by dt,user_id
)
insert overwrite table ${APP}.dws_user_action_daycount partition(dt)
select
coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id,tmp_od.user_id),
nvl(login_count,0),
nvl(cart_count,0),
nvl(favor_count,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0),
order_detail_stats,
coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt,tmp_coupon.dt,tmp_od.dt)
from tmp_login
full outer join tmp_cf
on tmp_login.user_id=tmp_cf.user_id
and tmp_login.dt=tmp_cf.dt
full outer join tmp_order
on coalesce(tmp_login.user_id,tmp_cf.user_id)=tmp_order.user_id
and coalesce(tmp_login.dt,tmp_cf.dt)=tmp_order.dt
full outer join tmp_pay
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id)=tmp_pay.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt)=tmp_pay.dt
full outer join tmp_ri
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id)=tmp_ri.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt)=tmp_ri.dt
full outer join tmp_rp
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id)=tmp_rp.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt)=tmp_rp.dt
full outer join tmp_comment
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id)=tmp_comment.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt)=tmp_comment.dt
full outer join tmp_coupon
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id)=tmp_coupon.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt)=tmp_coupon.dt
full outer join tmp_od
on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id)=tmp_od.user_id
and coalesce(tmp_login.dt,tmp_cf.dt,tmp_order.dt,tmp_pay.dt,tmp_ri.dt,tmp_rp.dt,tmp_comment.dt,tmp_coupon.dt)=tmp_od.dt;
"
dws_activity_info_daycount="
set hive.exec.dynamic.partition.mode=nonstrict;
with
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
activity_rule_id,
activity_id,
count(*) order_count,
sum(split_activity_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where activity_id is not null
group by date_format(create_time,'yyyy-MM-dd'),activity_rule_id,activity_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
activity_rule_id,
activity_id,
count(*) payment_count,
sum(split_activity_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from
(
select
activity_rule_id,
activity_id,
order_id,
split_activity_amount,
split_final_amount
from ${APP}.dwd_order_detail
where activity_id is not null
)od
join
(
select
order_id,
callback_time
from ${APP}.dwd_payment_info
)pi
on od.order_id=pi.order_id
group by date_format(callback_time,'yyyy-MM-dd'),activity_rule_id,activity_id
)
insert overwrite table ${APP}.dws_activity_info_daycount partition(dt)
select
activity_rule_id,
activity_id,
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
dt
from
(
select
dt,
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount
from tmp_order
union all
select
dt,
activity_rule_id,
activity_id,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from tmp_pay
)t1
group by dt,activity_rule_id,activity_id;"
dws_sku_action_daycount="
set hive.exec.dynamic.partition.mode=nonstrict;
with
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(if(split_activity_amount>0,1,0)) order_activity_count,
sum(if(split_coupon_amount>0,1,0)) order_coupon_count,
sum(split_activity_amount) order_activity_reduce_amount,
sum(split_coupon_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
group by date_format(create_time,'yyyy-MM-dd'),sku_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail od
join
(
select
order_id,
callback_time
from ${APP}.dwd_payment_info
where callback_time is not null
)pi on pi.order_id=od.order_id
group by date_format(callback_time,'yyyy-MM-dd'),sku_id
),
tmp_ri as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
group by date_format(create_time,'yyyy-MM-dd'),sku_id
),
tmp_rp as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
rp.sku_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(refund_amount) refund_payment_amount
from
(
select
order_id,
sku_id,
refund_amount,
callback_time
from ${APP}.dwd_refund_payment
)rp
left join
(
select
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by date_format(callback_time,'yyyy-MM-dd'),rp.sku_id
),
tmp_cf as
(
select
dt,
item sku_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where action_id in ('cart_add','favor_add')
group by dt,item
),
tmp_comment as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
group by date_format(create_time,'yyyy-MM-dd'),sku_id
)
insert overwrite table ${APP}.dws_sku_action_daycount partition(dt)
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_activity_count),
sum(order_coupon_count),
sum(order_activity_reduce_amount),
sum(order_coupon_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_num),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_num),
sum(refund_payment_amount),
sum(cart_count),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count),
dt
from
(
select
dt,
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_num,
payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_pay
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_ri
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_rp
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
cart_count,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cf
union all
select
dt,
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_comment
)t1
group by dt,sku_id;"
dws_coupon_info_daycount="
set hive.exec.dynamic.partition.mode=nonstrict;
with
tmp_cu as
(
select
coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt,coupon_exprie.dt) dt,
coalesce(coupon_get.coupon_id,coupon_using.coupon_id,coupon_used.coupon_id,coupon_exprie.coupon_id) coupon_id,
nvl(get_count,0) get_count,
nvl(order_count,0) order_count,
nvl(payment_count,0) payment_count,
nvl(expire_count,0) expire_count
from
(
select
date_format(get_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) get_count
from ${APP}.dwd_coupon_use
group by date_format(get_time,'yyyy-MM-dd'),coupon_id
)coupon_get
full outer join
(
select
date_format(using_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) order_count
from ${APP}.dwd_coupon_use
where using_time is not null
group by date_format(using_time,'yyyy-MM-dd'),coupon_id
)coupon_using
on coupon_get.dt=coupon_using.dt
and coupon_get.coupon_id=coupon_using.coupon_id
full outer join
(
select
date_format(used_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) payment_count
from ${APP}.dwd_coupon_use
where used_time is not null
group by date_format(used_time,'yyyy-MM-dd'),coupon_id
)coupon_used
on nvl(coupon_get.dt,coupon_using.dt)=coupon_used.dt
and nvl(coupon_get.coupon_id,coupon_using.coupon_id)=coupon_used.coupon_id
full outer join
(
select
date_format(expire_time,'yyyy-MM-dd') dt,
coupon_id,
count(*) expire_count
from ${APP}.dwd_coupon_use
where expire_time is not null
group by date_format(expire_time,'yyyy-MM-dd'),coupon_id
)coupon_exprie
on coalesce(coupon_get.dt,coupon_using.dt,coupon_used.dt)=coupon_exprie.dt
and coalesce(coupon_get.coupon_id,coupon_using.coupon_id,coupon_used.coupon_id)=coupon_exprie.coupon_id
),
tmp_order as
(
select
date_format(create_time,'yyyy-MM-dd') dt,
coupon_id,
sum(split_coupon_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where coupon_id is not null
group by date_format(create_time,'yyyy-MM-dd'),coupon_id
),
tmp_pay as
(
select
date_format(callback_time,'yyyy-MM-dd') dt,
coupon_id,
sum(split_coupon_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from
(
select
order_id,
coupon_id,
split_coupon_amount,
split_final_amount
from ${APP}.dwd_order_detail
where coupon_id is not null
)od
join
(
select
order_id,
callback_time
from ${APP}.dwd_payment_info
)pi
on od.order_id=pi.order_id
group by date_format(callback_time,'yyyy-MM-dd'),coupon_id
)
insert overwrite table ${APP}.dws_coupon_info_daycount partition(dt)
select
coupon_id,
sum(get_count),
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
sum(expire_count),
dt
from
(
select
dt,
coupon_id,
get_count,
order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
0 payment_reduce_amount,
0 payment_amount,
expire_count
from tmp_cu
union all
select
dt,
coupon_id,
0 get_count,
0 order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount,
0 expire_count
from tmp_order
union all
select
dt,
coupon_id,
0 get_count,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
payment_reduce_amount,
payment_amount,
0 expire_count
from tmp_pay
)t1
group by dt,coupon_id;
"
case $1 in
"dws_visitor_action_daycount" )
hive -e "$dws_visitor_action_daycount"
;;
"dws_user_action_daycount" )
hive -e "$dws_user_action_daycount"
;;
"dws_activity_info_daycount" )
hive -e "$dws_activity_info_daycount"
;;
"dws_area_stats_daycount" )
hive -e "$dws_area_stats_daycount"
;;
"dws_sku_action_daycount" )
hive -e "$dws_sku_action_daycount"
;;
"dws_coupon_info_daycount" )
hive -e "$dws_coupon_info_daycount"
;;
"all" )
hive -e "$dws_visitor_action_daycount$dws_user_action_daycount$dws_activity_info_daycount$dws_area_stats_daycount$dws_sku_action_daycount$dws_coupon_info_daycount"
;;
esac
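The script above takes the table name as $1 and the date as $2, and its case statement silently does nothing when $1 matches no branch. The following is a minimal sketch of a pre-flight check that could run before the case statement. The function name check_args and its messages are assumptions and not part of the original script.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the arguments of dwd_to_dws_init.sh.
# Returns 0 when the table name and the date look usable, 1 otherwise.
check_args() {
    local table="$1"
    local do_date="$2"
    local date_re='^[0-9]{4}-[0-9]{2}-[0-9]{2}$'

    # Accept only the names handled by the case statement of the script.
    case "$table" in
        dws_visitor_action_daycount|dws_user_action_daycount|dws_activity_info_daycount|dws_area_stats_daycount|dws_sku_action_daycount|dws_coupon_info_daycount|all)
            ;;
        *)
            echo "unknown table: '$table'" >&2
            return 1
            ;;
    esac

    # Shape check only (yyyy-MM-dd). It does not verify that the date exists.
    if [[ ! "$do_date" =~ $date_re ]]; then
        echo "date must look like yyyy-MM-dd, got: '$do_date'" >&2
        return 1
    fi
    return 0
}

check_args all 2020-06-14 && echo "arguments ok"
```

Inside the real script the call would look like check_args "$1" "$2" || exit 1, so a misspelled table name fails loudly instead of ending without running any statement.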
(2) Add execute permission
[atguigu@hadoop102 bin]$ chmod +x dwd_to_dws_init.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ dwd_to_dws_init.sh all 2020-06-14
(2) Check whether the data was loaded successfully
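Because the first-day script writes most tables through dynamic partitions, one way to check the result is to count rows per dt partition in each DWS table. The sketch below only builds the statements. The helper name build_dws_count_sql and the tbl and cnt column aliases are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: emits one rows-per-partition query for each DWS table
# loaded by dwd_to_dws_init.sh.
build_dws_count_sql() {
    local app="$1"  # Hive database name, e.g. gmall
    local table
    for table in dws_visitor_action_daycount dws_user_action_daycount \
                 dws_activity_info_daycount dws_area_stats_daycount \
                 dws_sku_action_daycount dws_coupon_info_daycount; do
        echo "select '${table}' tbl, dt, count(*) cnt from ${app}.${table} group by dt order by dt;"
    done
}

# Print the statements. To run them, hand them to the Hive CLI instead:
#   hive -e "$(build_dws_count_sql gmall)"
build_dws_count_sql gmall
```

A table with no rows at all, or with no rows for a date that has source data, is the first thing to look for in the output.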
4.4 DWS Layer Daily Data Load Script
1) Write the script
(1) Create the script dwd_to_dws.sh in the /home/atguigu/bin directory
#!/bin/bash
APP=gmall
# Use the date passed in if there is one, otherwise default to yesterday
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
dws_visitor_action_daycount="insert overwrite table ${APP}.dws_visitor_action_daycount partition(dt='$do_date')
select
t1.mid_id,
t1.brand,
t1.model,
t1.is_new,
t1.channel,
t1.os,
t1.area_code,
t1.version_code,
t1.visit_count,
t3.page_stats
from
(
select
mid_id,
brand,
model,
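if(array_contains(collect_set(is_new),'0'),'0','1') is_new,--in ods_page_log, the is_new values for one device within a single day may be all 1, all 0, or a mix of both (uninstall and reinstall), so the device counts as not new if any value is 0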
collect_set(channel) channel,
collect_set(os) os,
collect_set(area_code) area_code,
collect_set(version_code) version_code,
sum(if(last_page_id is null,1,0)) visit_count
from ${APP}.dwd_page_log
where dt='$do_date'
and last_page_id is null
group by mid_id,model,brand
)t1
join
(
select
mid_id,
brand,
model,
collect_set(named_struct('page_id',page_id,'page_count',page_count,'during_time',during_time)) page_stats
from
(
select
mid_id,
brand,
model,
page_id,
count(*) page_count,
sum(during_time) during_time
from ${APP}.dwd_page_log
where dt='$do_date'
group by mid_id,model,brand,page_id
)t2
group by mid_id,model,brand
)t3
on t1.mid_id=t3.mid_id
and t1.brand=t3.brand
and t1.model=t3.model;"
dws_user_action_daycount="
with
tmp_login as
(
select
user_id,
count(*) login_count
from ${APP}.dwd_page_log
where dt='$do_date'
and user_id is not null
and last_page_id is null
group by user_id
),
tmp_cf as
(
select
user_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where dt='$do_date'
and user_id is not null
and action_id in ('cart_add','favor_add')
group by user_id
),
tmp_order as
(
select
user_id,
count(*) order_count,
sum(if(activity_reduce_amount>0,1,0)) order_activity_count,
sum(if(coupon_reduce_amount>0,1,0)) order_coupon_count,
sum(activity_reduce_amount) order_activity_reduce_amount,
sum(coupon_reduce_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
where (dt='$do_date'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='$do_date'
group by user_id
),
tmp_pay as
(
select
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
where dt='$do_date'
group by user_id
),
tmp_ri as
(
select
user_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by user_id
),
tmp_rp as
(
select
rp.user_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(rp.refund_amount) refund_payment_amount
from
(
select
user_id,
order_id,
sku_id,
refund_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
)rp
left join
(
select
user_id,
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
where dt>=date_add('$do_date',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.user_id
),
tmp_coupon as
(
select
user_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_using_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_used_count
from ${APP}.dwd_coupon_use
where (dt='$do_date' or dt='9999-99-99')
and (date_format(get_time, 'yyyy-MM-dd') = '$do_date'
or date_format(using_time,'yyyy-MM-dd')='$do_date'
or date_format(used_time,'yyyy-MM-dd')='$do_date')
group by user_id
),
tmp_comment as
(
select
user_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
where dt='$do_date'
group by user_id
),
tmp_od as
(
select
user_id,
collect_set(named_struct('sku_id',sku_id,'sku_num',sku_num,'order_count',order_count,'activity_reduce_amount',activity_reduce_amount,'coupon_reduce_amount',coupon_reduce_amount,'original_amount',original_amount,'final_amount',final_amount)) order_detail_stats
from
(
select
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
cast(sum(split_activity_amount) as decimal(16,2)) activity_reduce_amount,
cast(sum(split_coupon_amount) as decimal(16,2)) coupon_reduce_amount,
cast(sum(original_amount) as decimal(16,2)) original_amount,
cast(sum(split_final_amount) as decimal(16,2)) final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
group by user_id,sku_id
)t1
group by user_id
)
insert overwrite table ${APP}.dws_user_action_daycount partition(dt='$do_date')
select
coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id,tmp_od.user_id),
nvl(login_count,0),
nvl(cart_count,0),
nvl(favor_count,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0),
order_detail_stats
from tmp_login
full outer join tmp_cf on tmp_login.user_id=tmp_cf.user_id
full outer join tmp_order on coalesce(tmp_login.user_id,tmp_cf.user_id)=tmp_order.user_id
full outer join tmp_pay on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id)=tmp_pay.user_id
full outer join tmp_ri on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id)=tmp_ri.user_id
full outer join tmp_rp on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id)=tmp_rp.user_id
full outer join tmp_comment on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id)=tmp_comment.user_id
full outer join tmp_coupon on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id)=tmp_coupon.user_id
full outer join tmp_od on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id)=tmp_od.user_id;
"
dws_activity_info_daycount="
with
tmp_order as
(
select
activity_rule_id,
activity_id,
count(*) order_count,
sum(split_activity_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
and activity_id is not null
group by activity_rule_id,activity_id
),
tmp_pay as
(
select
activity_rule_id,
activity_id,
count(*) payment_count,
sum(split_activity_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and activity_id is not null
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by activity_rule_id,activity_id
)
insert overwrite table ${APP}.dws_activity_info_daycount partition(dt='$do_date')
select
activity_rule_id,
activity_id,
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount)
from
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount
from tmp_order
union all
select
activity_rule_id,
activity_id,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from tmp_pay
)t1
group by activity_rule_id,activity_id;"
dws_sku_action_daycount="
with
tmp_order as
(
select
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(if(split_activity_amount>0,1,0)) order_activity_count,
sum(if(split_coupon_amount>0,1,0)) order_coupon_count,
sum(split_activity_amount) order_activity_reduce_amount,
sum(split_coupon_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
group by sku_id
),
tmp_pay as
(
select
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by sku_id
),
tmp_ri as
(
select
sku_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by sku_id
),
tmp_rp as
(
select
rp.sku_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(refund_amount) refund_payment_amount
from
(
select
order_id,
sku_id,
refund_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
)rp
left join
(
select
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
where dt>=date_add('$do_date',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.sku_id
),
tmp_cf as
(
select
item sku_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where dt='$do_date'
and action_id in ('cart_add','favor_add')
group by item
),
tmp_comment as
(
select
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
where dt='$do_date'
group by sku_id
)
insert overwrite table ${APP}.dws_sku_action_daycount partition(dt='$do_date')
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_activity_count),
sum(order_coupon_count),
sum(order_activity_reduce_amount),
sum(order_coupon_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_num),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_num),
sum(refund_payment_amount),
sum(cart_count),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count)
from
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_num,
payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_pay
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_ri
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_rp
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
cart_count,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cf
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_comment
)t1
group by sku_id;"
dws_coupon_info_daycount="
with
tmp_cu as
(
select
coupon_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='$do_date',1,0)) get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='$do_date',1,0)) order_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='$do_date',1,0)) payment_count,
sum(if(date_format(expire_time,'yyyy-MM-dd')='$do_date',1,0)) expire_count
from ${APP}.dwd_coupon_use
where dt='9999-99-99'
or dt='$do_date'
group by coupon_id
),
tmp_order as
(
select
coupon_id,
sum(split_coupon_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
and coupon_id is not null
group by coupon_id
),
tmp_pay as
(
select
coupon_id,
sum(split_coupon_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and coupon_id is not null
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by coupon_id
)
insert overwrite table ${APP}.dws_coupon_info_daycount partition(dt='$do_date')
select
coupon_id,
sum(get_count),
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
sum(expire_count)
from
(
select
coupon_id,
get_count,
order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
0 payment_reduce_amount,
0 payment_amount,
expire_count
from tmp_cu
union all
select
coupon_id,
0 get_count,
0 order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount,
0 expire_count
from tmp_order
union all
select
coupon_id,
0 get_count,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
payment_reduce_amount,
payment_amount,
0 expire_count
from tmp_pay
)t1
group by coupon_id;"
dws_area_stats_daycount="
with
tmp_vu as
(
select
id province_id,
visit_count,
login_count,
visitor_count,
user_count
from
(
select
area_code,
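count(*) visit_count, -- number of visits by all visitors
count(user_id) login_count, -- number of visits by logged-in users, equivalent to sum(if(user_id is not null,1,0))
count(distinct(mid_id)) visitor_count, -- number of distinct visitors (devices)
count(distinct(user_id)) user_count -- number of distinct logged-in users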
from ${APP}.dwd_page_log
where dt='$do_date'
and last_page_id is null
group by area_code
)tmp
left join ${APP}.dim_base_province area
on tmp.area_code=area.area_code
),
tmp_order as
(
select
province_id,
count(*) order_count,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
where (dt='$do_date'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='$do_date'
group by province_id
),
tmp_pay as
(
select
province_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
where dt='$do_date'
group by province_id
),
tmp_ro as
(
select
province_id,
count(*) refund_order_count,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by province_id
),
tmp_rp as
(
select
province_id,
count(*) refund_payment_count,
sum(refund_amount) refund_payment_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
group by province_id
)
insert overwrite table ${APP}.dws_area_stats_daycount partition(dt='$do_date')
select
province_id,
sum(visit_count),
sum(login_count),
sum(visitor_count),
sum(user_count),
sum(order_count),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_amount)
from
(
select
province_id,
visit_count,
login_count,
visitor_count,
user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_vu
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
order_count,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_order
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_pay
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
refund_order_count,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_ro
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
refund_payment_count,
refund_payment_amount
from tmp_rp
)t1
group by province_id;"
case $1 in
"dws_visitor_action_daycount" )
hive -e "$dws_visitor_action_daycount"
;;
"dws_user_action_daycount" )
hive -e "$dws_user_action_daycount"
;;
"dws_activity_info_daycount" )
hive -e "$dws_activity_info_daycount"
;;
"dws_area_stats_daycount" )
hive -e "$dws_area_stats_daycount"
;;
"dws_sku_action_daycount" )
hive -e "$dws_sku_action_daycount"
;;
"dws_coupon_info_daycount" )
hive -e "$dws_coupon_info_daycount"
;;
"all" )
hive -e "$dws_visitor_action_daycount$dws_user_action_daycount$dws_activity_info_daycount$dws_area_stats_daycount$dws_sku_action_daycount$dws_coupon_info_daycount"
;;
esac
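The argument handling of the script above (first argument selects the table, optional second argument gives the date and defaults to yesterday) can be sketched in isolation as follows. This is only a minimal sketch under stated assumptions: the function names resolve_date and pick_sql are hypothetical and do not appear in the original script, the SQL text is a shortened stand-in for the real queries, and date -d assumes GNU date.

```shell
#!/bin/bash
# Hypothetical helper: return the given date, or yesterday when none is given.
# Assumes GNU date, as used by the script above.
resolve_date() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        date -d "-1 day" +%F
    fi
}

# Hypothetical helper: map a table name to its statement.
# The SQL text here is a shortened stand-in, not the real query.
pick_sql() {
    local table=$1
    local do_date=$2
    case "$table" in
        "dws_visitor_action_daycount" | "dws_user_action_daycount" )
            echo "insert overwrite table gmall.${table} partition(dt='${do_date}')"
            ;;
        * )
            # Unlike the script above, reject unknown names instead of doing nothing.
            echo "unknown table: ${table}" >&2
            return 1
            ;;
    esac
}
```

For example, pick_sql dws_user_action_daycount "$(resolve_date "$2")" prints the statement for the requested date. In the real script the resulting string would be handed to hive -e rather than printed. The default branch is an addition for illustration: the original case statement silently does nothing when the table name is not recognized.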
#!/bin/bash
APP=gmall
# Use the date passed as the second argument if one is given, otherwise default to the day before today
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
dws_visitor_action_daycount="insert overwrite table ${APP}.dws_visitor_action_daycount partition(dt='$do_date')
select
t1.mid_id,
t1.brand,
t1.model,
t1.is_new,
t1.channel,
t1.os,
t1.area_code,
t1.version_code,
t1.visit_count,
t3.page_stats
from
(
select
mid_id,
brand,
model,
if(array_contains(collect_set(is_new),'0'),'0','1') is_new, -- within one day the is_new values of a single device in ods_page_log may be all 1, all 0, or a mix of 0 and 1 (uninstall and reinstall), hence this handling
collect_set(channel) channel,
collect_set(os) os,
collect_set(area_code) area_code,
collect_set(version_code) version_code,
sum(if(last_page_id is null,1,0)) visit_count
from ${APP}.dwd_page_log
where dt='$do_date'
and last_page_id is null
group by mid_id,model,brand
)t1
join
(
select
mid_id,
brand,
model,
collect_set(named_struct('page_id',page_id,'page_count',page_count,'during_time',during_time)) page_stats
from
(
select
mid_id,
brand,
model,
page_id,
count(*) page_count,
sum(during_time) during_time
from ${APP}.dwd_page_log
where dt='$do_date'
group by mid_id,model,brand,page_id
)t2
group by mid_id,model,brand
)t3
on t1.mid_id=t3.mid_id
and t1.brand=t3.brand
and t1.model=t3.model;"
dws_user_action_daycount="
with
tmp_login as
(
select
user_id,
count(*) login_count
from ${APP}.dwd_page_log
where dt='$do_date'
and user_id is not null
and last_page_id is null
group by user_id
),
tmp_cf as
(
select
user_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where dt='$do_date'
and user_id is not null
and action_id in ('cart_add','favor_add')
group by user_id
),
tmp_order as
(
select
user_id,
count(*) order_count,
sum(if(activity_reduce_amount>0,1,0)) order_activity_count,
sum(if(coupon_reduce_amount>0,1,0)) order_coupon_count,
sum(activity_reduce_amount) order_activity_reduce_amount,
sum(coupon_reduce_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
where (dt='$do_date'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='$do_date'
group by user_id
),
tmp_pay as
(
select
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
where dt='$do_date'
group by user_id
),
tmp_ri as
(
select
user_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by user_id
),
tmp_rp as
(
select
rp.user_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(rp.refund_amount) refund_payment_amount
from
(
select
user_id,
order_id,
sku_id,
refund_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
)rp
left join
(
select
user_id,
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
where dt>=date_add('$do_date',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.user_id
),
tmp_coupon as
(
select
user_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_using_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='$do_date',1,0)) coupon_used_count
from ${APP}.dwd_coupon_use
where (dt='$do_date' or dt='9999-99-99')
and (date_format(get_time, 'yyyy-MM-dd') = '$do_date'
or date_format(using_time,'yyyy-MM-dd')='$do_date'
or date_format(used_time,'yyyy-MM-dd')='$do_date')
group by user_id
),
tmp_comment as
(
select
user_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
where dt='$do_date'
group by user_id
),
tmp_od as
(
select
user_id,
collect_set(named_struct('sku_id',sku_id,'sku_num',sku_num,'order_count',order_count,'activity_reduce_amount',activity_reduce_amount,'coupon_reduce_amount',coupon_reduce_amount,'original_amount',original_amount,'final_amount',final_amount)) order_detail_stats
from
(
select
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
cast(sum(split_activity_amount) as decimal(16,2)) activity_reduce_amount,
cast(sum(split_coupon_amount) as decimal(16,2)) coupon_reduce_amount,
cast(sum(original_amount) as decimal(16,2)) original_amount,
cast(sum(split_final_amount) as decimal(16,2)) final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
group by user_id,sku_id
)t1
group by user_id
)
insert overwrite table ${APP}.dws_user_action_daycount partition(dt='$do_date')
select
coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id,tmp_od.user_id),
nvl(login_count,0),
nvl(cart_count,0),
nvl(favor_count,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0),
order_detail_stats
from tmp_login
full outer join tmp_cf on tmp_login.user_id=tmp_cf.user_id
full outer join tmp_order on coalesce(tmp_login.user_id,tmp_cf.user_id)=tmp_order.user_id
full outer join tmp_pay on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id)=tmp_pay.user_id
full outer join tmp_ri on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id)=tmp_ri.user_id
full outer join tmp_rp on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id)=tmp_rp.user_id
full outer join tmp_comment on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id)=tmp_comment.user_id
full outer join tmp_coupon on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id)=tmp_coupon.user_id
full outer join tmp_od on coalesce(tmp_login.user_id,tmp_cf.user_id,tmp_order.user_id,tmp_pay.user_id,tmp_ri.user_id,tmp_rp.user_id,tmp_comment.user_id,tmp_coupon.user_id)=tmp_od.user_id;
"
dws_activity_info_daycount="
with
tmp_order as
(
select
activity_rule_id,
activity_id,
count(*) order_count,
sum(split_activity_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
and activity_id is not null
group by activity_rule_id,activity_id
),
tmp_pay as
(
select
activity_rule_id,
activity_id,
count(*) payment_count,
sum(split_activity_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and activity_id is not null
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by activity_rule_id,activity_id
)
insert overwrite table ${APP}.dws_activity_info_daycount partition(dt='$do_date')
select
activity_rule_id,
activity_id,
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount)
from
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount
from tmp_order
union all
select
activity_rule_id,
activity_id,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from tmp_pay
)t1
group by activity_rule_id,activity_id;"
dws_sku_action_daycount="
with
tmp_order as
(
select
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(if(split_activity_amount>0,1,0)) order_activity_count,
sum(if(split_coupon_amount>0,1,0)) order_coupon_count,
sum(split_activity_amount) order_activity_reduce_amount,
sum(split_coupon_amount) order_coupon_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
group by sku_id
),
tmp_pay as
(
select
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by sku_id
),
tmp_ri as
(
select
sku_id,
count(*) refund_order_count,
sum(refund_num) refund_order_num,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by sku_id
),
tmp_rp as
(
select
rp.sku_id,
count(*) refund_payment_count,
sum(ri.refund_num) refund_payment_num,
sum(refund_amount) refund_payment_amount
from
(
select
order_id,
sku_id,
refund_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
)rp
left join
(
select
order_id,
sku_id,
refund_num
from ${APP}.dwd_order_refund_info
where dt>=date_add('$do_date',-15)
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
group by rp.sku_id
),
tmp_cf as
(
select
item sku_id,
sum(if(action_id='cart_add',1,0)) cart_count,
sum(if(action_id='favor_add',1,0)) favor_count
from ${APP}.dwd_action_log
where dt='$do_date'
and action_id in ('cart_add','favor_add')
group by item
),
tmp_comment as
(
select
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from ${APP}.dwd_comment_info
where dt='$do_date'
group by sku_id
)
insert overwrite table ${APP}.dws_sku_action_daycount partition(dt='$do_date')
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_activity_count),
sum(order_coupon_count),
sum(order_activity_reduce_amount),
sum(order_coupon_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_num),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_num),
sum(refund_payment_amount),
sum(cart_count),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count)
from
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_num,
payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_pay
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_ri
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
0 cart_count,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_rp
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
cart_count,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cf
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_activity_count,
0 order_coupon_count,
0 order_activity_reduce_amount,
0 order_coupon_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_order_count,
0 refund_order_num,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_num,
0 refund_payment_amount,
0 cart_count,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_comment
)t1
group by sku_id;"
dws_coupon_info_daycount="
with
tmp_cu as
(
select
coupon_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='$do_date',1,0)) get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='$do_date',1,0)) order_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='$do_date',1,0)) payment_count,
sum(if(date_format(expire_time,'yyyy-MM-dd')='$do_date',1,0)) expire_count
from ${APP}.dwd_coupon_use
where dt='9999-99-99'
or dt='$do_date'
group by coupon_id
),
tmp_order as
(
select
coupon_id,
sum(split_coupon_amount) order_reduce_amount,
sum(original_amount) order_original_amount,
sum(split_final_amount) order_final_amount
from ${APP}.dwd_order_detail
where dt='$do_date'
and coupon_id is not null
group by coupon_id
),
tmp_pay as
(
select
coupon_id,
sum(split_coupon_amount) payment_reduce_amount,
sum(split_final_amount) payment_amount
from ${APP}.dwd_order_detail
where (dt='$do_date'
or dt=date_add('$do_date',-1))
and coupon_id is not null
and order_id in
(
select order_id from ${APP}.dwd_payment_info where dt='$do_date'
)
group by coupon_id
)
insert overwrite table ${APP}.dws_coupon_info_daycount partition(dt='$do_date')
select
coupon_id,
sum(get_count),
sum(order_count),
sum(order_reduce_amount),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_reduce_amount),
sum(payment_amount),
sum(expire_count)
from
(
select
coupon_id,
get_count,
order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
payment_count,
0 payment_reduce_amount,
0 payment_amount,
expire_count
from tmp_cu
union all
select
coupon_id,
0 get_count,
0 order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_reduce_amount,
0 payment_amount,
0 expire_count
from tmp_order
union all
select
coupon_id,
0 get_count,
0 order_count,
0 order_reduce_amount,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
payment_reduce_amount,
payment_amount,
0 expire_count
from tmp_pay
)t1
group by coupon_id;"
dws_area_stats_daycount="
with
tmp_vu as
(
select
id province_id,
visit_count,
login_count,
visitor_count,
user_count
from
(
select
area_code,
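count(*) visit_count, -- number of visits by all visitors
count(user_id) login_count, -- number of visits by logged-in users, equivalent to sum(if(user_id is not null,1,0))
count(distinct(mid_id)) visitor_count, -- number of distinct visitors (devices)
count(distinct(user_id)) user_count -- number of distinct logged-in users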
from ${APP}.dwd_page_log
where dt='$do_date'
and last_page_id is null
group by area_code
)tmp
left join ${APP}.dim_base_province area
on tmp.area_code=area.area_code
),
tmp_order as
(
select
province_id,
count(*) order_count,
sum(original_amount) order_original_amount,
sum(final_amount) order_final_amount
from ${APP}.dwd_order_info
where (dt='$do_date'
or dt='9999-99-99')
and date_format(create_time,'yyyy-MM-dd')='$do_date'
group by province_id
),
tmp_pay as
(
select
province_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from ${APP}.dwd_payment_info
where dt='$do_date'
group by province_id
),
tmp_ro as
(
select
province_id,
count(*) refund_order_count,
sum(refund_amount) refund_order_amount
from ${APP}.dwd_order_refund_info
where dt='$do_date'
group by province_id
),
tmp_rp as
(
select
province_id,
count(*) refund_payment_count,
sum(refund_amount) refund_payment_amount
from ${APP}.dwd_refund_payment
where dt='$do_date'
group by province_id
)
insert overwrite table ${APP}.dws_area_stats_daycount partition(dt='$do_date')
select
province_id,
sum(visit_count),
sum(login_count),
sum(visitor_count),
sum(user_count),
sum(order_count),
sum(order_original_amount),
sum(order_final_amount),
sum(payment_count),
sum(payment_amount),
sum(refund_order_count),
sum(refund_order_amount),
sum(refund_payment_count),
sum(refund_payment_amount)
from
(
select
province_id,
visit_count,
login_count,
visitor_count,
user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_vu
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
order_count,
order_original_amount,
order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_order
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
payment_count,
payment_amount,
0 refund_order_count,
0 refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_pay
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
refund_order_count,
refund_order_amount,
0 refund_payment_count,
0 refund_payment_amount
from tmp_ro
union all
select
province_id,
0 visit_count,
0 login_count,
0 visitor_count,
0 user_count,
0 order_count,
0 order_original_amount,
0 order_final_amount,
0 payment_count,
0 payment_amount,
0 refund_order_count,
0 refund_order_amount,
refund_payment_count,
refund_payment_amount
from tmp_rp
)t1
group by province_id;"
case $1 in
"dws_visitor_action_daycount" )
hive -e "$dws_visitor_action_daycount"
;;
"dws_user_action_daycount" )
hive -e "$dws_user_action_daycount"
;;
"dws_activity_info_daycount" )
hive -e "$dws_activity_info_daycount"
;;
"dws_area_stats_daycount" )
hive -e "$dws_area_stats_daycount"
;;
"dws_sku_action_daycount" )
hive -e "$dws_sku_action_daycount"
;;
"dws_coupon_info_daycount" )
hive -e "$dws_coupon_info_daycount"
;;
"all" )
hive -e "$dws_visitor_action_daycount$dws_user_action_daycount$dws_activity_info_daycount$dws_area_stats_daycount$dws_sku_action_daycount$dws_coupon_info_daycount"
;;
esac
(2) Grant execute permission
[atguigu@hadoop102 bin]$ chmod +x dwd_to_dws.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ dwd_to_dws.sh all 2020-06-14
(2) Check whether the data was loaded successfully
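One way to perform this check is to count the rows of the partition that was just written. The following is only a sketch: `build_check_sql` is a hypothetical helper that is not part of dwd_to_dws.sh, and the database name `gmall` is assumed from the `APP` variable used in the loading scripts. The helper only builds the query text.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): builds a query that
# counts the rows in one partition of a DWS table.
build_check_sql() {
    local table=$1          # e.g. dws_area_stats_daycount
    local do_date=$2        # partition date, e.g. 2020-06-14
    local app=${3:-gmall}   # database name; gmall is an assumption taken from APP
    echo "select count(*) from ${app}.${table} where dt='${do_date}';"
}

# Print the check query for every table loaded by dwd_to_dws.sh.
for t in dws_visitor_action_daycount dws_user_action_daycount dws_activity_info_daycount \
         dws_area_stats_daycount dws_sku_action_daycount dws_coupon_info_daycount; do
    build_check_sql "$t" 2020-06-14
done
```

Assuming the hive client is on the PATH, a single query could then be run as `hive -e "$(build_check_sql dws_area_stats_daycount 2020-06-14)"`. A count of 0 would suggest that the partition was not loaded.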
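Chapter 5 Data Warehouse Construction: DWT Layer
When building the DWS layer, we aggregated each topic by day and obtained the daily fact measures for every topic. In the DWT layer, we roll these topics up further to obtain one cumulative table per topic.
Each DWT topic wide table records, for its dimension, the measures of the related fact tables: measures accumulated over a given period, the first and last occurrence dates, and measures accumulated to date.
5.1 Device Topic Wide Table
The DWT device topic wide table rolls up the daily device behavior table further to obtain the details of each device. Every day, newly seen devices are added to the table, and fields such as the first visit date, last visit date and cumulative visit count are maintained, which makes later device-related requirements easier to implement.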
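The windowed measures in this table (last 7 days, last 30 days) are maintained incrementally: yesterday's window total, plus today's value, minus the value of the day that has just left the window. A minimal sketch of that arithmetic follows; the function name `roll_window` and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# roll_window OLD TODAY EXPIRED
#   OLD     : the window total stored in yesterday's DWT partition
#   TODAY   : the value from today's daily (DWS) row, 0 if the device was inactive
#   EXPIRED : the value of the day that has just dropped out of the window
roll_window() {
    echo $(( $1 + $2 - $3 ))
}

# Yesterday's 7-day visit count was 20, the device visited 5 times today,
# and the day leaving the window (7 days ago) contributed 3 visits.
roll_window 20 5 3    # prints 22
```

This avoids rescanning 7 or 30 daily partitions on every run: only yesterday's DWT partition and three daily partitions (today, 7 days ago, 30 days ago) have to be read.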
1) Table creation statement
DROP TABLE IF EXISTS dwt_visitor_topic;
CREATE EXTERNAL TABLE dwt_visitor_topic
(
`mid_id` STRING COMMENT 'device id',
`brand` STRING COMMENT 'phone brand',
`model` STRING COMMENT 'phone model',
`channel` ARRAY<STRING> COMMENT 'channel',
`os` ARRAY<STRING> COMMENT 'operating system',
`area_code` ARRAY<STRING> COMMENT 'area ID',
`version_code` ARRAY<STRING> COMMENT 'app version',
`visit_date_first` STRING COMMENT 'first visit date',
`visit_date_last` STRING COMMENT 'last visit date',
`visit_last_1d_count` BIGINT COMMENT 'visit count in the last 1 day',
`visit_last_1d_day_count` BIGINT COMMENT 'number of visit days in the last 1 day',
`visit_last_7d_count` BIGINT COMMENT 'visit count in the last 7 days',
`visit_last_7d_day_count` BIGINT COMMENT 'number of visit days in the last 7 days',
`visit_last_30d_count` BIGINT COMMENT 'visit count in the last 30 days',
`visit_last_30d_day_count` BIGINT COMMENT 'number of visit days in the last 30 days',
`visit_count` BIGINT COMMENT 'cumulative visit count',
`visit_day_count` BIGINT COMMENT 'cumulative number of visit days'
) COMMENT 'device topic wide table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_visitor_topic'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
insert overwrite table dwt_visitor_topic partition(dt='2020-06-14')
select
nvl(1d_ago.mid_id,old.mid_id),
nvl(1d_ago.brand,old.brand),
nvl(1d_ago.model,old.model),
nvl(1d_ago.channel,old.channel),
nvl(1d_ago.os,old.os),
nvl(1d_ago.area_code,old.area_code),
nvl(1d_ago.version_code,old.version_code),
case when old.mid_id is null and 1d_ago.is_new=1 then '2020-06-14'
when old.mid_id is null and 1d_ago.is_new=0 then '2020-06-13'-- the exact first visit date cannot be determined, so assign a date earlier than the warehouse build date
else old.visit_date_first end,
if(1d_ago.mid_id is not null,'2020-06-14',old.visit_date_last),
nvl(1d_ago.visit_count,0),
if(1d_ago.mid_id is null,0,1),
nvl(old.visit_last_7d_count,0)+nvl(1d_ago.visit_count,0)- nvl(7d_ago.visit_count,0),
nvl(old.visit_last_7d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(7d_ago.mid_id is null,0,1),
nvl(old.visit_last_30d_count,0)+nvl(1d_ago.visit_count,0)- nvl(30d_ago.visit_count,0),
nvl(old.visit_last_30d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(30d_ago.mid_id is null,0,1),
nvl(old.visit_count,0)+nvl(1d_ago.visit_count,0),
nvl(old.visit_day_count,0)+if(1d_ago.mid_id is null,0,1)
from
(
select
mid_id,
brand,
model,
channel,
os,
area_code,
version_code,
visit_date_first,
visit_date_last,
visit_last_1d_count,
visit_last_1d_day_count,
visit_last_7d_count,
visit_last_7d_day_count,
visit_last_30d_count,
visit_last_30d_day_count,
visit_count,
visit_day_count
from dwt_visitor_topic
where dt=date_add('2020-06-14',-1)
)old
full outer join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from dws_visitor_action_daycount
where dt='2020-06-14'
)1d_ago
on old.mid_id=1d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from dws_visitor_action_daycount
where dt=date_add('2020-06-14',-7)
)7d_ago
on old.mid_id=7d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from dws_visitor_action_daycount
where dt=date_add('2020-06-14',-30)
)30d_ago
on old.mid_id=30d_ago.mid_id;
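The `case when` that fills `visit_date_first` in the statement above distinguishes three situations. The sketch below restates that decision in shell form, purely as an illustration: the function `first_visit_date`, its arguments, and the stored date 2020-06-10 are hypothetical.

```shell
#!/bin/bash
# first_visit_date IN_OLD IS_NEW OLD_FIRST TODAY FALLBACK
#   IN_OLD    : 1 if the device already has a row in yesterday's DWT partition, else 0
#   IS_NEW    : the is_new flag from today's DWS row
#   OLD_FIRST : visit_date_first stored in yesterday's DWT partition (may be empty)
#   TODAY     : the date being loaded
#   FALLBACK  : placeholder date earlier than the warehouse build date
first_visit_date() {
    local in_old=$1 is_new=$2 old_first=$3 today=$4 fallback=$5
    if [ "$in_old" = 0 ] && [ "$is_new" = 1 ]; then
        echo "$today"        # genuinely new device: first visit is today
    elif [ "$in_old" = 0 ] && [ "$is_new" = 0 ]; then
        echo "$fallback"     # seen before the warehouse existed: exact date unknown
    else
        echo "$old_first"    # already tracked: keep the stored date
    fi
}

first_visit_date 0 1 "" 2020-06-14 2020-06-13           # prints 2020-06-14
first_visit_date 0 0 "" 2020-06-14 2020-06-13           # prints 2020-06-13
first_visit_date 1 0 2020-06-10 2020-06-14 2020-06-13   # prints 2020-06-10
```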
3) Query the load result
5.2 User Topic Wide Table
1) Table creation statement
DROP TABLE IF EXISTS dwt_user_topic;
CREATE EXTERNAL TABLE dwt_user_topic
(
`user_id` STRING COMMENT 'user id',
`login_date_first` STRING COMMENT 'first active date',
`login_date_last` STRING COMMENT 'last active date',
`login_date_1d_count` BIGINT COMMENT 'login count in the last 1 day',
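`login_last_1d_day_count` BIGINT COMMENT 'number of login days in the last 1 day',
`login_last_7d_count` BIGINT COMMENT 'login count in the last 7 days',
`login_last_7d_day_count` BIGINT COMMENT 'number of login days in the last 7 days',
`login_last_30d_count` BIGINT COMMENT 'login count in the last 30 days',
`login_last_30d_day_count` BIGINT COMMENT 'number of login days in the last 30 days',
`login_count` BIGINT COMMENT 'cumulative login count',
`login_day_count` BIGINT COMMENT 'cumulative number of login days',
`order_date_first` STRING COMMENT 'first order date',
`order_date_last` STRING COMMENT 'last order date',
`order_last_1d_count` BIGINT COMMENT 'order count in the last 1 day',
`order_activity_last_1d_count` BIGINT COMMENT 'count of orders participating in an activity in the last 1 day',
`order_activity_reduce_last_1d_amount` DECIMAL(16,2) COMMENT 'order discount amount (activity) in the last 1 day',
`order_coupon_last_1d_count` BIGINT COMMENT 'count of orders using a coupon in the last 1 day',
`order_coupon_reduce_last_1d_amount` DECIMAL(16,2) COMMENT 'order discount amount (coupon) in the last 1 day',
`order_last_1d_original_amount` DECIMAL(16,2) COMMENT 'original order amount in the last 1 day',
`order_last_1d_final_amount` DECIMAL(16,2) COMMENT 'final order amount in the last 1 day',
`order_last_7d_count` BIGINT COMMENT 'order count in the last 7 days',
`order_activity_last_7d_count` BIGINT COMMENT 'count of orders participating in an activity in the last 7 days',
`order_activity_reduce_last_7d_amount` DECIMAL(16,2) COMMENT 'order discount amount (activity) in the last 7 days',
`order_coupon_last_7d_count` BIGINT COMMENT 'count of orders using a coupon in the last 7 days',
`order_coupon_reduce_last_7d_amount` DECIMAL(16,2) COMMENT 'order discount amount (coupon) in the last 7 days',
`order_last_7d_original_amount` DECIMAL(16,2) COMMENT 'original order amount in the last 7 days',
`order_last_7d_final_amount` DECIMAL(16,2) COMMENT 'final order amount in the last 7 days',
`order_last_30d_count` BIGINT COMMENT 'order count in the last 30 days',
`order_activity_last_30d_count` BIGINT COMMENT 'count of orders participating in an activity in the last 30 days',
`order_activity_reduce_last_30d_amount` DECIMAL(16,2) COMMENT 'order discount amount (activity) in the last 30 days',
`order_coupon_last_30d_count` BIGINT COMMENT 'count of orders using a coupon in the last 30 days',
`order_coupon_reduce_last_30d_amount` DECIMAL(16,2) COMMENT 'order discount amount (coupon) in the last 30 days',
`order_last_30d_original_amount` DECIMAL(16,2) COMMENT 'original order amount in the last 30 days',
`order_last_30d_final_amount` DECIMAL(16,2) COMMENT 'final order amount in the last 30 days',
`order_count` BIGINT COMMENT 'cumulative order count',
`order_activity_count` BIGINT COMMENT 'cumulative count of orders participating in an activity',
`order_activity_reduce_amount` DECIMAL(16,2) COMMENT 'cumulative order discount amount (activity)',
`order_coupon_count` BIGINT COMMENT 'cumulative count of orders using a coupon',
`order_coupon_reduce_amount` DECIMAL(16,2) COMMENT 'cumulative order discount amount (coupon)',
`order_original_amount` DECIMAL(16,2) COMMENT 'cumulative original order amount',
`order_final_amount` DECIMAL(16,2) COMMENT 'cumulative final order amount',
`payment_date_first` STRING COMMENT 'first payment date',
`payment_date_last` STRING COMMENT 'last payment date',
`payment_last_1d_count` BIGINT COMMENT 'payment count in the last 1 day',
`payment_last_1d_amount` DECIMAL(16,2) COMMENT 'payment amount in the last 1 day',
`payment_last_7d_count` BIGINT COMMENT 'payment count in the last 7 days',
`payment_last_7d_amount` DECIMAL(16,2) COMMENT 'payment amount in the last 7 days',
`payment_last_30d_count` BIGINT COMMENT 'payment count in the last 30 days',
`payment_last_30d_amount` DECIMAL(16,2) COMMENT 'payment amount in the last 30 days',
`payment_count` BIGINT COMMENT 'cumulative payment count',
`payment_amount` DECIMAL(16,2) COMMENT 'cumulative payment amount',
`refund_order_last_1d_count` BIGINT COMMENT 'order refund count in the last 1 day',
`refund_order_last_1d_num` BIGINT COMMENT 'order refund item count in the last 1 day',
`refund_order_last_1d_amount` DECIMAL(16,2) COMMENT 'order refund amount in the last 1 day',
`refund_order_last_7d_count` BIGINT COMMENT 'order refund count in the last 7 days',
`refund_order_last_7d_num` BIGINT COMMENT 'order refund item count in the last 7 days',
`refund_order_last_7d_amount` DECIMAL(16,2) COMMENT 'order refund amount in the last 7 days',
`refund_order_last_30d_count` BIGINT COMMENT 'order refund count in the last 30 days',
`refund_order_last_30d_num` BIGINT COMMENT 'order refund item count in the last 30 days',
`refund_order_last_30d_amount` DECIMAL(16,2) COMMENT 'order refund amount in the last 30 days',
`refund_order_count` BIGINT COMMENT 'cumulative order refund count',
`refund_order_num` BIGINT COMMENT 'cumulative order refund item count',
`refund_order_amount` DECIMAL(16,2) COMMENT 'cumulative order refund amount',
`refund_payment_last_1d_count` BIGINT COMMENT 'refund payment count in the last 1 day',
`refund_payment_last_1d_num` BIGINT COMMENT 'refund payment item count in the last 1 day',
`refund_payment_last_1d_amount` DECIMAL(16,2) COMMENT 'refund payment amount in the last 1 day',
`refund_payment_last_7d_count` BIGINT COMMENT 'refund payment count in the last 7 days',
`refund_payment_last_7d_num` BIGINT COMMENT 'refund payment item count in the last 7 days',
`refund_payment_last_7d_amount` DECIMAL(16,2) COMMENT 'refund payment amount in the last 7 days',
`refund_payment_last_30d_count` BIGINT COMMENT 'refund payment count in the last 30 days',
`refund_payment_last_30d_num` BIGINT COMMENT 'refund payment item count in the last 30 days',
`refund_payment_last_30d_amount` DECIMAL(16,2) COMMENT 'refund payment amount in the last 30 days',
`refund_payment_count` BIGINT COMMENT 'cumulative refund payment count',
`refund_payment_num` BIGINT COMMENT 'cumulative refund payment item count',
`refund_payment_amount` DECIMAL(16,2) COMMENT 'cumulative refund payment amount',
`cart_last_1d_count` BIGINT COMMENT 'add-to-cart count in the last 1 day',
`cart_last_7d_count` BIGINT COMMENT 'add-to-cart count in the last 7 days',
`cart_last_30d_count` BIGINT COMMENT 'add-to-cart count in the last 30 days',
`cart_count` BIGINT COMMENT 'cumulative add-to-cart count',
`favor_last_1d_count` BIGINT COMMENT 'favorite count in the last 1 day',
`favor_last_7d_count` BIGINT COMMENT 'favorite count in the last 7 days',
`favor_last_30d_count` BIGINT COMMENT 'favorite count in the last 30 days',
`favor_count` BIGINT COMMENT 'cumulative favorite count',
`coupon_last_1d_get_count` BIGINT COMMENT 'coupon claim count in the last 1 day',
`coupon_last_1d_using_count` BIGINT COMMENT 'coupon use (order) count in the last 1 day',
`coupon_last_1d_used_count` BIGINT COMMENT 'coupon use (payment) count in the last 1 day',
`coupon_last_7d_get_count` BIGINT COMMENT 'coupon claim count in the last 7 days',
`coupon_last_7d_using_count` BIGINT COMMENT 'coupon use (order) count in the last 7 days',
`coupon_last_7d_used_count` BIGINT COMMENT 'coupon use (payment) count in the last 7 days',
`coupon_last_30d_get_count` BIGINT COMMENT 'coupon claim count in the last 30 days',
`coupon_last_30d_using_count` BIGINT COMMENT 'coupon use (order) count in the last 30 days',
`coupon_last_30d_used_count` BIGINT COMMENT 'coupon use (payment) count in the last 30 days',
`coupon_get_count` BIGINT COMMENT 'cumulative coupon claim count',
`coupon_using_count` BIGINT COMMENT 'cumulative coupon use (order) count',
`coupon_used_count` BIGINT COMMENT 'cumulative coupon use (payment) count',
`appraise_last_1d_good_count` BIGINT COMMENT 'positive review count in the last 1 day',
`appraise_last_1d_mid_count` BIGINT COMMENT 'neutral review count in the last 1 day',
`appraise_last_1d_bad_count` BIGINT COMMENT 'negative review count in the last 1 day',
`appraise_last_1d_default_count` BIGINT COMMENT 'default review count in the last 1 day',
`appraise_last_7d_good_count` BIGINT COMMENT 'positive review count in the last 7 days',
`appraise_last_7d_mid_count` BIGINT COMMENT 'neutral review count in the last 7 days',
`appraise_last_7d_bad_count` BIGINT COMMENT 'negative review count in the last 7 days',
`appraise_last_7d_default_count` BIGINT COMMENT 'default review count in the last 7 days',
`appraise_last_30d_good_count` BIGINT COMMENT 'positive review count in the last 30 days',
`appraise_last_30d_mid_count` BIGINT COMMENT 'neutral review count in the last 30 days',
`appraise_last_30d_bad_count` BIGINT COMMENT 'negative review count in the last 30 days',
`appraise_last_30d_default_count` BIGINT COMMENT 'default review count in the last 30 days',
`appraise_good_count` BIGINT COMMENT 'cumulative positive review count',
`appraise_mid_count` BIGINT COMMENT 'cumulative neutral review count',
`appraise_bad_count` BIGINT COMMENT 'cumulative negative review count',
`appraise_default_count` BIGINT COMMENT 'cumulative default review count'
)COMMENT 'user topic wide table'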
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_user_topic/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading

(1) First-day load
insert overwrite table dwt_user_topic partition(dt='2020-06-14')
select
id,
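login_date_first,-- use the user's creation date as the first login date
nvl(login_date_last,date_add('2020-06-14',-1)),-- if a historical login record exists, take the last login date from it, otherwise assign one fixed date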
nvl(login_last_1d_count,0),
nvl(login_last_1d_day_count,0),
nvl(login_last_7d_count,0),
nvl(login_last_7d_day_count,0),
nvl(login_last_30d_count,0),
nvl(login_last_30d_day_count,0),
nvl(login_count,0),
nvl(login_day_count,0),
order_date_first,
order_date_last,
nvl(order_last_1d_count,0),
nvl(order_activity_last_1d_count,0),
nvl(order_activity_reduce_last_1d_amount,0),
nvl(order_coupon_last_1d_count,0),
nvl(order_coupon_reduce_last_1d_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_activity_last_7d_count,0),
nvl(order_activity_reduce_last_7d_amount,0),
nvl(order_coupon_last_7d_count,0),
nvl(order_coupon_reduce_last_7d_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_activity_last_30d_count,0),
nvl(order_activity_reduce_last_30d_amount,0),
nvl(order_coupon_last_30d_count,0),
nvl(order_coupon_reduce_last_30d_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
payment_date_first,
payment_date_last,
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_num,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_num,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_num,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_num,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_num,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_num,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(cart_last_1d_count,0),
nvl(cart_last_7d_count,0),
nvl(cart_last_30d_count,0),
nvl(cart_count,0),
nvl(favor_last_1d_count,0),
nvl(favor_last_7d_count,0),
nvl(favor_last_30d_count,0),
nvl(favor_count,0),
nvl(coupon_last_1d_get_count,0),
nvl(coupon_last_1d_using_count,0),
nvl(coupon_last_1d_used_count,0),
nvl(coupon_last_7d_get_count,0),
nvl(coupon_last_7d_using_count,0),
nvl(coupon_last_7d_used_count,0),
nvl(coupon_last_30d_get_count,0),
nvl(coupon_last_30d_using_count,0),
nvl(coupon_last_30d_used_count,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_last_1d_good_count,0),
nvl(appraise_last_1d_mid_count,0),
nvl(appraise_last_1d_bad_count,0),
nvl(appraise_last_1d_default_count,0),
nvl(appraise_last_7d_good_count,0),
nvl(appraise_last_7d_mid_count,0),
nvl(appraise_last_7d_bad_count,0),
nvl(appraise_last_7d_default_count,0),
nvl(appraise_last_30d_good_count,0),
nvl(appraise_last_30d_mid_count,0),
nvl(appraise_last_30d_bad_count,0),
nvl(appraise_last_30d_default_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0)
from
(
select
id,
date_format(create_time,'yyyy-MM-dd') login_date_first
from dim_user_info
where dt='9999-99-99'
)t1
left join
(
select
user_id user_id,
max(if(login_count>0,dt,null)) login_date_last,
sum(if(dt='2020-06-14',login_count,0)) login_last_1d_count,
sum(if(dt='2020-06-14' and login_count>0,1,0)) login_last_1d_day_count,
sum(if(dt>=date_add('2020-06-14',-6),login_count,0)) login_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6) and login_count>0,1,0)) login_last_7d_day_count,
sum(if(dt>=date_add('2020-06-14',-29),login_count,0)) login_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29) and login_count>0,1,0)) login_last_30d_day_count,
sum(login_count) login_count,
sum(if(login_count>0,1,0)) login_day_count,
min(if(order_count>0,dt,null)) order_date_first,
max(if(order_count>0,dt,null)) order_date_last,
sum(if(dt='2020-06-14',order_count,0)) order_last_1d_count,
sum(if(dt='2020-06-14',order_activity_count,0)) order_activity_last_1d_count,
sum(if(dt='2020-06-14',order_activity_reduce_amount,0)) order_activity_reduce_last_1d_amount,
sum(if(dt='2020-06-14',order_coupon_count,0)) order_coupon_last_1d_count,
sum(if(dt='2020-06-14',order_coupon_reduce_amount,0)) order_coupon_reduce_last_1d_amount,
sum(if(dt='2020-06-14',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='2020-06-14',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_activity_count,0)) order_activity_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_activity_reduce_amount,0)) order_activity_reduce_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_coupon_count,0)) order_coupon_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_coupon_reduce_amount,0)) order_coupon_reduce_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_activity_count,0)) order_activity_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_activity_reduce_amount,0)) order_activity_reduce_last_30d_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_coupon_count,0)) order_coupon_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_coupon_reduce_amount,0)) order_coupon_reduce_last_30d_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_activity_count) order_activity_count,
sum(order_activity_reduce_amount) order_activity_reduce_amount,
sum(order_coupon_count) order_coupon_count,
sum(order_coupon_reduce_amount) order_coupon_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
min(if(payment_count>0,dt,null)) payment_date_first,
max(if(payment_count>0,dt,null)) payment_date_last,
sum(if(dt='2020-06-14',payment_count,0)) payment_last_1d_count,
sum(if(dt='2020-06-14',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_amount) payment_amount,
sum(if(dt='2020-06-14',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='2020-06-14',refund_order_num,0)) refund_order_last_1d_num,
sum(if(dt='2020-06-14',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_num,0)) refund_order_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_num,0)) refund_order_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_num) refund_order_num,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='2020-06-14',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='2020-06-14',refund_payment_num,0)) refund_payment_last_1d_num,
sum(if(dt='2020-06-14',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_num,0)) refund_payment_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_num,0)) refund_payment_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_num) refund_payment_num,
sum(refund_payment_amount) refund_payment_amount,
sum(if(dt='2020-06-14',cart_count,0)) cart_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),cart_count,0)) cart_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),cart_count,0)) cart_last_30d_count,
sum(cart_count) cart_count,
sum(if(dt='2020-06-14',favor_count,0)) favor_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),favor_count,0)) favor_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),favor_count,0)) favor_last_30d_count,
sum(favor_count) favor_count,
sum(if(dt='2020-06-14',coupon_get_count,0)) coupon_last_1d_get_count,
sum(if(dt='2020-06-14',coupon_using_count,0)) coupon_last_1d_using_count,
sum(if(dt='2020-06-14',coupon_used_count,0)) coupon_last_1d_used_count,
sum(if(dt>=date_add('2020-06-14',-6),coupon_get_count,0)) coupon_last_7d_get_count,
sum(if(dt>=date_add('2020-06-14',-6),coupon_using_count,0)) coupon_last_7d_using_count,
sum(if(dt>=date_add('2020-06-14',-6),coupon_used_count,0)) coupon_last_7d_used_count,
sum(if(dt>=date_add('2020-06-14',-29),coupon_get_count,0)) coupon_last_30d_get_count,
sum(if(dt>=date_add('2020-06-14',-29),coupon_using_count,0)) coupon_last_30d_using_count,
sum(if(dt>=date_add('2020-06-14',-29),coupon_used_count,0)) coupon_last_30d_used_count,
sum(coupon_get_count) coupon_get_count,
sum(coupon_using_count) coupon_using_count,
sum(coupon_used_count) coupon_used_count,
sum(if(dt='2020-06-14',appraise_good_count,0)) appraise_last_1d_good_count,
sum(if(dt='2020-06-14',appraise_mid_count,0)) appraise_last_1d_mid_count,
sum(if(dt='2020-06-14',appraise_bad_count,0)) appraise_last_1d_bad_count,
sum(if(dt='2020-06-14',appraise_default_count,0)) appraise_last_1d_default_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_good_count,0)) appraise_last_7d_good_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_mid_count,0)) appraise_last_7d_mid_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_bad_count,0)) appraise_last_7d_bad_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_default_count,0)) appraise_last_7d_default_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_good_count,0)) appraise_last_30d_good_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_mid_count,0)) appraise_last_30d_mid_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_bad_count,0)) appraise_last_30d_bad_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_default_count,0)) appraise_last_30d_default_count,
sum(appraise_good_count) appraise_good_count,
sum(appraise_mid_count) appraise_mid_count,
sum(appraise_bad_count) appraise_bad_count,
sum(appraise_default_count) appraise_default_count
from dws_user_action_daycount
group by user_id
)t2
on t1.id=t2.user_id;
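In the first-day load above, the 7-day and 30-day windows include the load date itself, which is why the filters use `date_add('2020-06-14',-6)` and `date_add('2020-06-14',-29)` rather than -7 and -30. The sketch below computes the same window start dates in the shell. It assumes GNU date, as already used by the loading scripts (`date -d "-1 day" +%F`), and `window_start` is a hypothetical helper name.

```shell
#!/bin/bash
# window_start END_DATE DAYS
# Prints the first day of a DAYS-day window that ends on END_DATE (inclusive).
# Relies on GNU date for the "N days ago" relative syntax.
window_start() {
    local end_date=$1 days=$2
    date -d "$end_date $(( days - 1 )) days ago" +%F
}

window_start 2020-06-14 7     # prints 2020-06-08
window_start 2020-06-14 30    # prints 2020-05-16
```

Counting from 2020-06-08 to 2020-06-14 inclusive gives 7 days, which matches the `dt>=date_add('2020-06-14',-6)` condition.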
(2) Daily load

insert overwrite table dwt_user_topic partition(dt='2020-06-15')
select
nvl(1d_ago.user_id,old.user_id),
nvl(old.login_date_first,'2020-06-15'),
if(1d_ago.user_id is not null,'2020-06-15',old.login_date_last),
nvl(1d_ago.login_count,0),
if(1d_ago.user_id is not null,1,0),
nvl(old.login_last_7d_count,0)+nvl(1d_ago.login_count,0)- nvl(7d_ago.login_count,0),
nvl(old.login_last_7d_day_count,0)+if(1d_ago.user_id is null,0,1)- if(7d_ago.user_id is null,0,1),
nvl(old.login_last_30d_count,0)+nvl(1d_ago.login_count,0)- nvl(30d_ago.login_count,0),
nvl(old.login_last_30d_day_count,0)+if(1d_ago.user_id is null,0,1)- if(30d_ago.user_id is null,0,1),
nvl(old.login_count,0)+nvl(1d_ago.login_count,0),
nvl(old.login_day_count,0)+if(1d_ago.user_id is not null,1,0),
if(old.order_date_first is null and 1d_ago.order_count>0, '2020-06-15', old.order_date_first),
if(1d_ago.order_count>0,'2020-06-15',old.order_date_last),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_activity_count,0),
nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(1d_ago.order_coupon_count,0),
nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_activity_last_7d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(7d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(7d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_last_7d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(7d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(7d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_activity_last_30d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(30d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(30d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_last_30d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(30d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(30d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_activity_count,0)+nvl(1d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_count,0)+nvl(1d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
if(old.payment_date_first is null and 1d_ago.payment_count>0, '2020-06-15', old.payment_date_first),
if(1d_ago.payment_count>0,'2020-06-15',old.payment_date_last),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)-nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)-nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)-nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_num,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_num,0)+nvl(1d_ago.refund_order_num, 0)- nvl(7d_ago.refund_order_num,0),
nvl(old.refund_order_last_7d_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_num,0)+nvl(1d_ago.refund_order_num, 0)- nvl(30d_ago.refund_order_num,0),
nvl(old.refund_order_last_30d_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_num,0)+nvl(1d_ago.refund_order_num,0),
nvl(old.refund_order_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_num,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)-nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(7d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_7d_amount,0.0)+ nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)-nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(30d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_30d_amount,0.0)+ nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_num,0)+nvl(1d_ago.refund_payment_num,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0),
nvl(1d_ago.cart_count,0),
nvl(old.cart_last_7d_count,0)+nvl(1d_ago.cart_count,0)-nvl(7d_ago.cart_count,0),
nvl(old.cart_last_30d_count,0)+nvl(1d_ago.cart_count,0)-nvl(30d_ago.cart_count,0),
nvl(old.cart_count,0)+nvl(1d_ago.cart_count,0),
nvl(1d_ago.favor_count,0),
nvl(old.favor_last_7d_count,0)+nvl(1d_ago.favor_count,0)- nvl(7d_ago.favor_count,0),
nvl(old.favor_last_30d_count,0)+nvl(1d_ago.favor_count,0)- nvl(30d_ago.favor_count,0),
nvl(old.favor_count,0)+nvl(1d_ago.favor_count,0),
nvl(1d_ago.coupon_get_count,0),
nvl(1d_ago.coupon_using_count,0),
nvl(1d_ago.coupon_used_count,0),
nvl(old.coupon_last_7d_get_count,0)+nvl(1d_ago.coupon_get_count,0)- nvl(7d_ago.coupon_get_count,0),
nvl(old.coupon_last_7d_using_count,0)+nvl(1d_ago.coupon_using_count,0)- nvl(7d_ago.coupon_using_count,0),
nvl(old.coupon_last_7d_used_count,0)+ nvl(1d_ago.coupon_used_count,0)- nvl(7d_ago.coupon_used_count,0),
nvl(old.coupon_last_30d_get_count,0)+nvl(1d_ago.coupon_get_count,0)- nvl(30d_ago.coupon_get_count,0),
nvl(old.coupon_last_30d_using_count,0)+nvl(1d_ago.coupon_using_count,0)- nvl(30d_ago.coupon_using_count,0),
nvl(old.coupon_last_30d_used_count,0)+ nvl(1d_ago.coupon_used_count,0)- nvl(30d_ago.coupon_used_count,0),
nvl(old.coupon_get_count,0)+nvl(1d_ago.coupon_get_count,0),
nvl(old.coupon_using_count,0)+nvl(1d_ago.coupon_using_count,0),
nvl(old.coupon_used_count,0)+nvl(1d_ago.coupon_used_count,0),
nvl(1d_ago.appraise_good_count,0),
nvl(1d_ago.appraise_mid_count,0),
nvl(1d_ago.appraise_bad_count,0),
nvl(1d_ago.appraise_default_count,0),
nvl(old.appraise_last_7d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(7d_ago.appraise_good_count,0),
nvl(old.appraise_last_7d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)-nvl(7d_ago.appraise_mid_count,0),
nvl(old.appraise_last_7d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)-nvl(7d_ago.appraise_bad_count,0),
nvl(old.appraise_last_7d_default_count,0)+nvl(1d_ago.appraise_default_count,0)-nvl(7d_ago.appraise_default_count,0),
nvl(old.appraise_last_30d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(30d_ago.appraise_good_count,0),
nvl(old.appraise_last_30d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)-nvl(30d_ago.appraise_mid_count,0),
nvl(old.appraise_last_30d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)-nvl(30d_ago.appraise_bad_count,0),
nvl(old.appraise_last_30d_default_count,0)+nvl(1d_ago.appraise_default_count,0)-nvl(30d_ago.appraise_default_count,0),
nvl(old.appraise_good_count,0)+nvl(1d_ago.appraise_good_count,0),
nvl(old.appraise_mid_count,0)+nvl(1d_ago.appraise_mid_count, 0),
nvl(old.appraise_bad_count,0)+nvl(1d_ago.appraise_bad_count,0),
nvl(old.appraise_default_count,0)+nvl(1d_ago.appraise_default_count,0)
from
(
select
user_id,
login_date_first,
login_date_last,
login_date_1d_count,
login_last_1d_day_count,
login_last_7d_count,
login_last_7d_day_count,
login_last_30d_count,
login_last_30d_day_count,
login_count,
login_day_count,
order_date_first,
order_date_last,
order_last_1d_count,
order_activity_last_1d_count,
order_activity_reduce_last_1d_amount,
order_coupon_last_1d_count,
order_coupon_reduce_last_1d_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_activity_last_7d_count,
order_activity_reduce_last_7d_amount,
order_coupon_last_7d_count,
order_coupon_reduce_last_7d_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_activity_last_30d_count,
order_activity_reduce_last_30d_amount,
order_coupon_last_30d_count,
order_coupon_reduce_last_30d_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_date_first,
payment_date_last,
payment_last_1d_count,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_amount,
payment_count,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_num,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_num,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_num,
refund_order_last_30d_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_num,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_num,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_num,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_last_1d_count,
cart_last_7d_count,
cart_last_30d_count,
cart_count,
favor_last_1d_count,
favor_last_7d_count,
favor_last_30d_count,
favor_count,
coupon_last_1d_get_count,
coupon_last_1d_using_count,
coupon_last_1d_used_count,
coupon_last_7d_get_count,
coupon_last_7d_using_count,
coupon_last_7d_used_count,
coupon_last_30d_get_count,
coupon_last_30d_using_count,
coupon_last_30d_used_count,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_last_1d_good_count,
appraise_last_1d_mid_count,
appraise_last_1d_bad_count,
appraise_last_1d_default_count,
appraise_last_7d_good_count,
appraise_last_7d_mid_count,
appraise_last_7d_bad_count,
appraise_last_7d_default_count,
appraise_last_30d_good_count,
appraise_last_30d_mid_count,
appraise_last_30d_bad_count,
appraise_last_30d_default_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dwt_user_topic
where dt=date_add('2020-06-15',-1)
)old
full outer join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_user_action_daycount
where dt='2020-06-15'
)1d_ago
on old.user_id=1d_ago.user_id
left join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_user_action_daycount
where dt=date_add('2020-06-15',-7)
)7d_ago
on old.user_id=7d_ago.user_id
left join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_user_action_daycount
where dt=date_add('2020-06-15',-30)
)30d_ago
on old.user_id=30d_ago.user_id;
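The daily load above reads four partitions: yesterday's snapshot of dwt_user_topic (old) and three daily partitions of dws_user_action_daycount (1d_ago, 7d_ago, 30d_ago). The following Python sketch is not part of the warehouse code; it only reproduces the date arithmetic of the `date_add` calls so the partitions can be checked by hand. The function name `daily_load_partitions` is hypothetical.

```python
from datetime import date, timedelta


def daily_load_partitions(do_date: str) -> dict:
    """Return the dt value each subquery of the daily load filters on.

    Keys follow the subquery aliases used in the SQL above.
    """
    day = date.fromisoformat(do_date)
    return {
        # dwt snapshot written by the previous run: date_add(do_date, -1)
        "old": (day - timedelta(days=1)).isoformat(),
        # dws partition of the load date itself
        "1d_ago": day.isoformat(),
        # the day that drops out of the 7-day window: date_add(do_date, -7)
        "7d_ago": (day - timedelta(days=7)).isoformat(),
        # the day that drops out of the 30-day window: date_add(do_date, -30)
        "30d_ago": (day - timedelta(days=30)).isoformat(),
    }


partitions = daily_load_partitions("2020-06-15")
for alias, dt in partitions.items():
    print(alias, dt)
```

For a load date of 2020-06-15, the old 7-day window covers 2020-06-08 through 2020-06-14, so 2020-06-08 is the day to subtract when the window advances by one day.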
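3) Query the load result
5.3 Product Topic Wide Table
The product topic wide table differs slightly from the member topic wide table: the first and last purchase times of a product carry little analytical value, so the focus is on the cumulative measures and cumulative counts of the various fact behaviors.
1) Table creation statement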
DROP TABLE IF EXISTS dwt_sku_topic;
CREATE EXTERNAL TABLE dwt_sku_topic
(
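`sku_id` STRING COMMENT 'sku_id',
`order_last_1d_count` BIGINT COMMENT 'Number of orders in the last 1 day',
`order_last_1d_num` BIGINT COMMENT 'Number of items ordered in the last 1 day',
`order_activity_last_1d_count` BIGINT COMMENT 'Number of orders with an activity in the last 1 day',
`order_coupon_last_1d_count` BIGINT COMMENT 'Number of orders using a coupon in the last 1 day',
`order_activity_reduce_last_1d_amount` DECIMAL(16,2) COMMENT 'Discount amount (activity) in the last 1 day',
`order_coupon_reduce_last_1d_amount` DECIMAL(16,2) COMMENT 'Discount amount (coupon) in the last 1 day',
`order_last_1d_original_amount` DECIMAL(16,2) COMMENT 'Original order amount in the last 1 day',
`order_last_1d_final_amount` DECIMAL(16,2) COMMENT 'Final order amount in the last 1 day',
`order_last_7d_count` BIGINT COMMENT 'Number of orders in the last 7 days',
`order_last_7d_num` BIGINT COMMENT 'Number of items ordered in the last 7 days',
`order_activity_last_7d_count` BIGINT COMMENT 'Number of orders with an activity in the last 7 days',
`order_coupon_last_7d_count` BIGINT COMMENT 'Number of orders using a coupon in the last 7 days',
`order_activity_reduce_last_7d_amount` DECIMAL(16,2) COMMENT 'Discount amount (activity) in the last 7 days',
`order_coupon_reduce_last_7d_amount` DECIMAL(16,2) COMMENT 'Discount amount (coupon) in the last 7 days',
`order_last_7d_original_amount` DECIMAL(16,2) COMMENT 'Original order amount in the last 7 days',
`order_last_7d_final_amount` DECIMAL(16,2) COMMENT 'Final order amount in the last 7 days',
`order_last_30d_count` BIGINT COMMENT 'Number of orders in the last 30 days',
`order_last_30d_num` BIGINT COMMENT 'Number of items ordered in the last 30 days',
`order_activity_last_30d_count` BIGINT COMMENT 'Number of orders with an activity in the last 30 days',
`order_coupon_last_30d_count` BIGINT COMMENT 'Number of orders using a coupon in the last 30 days',
`order_activity_reduce_last_30d_amount` DECIMAL(16,2) COMMENT 'Discount amount (activity) in the last 30 days',
`order_coupon_reduce_last_30d_amount` DECIMAL(16,2) COMMENT 'Discount amount (coupon) in the last 30 days',
`order_last_30d_original_amount` DECIMAL(16,2) COMMENT 'Original order amount in the last 30 days',
`order_last_30d_final_amount` DECIMAL(16,2) COMMENT 'Final order amount in the last 30 days',
`order_count` BIGINT COMMENT 'Cumulative number of orders',
`order_num` BIGINT COMMENT 'Cumulative number of items ordered',
`order_activity_count` BIGINT COMMENT 'Cumulative number of orders with an activity',
`order_coupon_count` BIGINT COMMENT 'Cumulative number of orders using a coupon',
`order_activity_reduce_amount` DECIMAL(16,2) COMMENT 'Cumulative discount amount (activity)',
`order_coupon_reduce_amount` DECIMAL(16,2) COMMENT 'Cumulative discount amount (coupon)',
`order_original_amount` DECIMAL(16,2) COMMENT 'Cumulative original order amount',
`order_final_amount` DECIMAL(16,2) COMMENT 'Cumulative final order amount',
`payment_last_1d_count` BIGINT COMMENT 'Number of payments in the last 1 day',
`payment_last_1d_num` BIGINT COMMENT 'Number of items paid for in the last 1 day',
`payment_last_1d_amount` DECIMAL(16,2) COMMENT 'Payment amount in the last 1 day',
`payment_last_7d_count` BIGINT COMMENT 'Number of payments in the last 7 days',
`payment_last_7d_num` BIGINT COMMENT 'Number of items paid for in the last 7 days',
`payment_last_7d_amount` DECIMAL(16,2) COMMENT 'Payment amount in the last 7 days',
`payment_last_30d_count` BIGINT COMMENT 'Number of payments in the last 30 days',
`payment_last_30d_num` BIGINT COMMENT 'Number of items paid for in the last 30 days',
`payment_last_30d_amount` DECIMAL(16,2) COMMENT 'Payment amount in the last 30 days',
`payment_count` BIGINT COMMENT 'Cumulative number of payments',
`payment_num` BIGINT COMMENT 'Cumulative number of items paid for',
`payment_amount` DECIMAL(16,2) COMMENT 'Cumulative payment amount',
`refund_order_last_1d_count` BIGINT COMMENT 'Number of order refunds in the last 1 day',
`refund_order_last_1d_num` BIGINT COMMENT 'Number of items in order refunds in the last 1 day',
`refund_order_last_1d_amount` DECIMAL(16,2) COMMENT 'Order refund amount in the last 1 day',
`refund_order_last_7d_count` BIGINT COMMENT 'Number of order refunds in the last 7 days',
`refund_order_last_7d_num` BIGINT COMMENT 'Number of items in order refunds in the last 7 days',
`refund_order_last_7d_amount` DECIMAL(16,2) COMMENT 'Order refund amount in the last 7 days',
`refund_order_last_30d_count` BIGINT COMMENT 'Number of order refunds in the last 30 days',
`refund_order_last_30d_num` BIGINT COMMENT 'Number of items in order refunds in the last 30 days',
`refund_order_last_30d_amount` DECIMAL(16,2) COMMENT 'Order refund amount in the last 30 days',
`refund_order_count` BIGINT COMMENT 'Cumulative number of order refunds',
`refund_order_num` BIGINT COMMENT 'Cumulative number of items in order refunds',
`refund_order_amount` DECIMAL(16,2) COMMENT 'Cumulative order refund amount',
`refund_payment_last_1d_count` BIGINT COMMENT 'Number of payment refunds in the last 1 day',
`refund_payment_last_1d_num` BIGINT COMMENT 'Number of items in payment refunds in the last 1 day',
`refund_payment_last_1d_amount` DECIMAL(16,2) COMMENT 'Payment refund amount in the last 1 day',
`refund_payment_last_7d_count` BIGINT COMMENT 'Number of payment refunds in the last 7 days',
`refund_payment_last_7d_num` BIGINT COMMENT 'Number of items in payment refunds in the last 7 days',
`refund_payment_last_7d_amount` DECIMAL(16,2) COMMENT 'Payment refund amount in the last 7 days',
`refund_payment_last_30d_count` BIGINT COMMENT 'Number of payment refunds in the last 30 days',
`refund_payment_last_30d_num` BIGINT COMMENT 'Number of items in payment refunds in the last 30 days',
`refund_payment_last_30d_amount` DECIMAL(16,2) COMMENT 'Payment refund amount in the last 30 days',
`refund_payment_count` BIGINT COMMENT 'Cumulative number of payment refunds',
`refund_payment_num` BIGINT COMMENT 'Cumulative number of items in payment refunds',
`refund_payment_amount` DECIMAL(16,2) COMMENT 'Cumulative payment refund amount',
`cart_last_1d_count` BIGINT COMMENT 'Number of times added to cart in the last 1 day',
`cart_last_7d_count` BIGINT COMMENT 'Number of times added to cart in the last 7 days',
`cart_last_30d_count` BIGINT COMMENT 'Number of times added to cart in the last 30 days',
`cart_count` BIGINT COMMENT 'Cumulative number of times added to cart',
`favor_last_1d_count` BIGINT COMMENT 'Number of times favorited in the last 1 day',
`favor_last_7d_count` BIGINT COMMENT 'Number of times favorited in the last 7 days',
`favor_last_30d_count` BIGINT COMMENT 'Number of times favorited in the last 30 days',
`favor_count` BIGINT COMMENT 'Cumulative number of times favorited',
`appraise_last_1d_good_count` BIGINT COMMENT 'Number of positive reviews in the last 1 day',
`appraise_last_1d_mid_count` BIGINT COMMENT 'Number of neutral reviews in the last 1 day',
`appraise_last_1d_bad_count` BIGINT COMMENT 'Number of negative reviews in the last 1 day',
`appraise_last_1d_default_count` BIGINT COMMENT 'Number of default reviews in the last 1 day',
`appraise_last_7d_good_count` BIGINT COMMENT 'Number of positive reviews in the last 7 days',
`appraise_last_7d_mid_count` BIGINT COMMENT 'Number of neutral reviews in the last 7 days',
`appraise_last_7d_bad_count` BIGINT COMMENT 'Number of negative reviews in the last 7 days',
`appraise_last_7d_default_count` BIGINT COMMENT 'Number of default reviews in the last 7 days',
`appraise_last_30d_good_count` BIGINT COMMENT 'Number of positive reviews in the last 30 days',
`appraise_last_30d_mid_count` BIGINT COMMENT 'Number of neutral reviews in the last 30 days',
`appraise_last_30d_bad_count` BIGINT COMMENT 'Number of negative reviews in the last 30 days',
`appraise_last_30d_default_count` BIGINT COMMENT 'Number of default reviews in the last 30 days',
`appraise_good_count` BIGINT COMMENT 'Cumulative number of positive reviews',
`appraise_mid_count` BIGINT COMMENT 'Cumulative number of neutral reviews',
`appraise_bad_count` BIGINT COMMENT 'Cumulative number of negative reviews',
`appraise_default_count` BIGINT COMMENT 'Cumulative number of default reviews'
) COMMENT 'Product topic wide table'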
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_sku_topic/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
insert overwrite table dwt_sku_topic partition(dt='2020-06-14')
select
id,
nvl(order_last_1d_count,0),
nvl(order_last_1d_num,0),
nvl(order_activity_last_1d_count,0),
nvl(order_coupon_last_1d_count,0),
nvl(order_activity_reduce_last_1d_amount,0),
nvl(order_coupon_reduce_last_1d_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_num,0),
nvl(order_activity_last_7d_count,0),
nvl(order_coupon_last_7d_count,0),
nvl(order_activity_reduce_last_7d_amount,0),
nvl(order_coupon_reduce_last_7d_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_num,0),
nvl(order_activity_last_30d_count,0),
nvl(order_coupon_last_30d_count,0),
nvl(order_activity_reduce_last_30d_amount,0),
nvl(order_coupon_reduce_last_30d_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_num,0),
nvl(order_activity_count,0),
nvl(order_coupon_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_num,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_num,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_num,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_num,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_num,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_num,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_num,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_num,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_num,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_num,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(cart_last_1d_count,0),
nvl(cart_last_7d_count,0),
nvl(cart_last_30d_count,0),
nvl(cart_count,0),
nvl(favor_last_1d_count,0),
nvl(favor_last_7d_count,0),
nvl(favor_last_30d_count,0),
nvl(favor_count,0),
nvl(appraise_last_1d_good_count,0),
nvl(appraise_last_1d_mid_count,0),
nvl(appraise_last_1d_bad_count,0),
nvl(appraise_last_1d_default_count,0),
nvl(appraise_last_7d_good_count,0),
nvl(appraise_last_7d_mid_count,0),
nvl(appraise_last_7d_bad_count,0),
nvl(appraise_last_7d_default_count,0),
nvl(appraise_last_30d_good_count,0),
nvl(appraise_last_30d_mid_count,0),
nvl(appraise_last_30d_bad_count,0),
nvl(appraise_last_30d_default_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0)
from
(
select
id
from dim_sku_info
where dt='2020-06-14'
)t1
left join
(
select
sku_id,
sum(if(dt='2020-06-14',order_count,0)) order_last_1d_count,
sum(if(dt='2020-06-14',order_num,0)) order_last_1d_num,
sum(if(dt='2020-06-14',order_activity_count,0)) order_activity_last_1d_count,
sum(if(dt='2020-06-14',order_coupon_count,0)) order_coupon_last_1d_count,
sum(if(dt='2020-06-14',order_activity_reduce_amount,0)) order_activity_reduce_last_1d_amount,
sum(if(dt='2020-06-14',order_coupon_reduce_amount,0)) order_coupon_reduce_last_1d_amount,
sum(if(dt='2020-06-14',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='2020-06-14',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_num,0)) order_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),order_activity_count,0)) order_activity_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_coupon_count,0)) order_coupon_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_activity_reduce_amount,0)) order_activity_reduce_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_coupon_reduce_amount,0)) order_coupon_reduce_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_num,0)) order_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),order_activity_count,0)) order_activity_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_coupon_count,0)) order_coupon_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_activity_reduce_amount,0)) order_activity_reduce_last_30d_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_coupon_reduce_amount,0)) order_coupon_reduce_last_30d_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_num) order_num,
sum(order_activity_count) order_activity_count,
sum(order_coupon_count) order_coupon_count,
sum(order_activity_reduce_amount) order_activity_reduce_amount,
sum(order_coupon_reduce_amount) order_coupon_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='2020-06-14',payment_count,0)) payment_last_1d_count,
sum(if(dt='2020-06-14',payment_num,0)) payment_last_1d_num,
sum(if(dt='2020-06-14',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),payment_num,0)) payment_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),payment_num,0)) payment_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_num) payment_num,
sum(payment_amount) payment_amount,
sum(if(dt='2020-06-14',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='2020-06-14',refund_order_num,0)) refund_order_last_1d_num,
sum(if(dt='2020-06-14',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_num,0)) refund_order_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_num,0)) refund_order_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_num) refund_order_num,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='2020-06-14',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='2020-06-14',refund_payment_num,0)) refund_payment_last_1d_num,
sum(if(dt='2020-06-14',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_num,0)) refund_payment_last_7d_num,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_num,0)) refund_payment_last_30d_num,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_num) refund_payment_num,
sum(refund_payment_amount) refund_payment_amount,
sum(if(dt='2020-06-14',cart_count,0)) cart_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),cart_count,0)) cart_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),cart_count,0)) cart_last_30d_count,
sum(cart_count) cart_count,
sum(if(dt='2020-06-14',favor_count,0)) favor_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),favor_count,0)) favor_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),favor_count,0)) favor_last_30d_count,
sum(favor_count) favor_count,
sum(if(dt='2020-06-14',appraise_good_count,0)) appraise_last_1d_good_count,
sum(if(dt='2020-06-14',appraise_mid_count,0)) appraise_last_1d_mid_count,
sum(if(dt='2020-06-14',appraise_bad_count,0)) appraise_last_1d_bad_count,
sum(if(dt='2020-06-14',appraise_default_count,0)) appraise_last_1d_default_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_good_count,0)) appraise_last_7d_good_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_mid_count,0)) appraise_last_7d_mid_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_bad_count,0)) appraise_last_7d_bad_count,
sum(if(dt>=date_add('2020-06-14',-6),appraise_default_count,0)) appraise_last_7d_default_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_good_count,0)) appraise_last_30d_good_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_mid_count,0)) appraise_last_30d_mid_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_bad_count,0)) appraise_last_30d_bad_count,
sum(if(dt>=date_add('2020-06-14',-29),appraise_default_count,0)) appraise_last_30d_default_count,
sum(appraise_good_count) appraise_good_count,
sum(appraise_mid_count) appraise_mid_count,
sum(appraise_bad_count) appraise_bad_count,
sum(appraise_default_count) appraise_default_count
from dws_sku_action_daycount
group by sku_id
)t2
on t1.id=t2.sku_id;
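The first-day load computes every window in a single pass over dws_sku_action_daycount by conditional aggregation: `sum(if(dt>=date_add('2020-06-14',-6),x,0))` keeps only the rows inside the 7-day window, and a plain `sum(x)` gives the cumulative value. The Python sketch below mirrors that logic for one SKU and one metric. It is an illustration only; the function name `first_day_metrics` and the sample rows are made up.

```python
from datetime import date, timedelta


def first_day_metrics(daily_rows, load_date):
    """Aggregate (dt, order_count) rows of one SKU the way the first-day load does."""
    day = date.fromisoformat(load_date)
    # date_add(load_date, -6) and date_add(load_date, -29): inclusive window starts
    start_7d = (day - timedelta(days=6)).isoformat()
    start_30d = (day - timedelta(days=29)).isoformat()
    # ISO yyyy-MM-dd strings compare in date order, like the dt partition column
    return {
        "order_last_1d_count": sum(c for dt, c in daily_rows if dt == load_date),
        "order_last_7d_count": sum(c for dt, c in daily_rows if dt >= start_7d),
        "order_last_30d_count": sum(c for dt, c in daily_rows if dt >= start_30d),
        "order_count": sum(c for _, c in daily_rows),
    }


# Hypothetical daily rows for a single SKU
rows = [("2020-05-10", 4), ("2020-05-20", 3), ("2020-06-08", 2), ("2020-06-14", 1)]
print(first_day_metrics(rows, "2020-06-14"))
```

With these rows the 7-day window starts on 2020-06-08 and the 30-day window on 2020-05-16, so the row from 2020-05-10 counts only toward the cumulative total.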
(2) Daily load
insert overwrite table dwt_sku_topic partition(dt='2020-06-15')
select
nvl(1d_ago.sku_id,old.sku_id),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_num,0),
nvl(1d_ago.order_activity_count,0),
nvl(1d_ago.order_coupon_count,0),
nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_num,0)+nvl(1d_ago.order_num,0)- nvl(7d_ago.order_num,0),
nvl(old.order_activity_last_7d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(7d_ago.order_activity_count,0),
nvl(old.order_coupon_last_7d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(7d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(7d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(7d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_num,0)+nvl(1d_ago.order_num,0)- nvl(30d_ago.order_num,0),
nvl(old.order_activity_last_30d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(30d_ago.order_activity_count,0),
nvl(old.order_coupon_last_30d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(30d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(30d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(30d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_num,0)+nvl(1d_ago.order_num,0),
nvl(old.order_activity_count,0)+nvl(1d_ago.order_activity_count,0),
nvl(old.order_coupon_count,0)+nvl(1d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_num,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_num,0)+nvl(1d_ago.payment_num,0)- nvl(7d_ago.payment_num,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_num,0)+nvl(1d_ago.payment_num,0)- nvl(30d_ago.payment_num,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_num,0)+nvl(1d_ago.payment_num,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_num,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_num,0)+nvl(1d_ago.refund_order_num,0)- nvl(7d_ago.refund_order_num,0),
nvl(old.refund_order_last_7d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_num,0)+nvl(1d_ago.refund_order_num,0)- nvl(30d_ago.refund_order_num,0),
nvl(old.refund_order_last_30d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_num,0)+nvl(1d_ago.refund_order_num,0),
nvl(old.refund_order_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_num,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(7d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_7d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(30d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_30d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_num,0)+nvl(1d_ago.refund_payment_num,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0),
nvl(1d_ago.cart_count,0),
nvl(old.cart_last_7d_count,0)+nvl(1d_ago.cart_count,0)- nvl(7d_ago.cart_count,0),
nvl(old.cart_last_30d_count,0)+nvl(1d_ago.cart_count,0)- nvl(30d_ago.cart_count,0),
nvl(old.cart_count,0)+nvl(1d_ago.cart_count,0),
nvl(1d_ago.favor_count,0),
nvl(old.favor_last_7d_count,0)+nvl(1d_ago.favor_count,0)- nvl(7d_ago.favor_count,0),
nvl(old.favor_last_30d_count,0)+nvl(1d_ago.favor_count,0)- nvl(30d_ago.favor_count,0),
nvl(old.favor_count,0)+nvl(1d_ago.favor_count,0),
nvl(1d_ago.appraise_good_count,0),
nvl(1d_ago.appraise_mid_count,0),
nvl(1d_ago.appraise_bad_count,0),
nvl(1d_ago.appraise_default_count,0),
nvl(old.appraise_last_7d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(7d_ago.appraise_good_count,0),
nvl(old.appraise_last_7d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)- nvl(7d_ago.appraise_mid_count,0),
nvl(old.appraise_last_7d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)- nvl(7d_ago.appraise_bad_count,0),
nvl(old.appraise_last_7d_default_count,0)+nvl(1d_ago.appraise_default_count,0)- nvl(7d_ago.appraise_default_count,0),
nvl(old.appraise_last_30d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(30d_ago.appraise_good_count,0),
nvl(old.appraise_last_30d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)- nvl(30d_ago.appraise_mid_count,0),
nvl(old.appraise_last_30d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)- nvl(30d_ago.appraise_bad_count,0),
nvl(old.appraise_last_30d_default_count,0)+nvl(1d_ago.appraise_default_count,0)- nvl(30d_ago.appraise_default_count,0),
nvl(old.appraise_good_count,0)+nvl(1d_ago.appraise_good_count,0),
nvl(old.appraise_mid_count,0)+nvl(1d_ago.appraise_mid_count,0),
nvl(old.appraise_bad_count,0)+nvl(1d_ago.appraise_bad_count,0),
nvl(old.appraise_default_count,0)+nvl(1d_ago.appraise_default_count,0)
from
(
select
sku_id,
order_last_1d_count,
order_last_1d_num,
order_activity_last_1d_count,
order_coupon_last_1d_count,
order_activity_reduce_last_1d_amount,
order_coupon_reduce_last_1d_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_num,
order_activity_last_7d_count,
order_coupon_last_7d_count,
order_activity_reduce_last_7d_amount,
order_coupon_reduce_last_7d_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_num,
order_activity_last_30d_count,
order_coupon_last_30d_count,
order_activity_reduce_last_30d_amount,
order_coupon_reduce_last_30d_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_num,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_num,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_num,
payment_last_30d_amount,
payment_count,
payment_num,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_num,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_num,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_num,
refund_order_last_30d_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_num,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_num,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_num,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_last_1d_count,
cart_last_7d_count,
cart_last_30d_count,
cart_count,
favor_last_1d_count,
favor_last_7d_count,
favor_last_30d_count,
favor_count,
appraise_last_1d_good_count,
appraise_last_1d_mid_count,
appraise_last_1d_bad_count,
appraise_last_1d_default_count,
appraise_last_7d_good_count,
appraise_last_7d_mid_count,
appraise_last_7d_bad_count,
appraise_last_7d_default_count,
appraise_last_30d_good_count,
appraise_last_30d_mid_count,
appraise_last_30d_bad_count,
appraise_last_30d_default_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dwt_sku_topic
where dt=date_add('2020-06-15',-1)
)old
full outer join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_sku_action_daycount
where dt='2020-06-15'
)1d_ago
on old.sku_id=1d_ago.sku_id
left join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_sku_action_daycount
where dt=date_add('2020-06-15',-7)
)7d_ago
on old.sku_id=7d_ago.sku_id
left join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from dws_sku_action_daycount
where dt=date_add('2020-06-15',-30)
)30d_ago
on old.sku_id=30d_ago.sku_id;
3) Query the load results
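A minimal sketch of how the loaded partition could be inspected from the shell. The function name `build_check_query` and the `limit 5` are illustrative assumptions; the database name `gmall` and the table and date come from the load statements above.

```shell
#!/bin/bash
# Hypothetical helper: builds a HiveQL query that samples one DWT partition.
build_check_query() {
    local table=$1      # DWT table name, e.g. dwt_sku_topic
    local do_date=$2    # partition date in yyyy-MM-dd form
    echo "select * from gmall.${table} where dt='${do_date}' limit 5;"
}

sql=$(build_check_query dwt_sku_topic 2020-06-15)
echo "$sql"

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
    hive -e "$sql"
fi
```

The same helper can be pointed at any other DWT table by changing the first argument.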
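5.4 Coupon Topic Wide Table
The coupon topic wide table records, for each coupon, how often it was claimed, used in an order, used in a payment, and expired, together with the related amounts. Each metric is kept as a cumulative total and as a count over the last 1, 7, and 30 days.
1) Table creation statement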
DROP TABLE IF EXISTS dwt_coupon_topic;
CREATE EXTERNAL TABLE dwt_coupon_topic(
`coupon_id` STRING COMMENT '優惠券ID',
`get_last_1d_count` BIGINT COMMENT '最近1日領取次數',
`get_last_7d_count` BIGINT COMMENT '最近7日領取次數',
`get_last_30d_count` BIGINT COMMENT '最近30日領取次數',
`get_count` BIGINT COMMENT '累積領取次數',
`order_last_1d_count` BIGINT COMMENT '最近1日使用某券下單次數',
`order_last_1d_reduce_amount` DECIMAL(16,2) COMMENT '最近1日使用某券下單優惠金額',
`order_last_1d_original_amount` DECIMAL(16,2) COMMENT '最近1日使用某券下單原始金額',
`order_last_1d_final_amount` DECIMAL(16,2) COMMENT '最近1日使用某券下單最終金額',
`order_last_7d_count` BIGINT COMMENT '最近7日使用某券下單次數',
`order_last_7d_reduce_amount` DECIMAL(16,2) COMMENT '最近7日使用某券下單優惠金額',
`order_last_7d_original_amount` DECIMAL(16,2) COMMENT '最近7日使用某券下單原始金額',
`order_last_7d_final_amount` DECIMAL(16,2) COMMENT '最近7日使用某券下單最終金額',
`order_last_30d_count` BIGINT COMMENT '最近30日使用某券下單次數',
`order_last_30d_reduce_amount` DECIMAL(16,2) COMMENT '最近30日使用某券下單優惠金額',
`order_last_30d_original_amount` DECIMAL(16,2) COMMENT '最近30日使用某券下單原始金額',
`order_last_30d_final_amount` DECIMAL(16,2) COMMENT '最近30日使用某券下單最終金額',
`order_count` BIGINT COMMENT '累積使用(下單)次數',
`order_reduce_amount` DECIMAL(16,2) COMMENT '使用某券累積下單優惠金額',
`order_original_amount` DECIMAL(16,2) COMMENT '使用某券累積下單原始金額',
`order_final_amount` DECIMAL(16,2) COMMENT '使用某券累積下單最終金額',
`payment_last_1d_count` BIGINT COMMENT '最近1日使用某券支付次數',
`payment_last_1d_reduce_amount` DECIMAL(16,2) COMMENT '最近1日使用某券優惠金額',
`payment_last_1d_amount` DECIMAL(16,2) COMMENT '最近1日使用某券支付金額',
`payment_last_7d_count` BIGINT COMMENT '最近7日使用某券支付次數',
`payment_last_7d_reduce_amount` DECIMAL(16,2) COMMENT '最近7日使用某券優惠金額',
`payment_last_7d_amount` DECIMAL(16,2) COMMENT '最近7日使用某券支付金額',
`payment_last_30d_count` BIGINT COMMENT '最近30日使用某券支付次數',
`payment_last_30d_reduce_amount` DECIMAL(16,2) COMMENT '最近30日使用某券優惠金額',
`payment_last_30d_amount` DECIMAL(16,2) COMMENT '最近30日使用某券支付金額',
`payment_count` BIGINT COMMENT '累積使用(支付)次數',
`payment_reduce_amount` DECIMAL(16,2) COMMENT '使用某券累積優惠金額',
`payment_amount` DECIMAL(16,2) COMMENT '使用某券累積支付金額',
`expire_last_1d_count` BIGINT COMMENT '最近1日過期次數',
`expire_last_7d_count` BIGINT COMMENT '最近7日過期次數',
`expire_last_30d_count` BIGINT COMMENT '最近30日過期次數',
`expire_count` BIGINT COMMENT '累積過期次數'
)comment '優惠券主題表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_coupon_topic/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
insert overwrite table dwt_coupon_topic partition(dt='2020-06-14')
select
id,
nvl(get_last_1d_count,0),
nvl(get_last_7d_count,0),
nvl(get_last_30d_count,0),
nvl(get_count,0),
nvl(order_last_1d_count,0),
nvl(order_last_1d_reduce_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_reduce_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_reduce_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_reduce_amount,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_reduce_amount,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_reduce_amount,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_reduce_amount,0),
nvl(payment_amount,0),
nvl(expire_last_1d_count,0),
nvl(expire_last_7d_count,0),
nvl(expire_last_30d_count,0),
nvl(expire_count,0)
from
(
select
id
from dim_coupon_info
where dt='2020-06-14'
)t1
left join
(
select
coupon_id coupon_id,
sum(if(dt='2020-06-14',get_count,0)) get_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),get_count,0)) get_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),get_count,0)) get_last_30d_count,
sum(get_count) get_count,
sum(if(dt='2020-06-14',order_count,0)) order_last_1d_count,
sum(if(dt='2020-06-14',order_reduce_amount,0)) order_last_1d_reduce_amount,
sum(if(dt='2020-06-14',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='2020-06-14',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_reduce_amount,0)) order_last_7d_reduce_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_reduce_amount,0)) order_last_30d_reduce_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_reduce_amount) order_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='2020-06-14',payment_count,0)) payment_last_1d_count,
sum(if(dt='2020-06-14',payment_reduce_amount,0)) payment_last_1d_reduce_amount,
sum(if(dt='2020-06-14',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),payment_reduce_amount,0)) payment_last_7d_reduce_amount,
sum(if(dt>=date_add('2020-06-14',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),payment_reduce_amount,0)) payment_last_30d_reduce_amount,
sum(if(dt>=date_add('2020-06-14',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_reduce_amount) payment_reduce_amount,
sum(payment_amount) payment_amount,
sum(if(dt='2020-06-14',expire_count,0)) expire_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),expire_count,0)) expire_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),expire_count,0)) expire_last_30d_count,
sum(expire_count) expire_count
from dws_coupon_info_daycount
group by coupon_id
)t2
on t1.id=t2.coupon_id;
(2) Daily load
insert overwrite table dwt_coupon_topic partition(dt='2020-06-15')
select
nvl(1d_ago.coupon_id,old.coupon_id),
nvl(1d_ago.get_count,0),
nvl(old.get_last_7d_count,0)+nvl(1d_ago.get_count,0)- nvl(7d_ago.get_count,0),
nvl(old.get_last_30d_count,0)+nvl(1d_ago.get_count,0)- nvl(30d_ago.get_count,0),
nvl(old.get_count,0)+nvl(1d_ago.get_count,0),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0)- nvl(7d_ago.order_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0)- nvl(30d_ago.order_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_reduce_amount,0.0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0)- nvl(7d_ago.payment_reduce_amount,0.0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0)- nvl(30d_ago.payment_reduce_amount,0.0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.expire_count,0),
nvl(old.expire_last_7d_count,0)+nvl(1d_ago.expire_count,0)- nvl(7d_ago.expire_count,0),
nvl(old.expire_last_30d_count,0)+nvl(1d_ago.expire_count,0)- nvl(30d_ago.expire_count,0),
nvl(old.expire_count,0)+nvl(1d_ago.expire_count,0)
from
(
select
coupon_id,
get_last_1d_count,
get_last_7d_count,
get_last_30d_count,
get_count,
order_last_1d_count,
order_last_1d_reduce_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_reduce_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_reduce_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_reduce_amount,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_reduce_amount,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_reduce_amount,
payment_last_30d_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_last_1d_count,
expire_last_7d_count,
expire_last_30d_count,
expire_count
from dwt_coupon_topic
where dt=date_add('2020-06-15',-1)
)old
full outer join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from dws_coupon_info_daycount
where dt='2020-06-15'
)1d_ago
on old.coupon_id=1d_ago.coupon_id
left join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from dws_coupon_info_daycount
where dt=date_add('2020-06-15',-7)
)7d_ago
on old.coupon_id=7d_ago.coupon_id
left join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from dws_coupon_info_daycount
where dt=date_add('2020-06-15',-30)
)30d_ago
on old.coupon_id=30d_ago.coupon_id;
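The daily load above keeps the 7-day and 30-day windows up to date incrementally: it takes the window total from the previous day's partition, adds the new day's value, and subtracts the value of the day that leaves the window. The sketch below replays that arithmetic in bash. The function name `roll_window` and all numbers are made-up sample data, not warehouse results.

```shell
#!/bin/bash
# Sliding-window update mirrored from the daily load:
#   new_window = old_window + value_entering - value_leaving
roll_window() {
    local old_total=$1   # window total stored in the previous DWT partition
    local entering=$2    # new day's value from the DWS daily table
    local leaving=$3     # value of the day that drops out of the window
    echo $(( old_total + entering - leaving ))
}

# Hypothetical daily get_count values for 2020-06-08 .. 2020-06-15 (8 days).
counts=(3 1 4 1 5 9 2 6)

# 7-day total as of 2020-06-14: sum of the first seven values.
old_7d=0
for i in 0 1 2 3 4 5 6; do
    old_7d=$(( old_7d + counts[i] ))
done

# Roll forward to 2020-06-15: add the last value, drop the 2020-06-08 value,
# which is the day selected by date_add('2020-06-15',-7).
new_7d=$(roll_window "$old_7d" "${counts[7]}" "${counts[0]}")
echo "old_7d=$old_7d new_7d=$new_7d"
```

The rolled value matches a direct sum over 2020-06-09 .. 2020-06-15, which is why the daily load only needs three DWS partitions instead of rescanning the whole window.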
3) Query the load results
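Beyond sampling rows, one possible sanity check is that the window counts are ordered: the last-1-day count should not exceed the last-7-day count, which should not exceed the last-30-day count, which should not exceed the cumulative count. This is a sketch that assumes all counts are non-negative; the function name `build_window_check` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: builds a HiveQL query that counts rows whose window
# counts are out of order for one metric prefix (get, order, payment, expire).
build_window_check() {
    local table=$1
    local metric=$2
    local do_date=$3
    cat <<EOF
select count(*) bad_rows
from gmall.${table}
where dt='${do_date}'
and (${metric}_last_1d_count > ${metric}_last_7d_count
  or ${metric}_last_7d_count > ${metric}_last_30d_count
  or ${metric}_last_30d_count > ${metric}_count);
EOF
}

sql=$(build_window_check dwt_coupon_topic payment 2020-06-15)
echo "$sql"

# Run the query only when the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
    hive -e "$sql"
fi
```

Under the stated assumption, a non-zero `bad_rows` would suggest that one of the window expressions in the load statement is wrong.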
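5.5 Activity Topic Wide Table
The activity topic wide table is similar to the coupon topic wide table. For each activity rule it records the order and payment counts and amounts, both for the most recent day and as cumulative totals.
1) Table creation statement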
DROP TABLE IF EXISTS dwt_activity_topic;
CREATE EXTERNAL TABLE dwt_activity_topic(
`activity_rule_id` STRING COMMENT '活動規則ID',
`activity_id` STRING COMMENT '活動ID',
`order_last_1d_count` BIGINT COMMENT '最近1日參與某活動某規則下單次數',
`order_last_1d_reduce_amount` DECIMAL(16,2) COMMENT '最近1日參與某活動某規則下單優惠金額',
`order_last_1d_original_amount` DECIMAL(16,2) COMMENT '最近1日參與某活動某規則下單原始金額',
`order_last_1d_final_amount` DECIMAL(16,2) COMMENT '最近1日參與某活動某規則下單最終金額',
`order_count` BIGINT COMMENT '參與某活動某規則累積下單次數',
`order_reduce_amount` DECIMAL(16,2) COMMENT '參與某活動某規則累積下單優惠金額',
`order_original_amount` DECIMAL(16,2) COMMENT '參與某活動某規則累積下單原始金額',
`order_final_amount` DECIMAL(16,2) COMMENT '參與某活動某規則累積下單最終金額',
`payment_last_1d_count` BIGINT COMMENT '最近1日參與某活動某規則支付次數',
`payment_last_1d_reduce_amount` DECIMAL(16,2) COMMENT '最近1日參與某活動某規則支付優惠金額',
`payment_last_1d_amount` DECIMAL(16,2) COMMENT '最近1日參與某活動某規則支付金額',
`payment_count` BIGINT COMMENT '參與某活動某規則累積支付次數',
`payment_reduce_amount` DECIMAL(16,2) COMMENT '參與某活動某規則累積支付優惠金額',
`payment_amount` DECIMAL(16,2) COMMENT '參與某活動某規則累積支付金額'
) COMMENT '活動主題寬表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_activity_topic/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
insert overwrite table dwt_activity_topic partition(dt='2020-06-14')
select
t1.activity_rule_id,
t1.activity_id,
nvl(order_last_1d_count,0),
nvl(order_last_1d_reduce_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_count,0),
nvl(order_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_reduce_amount,0),
nvl(payment_last_1d_amount,0),
nvl(payment_count,0),
nvl(payment_reduce_amount,0),
nvl(payment_amount,0)
from
(
select
activity_rule_id,
activity_id
from dim_activity_rule_info
where dt='2020-06-14'
)t1
left join
(
select
activity_rule_id,
activity_id,
sum(if(dt='2020-06-14',order_count,0)) order_last_1d_count,
sum(if(dt='2020-06-14',order_reduce_amount,0)) order_last_1d_reduce_amount,
sum(if(dt='2020-06-14',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='2020-06-14',order_final_amount,0)) order_last_1d_final_amount,
sum(order_count) order_count,
sum(order_reduce_amount) order_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='2020-06-14',payment_count,0)) payment_last_1d_count,
sum(if(dt='2020-06-14',payment_reduce_amount,0)) payment_last_1d_reduce_amount,
sum(if(dt='2020-06-14',payment_amount,0)) payment_last_1d_amount,
sum(payment_count) payment_count,
sum(payment_reduce_amount) payment_reduce_amount,
sum(payment_amount) payment_amount
from dws_activity_info_daycount
group by activity_rule_id,activity_id
)t2
on t1.activity_rule_id=t2.activity_rule_id
and t1.activity_id=t2.activity_id;
(2) Daily load
insert overwrite table dwt_activity_topic partition(dt='2020-06-15')
select
nvl(1d_ago.activity_rule_id,old.activity_rule_id),
nvl(1d_ago.activity_id,old.activity_id),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_reduce_amount,0.0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0)
from
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from dwt_activity_topic
where dt=date_add('2020-06-15',-1)
)old
full outer join
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from dws_activity_info_daycount
where dt='2020-06-15'
)1d_ago
on old.activity_rule_id=1d_ago.activity_rule_id;
3) Query the load results
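Because the cumulative columns are built by adding each day's DWS value to the previous total, they can be cross-checked against a direct sum over the DWS table. The sketch below assumes the DWS partitions have not been rewritten since they were loaded; the function name `build_cumulative_check` and the output aliases are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: builds a HiveQL query listing activity rules whose
# cumulative order_count in DWT differs from the sum of the DWS daily rows.
build_cumulative_check() {
    local do_date=$1
    cat <<EOF
select t.activity_rule_id, t.order_count dwt_total, s.order_count dws_total
from
(
    select activity_rule_id, order_count
    from gmall.dwt_activity_topic
    where dt='${do_date}'
)t
join
(
    select activity_rule_id, sum(order_count) order_count
    from gmall.dws_activity_info_daycount
    where dt<='${do_date}'
    group by activity_rule_id
)s
on t.activity_rule_id=s.activity_rule_id
where t.order_count<>s.order_count;
EOF
}

build_cumulative_check 2020-06-15
```

The printed statement can be pasted into the Hive CLI. Under the stated assumption, an empty result would mean the two totals agree.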
5.6 Area Topic Wide Table
1) Table creation statement
DROP TABLE IF EXISTS dwt_area_topic;
CREATE EXTERNAL TABLE dwt_area_topic(
`province_id` STRING COMMENT '編號',
`visit_last_1d_count` BIGINT COMMENT '最近1日訪客訪問次數',
`login_last_1d_count` BIGINT COMMENT '最近1日用戶訪問次數',
`visit_last_7d_count` BIGINT COMMENT '最近7日訪客訪問次數',
`login_last_7d_count` BIGINT COMMENT '最近7日用戶訪問次數',
`visit_last_30d_count` BIGINT COMMENT '最近30日訪客訪問次數',
`login_last_30d_count` BIGINT COMMENT '最近30日用戶訪問次數',
`visit_count` BIGINT COMMENT '累積訪客訪問次數',
`login_count` BIGINT COMMENT '累積用戶訪問次數',
`order_last_1d_count` BIGINT COMMENT '最近1天下單次數',
`order_last_1d_original_amount` DECIMAL(16,2) COMMENT '最近1天下單原始金額',
`order_last_1d_final_amount` DECIMAL(16,2) COMMENT '最近1天下單最終金額',
`order_last_7d_count` BIGINT COMMENT '最近7天下單次數',
`order_last_7d_original_amount` DECIMAL(16,2) COMMENT '最近7天下單原始金額',
`order_last_7d_final_amount` DECIMAL(16,2) COMMENT '最近7天下單最終金額',
`order_last_30d_count` BIGINT COMMENT '最近30天下單次數',
`order_last_30d_original_amount` DECIMAL(16,2) COMMENT '最近30天下單原始金額',
`order_last_30d_final_amount` DECIMAL(16,2) COMMENT '最近30天下單最終金額',
`order_count` BIGINT COMMENT '累積下單次數',
`order_original_amount` DECIMAL(16,2) COMMENT '累積下單原始金額',
`order_final_amount` DECIMAL(16,2) COMMENT '累積下單最終金額',
`payment_last_1d_count` BIGINT COMMENT '最近1天支付次數',
`payment_last_1d_amount` DECIMAL(16,2) COMMENT '最近1天支付金額',
`payment_last_7d_count` BIGINT COMMENT '最近7天支付次數',
`payment_last_7d_amount` DECIMAL(16,2) COMMENT '最近7天支付金額',
`payment_last_30d_count` BIGINT COMMENT '最近30天支付次數',
`payment_last_30d_amount` DECIMAL(16,2) COMMENT '最近30天支付金額',
`payment_count` BIGINT COMMENT '累積支付次數',
`payment_amount` DECIMAL(16,2) COMMENT '累積支付金額',
`refund_order_last_1d_count` BIGINT COMMENT '最近1天退單次數',
`refund_order_last_1d_amount` DECIMAL(16,2) COMMENT '最近1天退單金額',
`refund_order_last_7d_count` BIGINT COMMENT '最近7天退單次數',
`refund_order_last_7d_amount` DECIMAL(16,2) COMMENT '最近7天退單金額',
`refund_order_last_30d_count` BIGINT COMMENT '最近30天退單次數',
`refund_order_last_30d_amount` DECIMAL(16,2) COMMENT '最近30天退單金額',
`refund_order_count` BIGINT COMMENT '累積退單次數',
`refund_order_amount` DECIMAL(16,2) COMMENT '累積退單金額',
`refund_payment_last_1d_count` BIGINT COMMENT '最近1天退款次數',
`refund_payment_last_1d_amount` DECIMAL(16,2) COMMENT '最近1天退款金額',
`refund_payment_last_7d_count` BIGINT COMMENT '最近7天退款次數',
`refund_payment_last_7d_amount` DECIMAL(16,2) COMMENT '最近7天退款金額',
`refund_payment_last_30d_count` BIGINT COMMENT '最近30天退款次數',
`refund_payment_last_30d_amount` DECIMAL(16,2) COMMENT '最近30天退款金額',
`refund_payment_count` BIGINT COMMENT '累積退款次數',
`refund_payment_amount` DECIMAL(16,2) COMMENT '累積退款金額'
) COMMENT '地區主題寬表'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwt/dwt_area_topic/'
TBLPROPERTIES ("parquet.compression"="lzo");
2) Data loading
(1) First-day load
insert overwrite table dwt_area_topic partition(dt='2020-06-14')
select
id,
nvl(visit_last_1d_count,0),
nvl(login_last_1d_count,0),
nvl(visit_last_7d_count,0),
nvl(login_last_7d_count,0),
nvl(visit_last_30d_count,0),
nvl(login_last_30d_count,0),
nvl(visit_count,0),
nvl(login_count,0),
nvl(order_last_1d_count,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_amount,0)
from
(
select
id
from dim_base_province
)t1
left join
(
select
province_id province_id,
sum(if(dt='2020-06-14',visit_count,0)) visit_last_1d_count,
sum(if(dt='2020-06-14',login_count,0)) login_last_1d_count,
sum(if(dt>=date_add('2020-06-14',-6),visit_count,0)) visit_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),login_count,0)) login_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-29),visit_count,0)) visit_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),login_count,0)) login_last_30d_count,
sum(visit_count) visit_count,
sum(login_count) login_count,
sum(if(dt='2020-06-14',order_count,0)) order_last_1d_count,
sum(if(dt='2020-06-14',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='2020-06-14',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('2020-06-14',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('2020-06-14',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='2020-06-14',payment_count,0)) payment_last_1d_count,
sum(if(dt='2020-06-14',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_amount) payment_amount,
sum(if(dt='2020-06-14',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='2020-06-14',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='2020-06-14',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='2020-06-14',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('2020-06-14',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('2020-06-14',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_amount) refund_payment_amount
from dws_area_stats_daycount
group by province_id
)t2
on t1.id=t2.province_id;
(2) Daily load
insert overwrite table dwt_area_topic partition(dt='2020-06-15')
select
nvl(old.province_id, 1d_ago.province_id),
nvl(1d_ago.visit_count,0),
nvl(1d_ago.login_count,0),
nvl(old.visit_last_7d_count,0)+nvl(1d_ago.visit_count,0)- nvl(7d_ago.visit_count,0),
nvl(old.login_last_7d_count,0)+nvl(1d_ago.login_count,0)- nvl(7d_ago.login_count,0),
nvl(old.visit_last_30d_count,0)+nvl(1d_ago.visit_count,0)- nvl(30d_ago.visit_count,0),
nvl(old.login_last_30d_count,0)+nvl(1d_ago.login_count,0)- nvl(30d_ago.login_count,0),
nvl(old.visit_count,0)+nvl(1d_ago.visit_count,0),
nvl(old.login_count,0)+nvl(1d_ago.login_count,0),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)
from
(
select
province_id,
visit_last_1d_count,
login_last_1d_count,
visit_last_7d_count,
login_last_7d_count,
visit_last_30d_count,
login_last_30d_count,
visit_count,
login_count,
order_last_1d_count,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_amount,
payment_count,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_amount,
refund_order_count,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_amount
from dwt_area_topic
where dt=date_add('2020-06-15',-1)
)old
full outer join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from dws_area_stats_daycount
where dt='2020-06-15'
)1d_ago
on old.province_id=1d_ago.province_id
left join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from dws_area_stats_daycount
where dt=date_add('2020-06-15',-7)
)7d_ago
on old.province_id= 7d_ago.province_id
left join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from dws_area_stats_daycount
where dt=date_add('2020-06-15',-30)
)30d_ago
on old.province_id= 30d_ago.province_id;
3) Query the load results
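A quick way to confirm that a load actually wrote files is to look at the partition directory on HDFS. The sketch below derives the directory from the LOCATION clauses of the coupon, activity, and area tables and from Hive's default `dt=<value>` partition layout. The function name `partition_path` is hypothetical.

```shell
#!/bin/bash
# Prints the HDFS directory that holds one dt partition of a DWT table.
partition_path() {
    local table=$1
    local do_date=$2
    echo "/warehouse/gmall/dwt/${table}/dt=${do_date}"
}

for table in dwt_coupon_topic dwt_activity_topic dwt_area_topic; do
    path=$(partition_path "$table" 2020-06-15)
    if command -v hdfs >/dev/null 2>&1; then
        # List the partition directory when an HDFS client is installed.
        hdfs dfs -ls "$path"
    else
        # Otherwise just print the path that would be checked.
        echo "$path"
    fi
done
```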
5.7 DWT Layer First-Day Data Import Script
1) Write the script
(1) Create the script dws_to_dwt_init.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim dws_to_dwt_init.sh
Add the following content to the script
#!/bin/bash
APP=gmall
# The load date is read from the second argument and is required
if [ -n "$2" ] ;then
do_date=$2
else
echo "Please pass in the date parameter"
exit 1
fi
dwt_visitor_topic="
insert overwrite table ${APP}.dwt_visitor_topic partition(dt='$do_date')
select
nvl(1d_ago.mid_id,old.mid_id),
nvl(1d_ago.brand,old.brand),
nvl(1d_ago.model,old.model),
nvl(1d_ago.channel,old.channel),
nvl(1d_ago.os,old.os),
nvl(1d_ago.area_code,old.area_code),
nvl(1d_ago.version_code,old.version_code),
case when old.mid_id is null and 1d_ago.is_new=1 then '$do_date'
when old.mid_id is null and 1d_ago.is_new=0 then '2020-06-13' -- exact first visit date is unavailable, so use a date before the warehouse build date
else old.visit_date_first end,
if(1d_ago.mid_id is not null,'$do_date',old.visit_date_last),
nvl(1d_ago.visit_count,0),
if(1d_ago.mid_id is null,0,1),
nvl(old.visit_last_7d_count,0)+nvl(1d_ago.visit_count,0)- nvl(7d_ago.visit_count,0),
nvl(old.visit_last_7d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(7d_ago.mid_id is null,0,1),
nvl(old.visit_last_30d_count,0)+nvl(1d_ago.visit_count,0)- nvl(30d_ago.visit_count,0),
nvl(old.visit_last_30d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(30d_ago.mid_id is null,0,1),
nvl(old.visit_count,0)+nvl(1d_ago.visit_count,0),
nvl(old.visit_day_count,0)+if(1d_ago.mid_id is null,0,1)
from
(
select
mid_id,
brand,
model,
channel,
os,
area_code,
version_code,
visit_date_first,
visit_date_last,
visit_last_1d_count,
visit_last_1d_day_count,
visit_last_7d_count,
visit_last_7d_day_count,
visit_last_30d_count,
visit_last_30d_day_count,
visit_count,
visit_day_count
from ${APP}.dwt_visitor_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt='$do_date'
)1d_ago
on old.mid_id=1d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.mid_id=7d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.mid_id=30d_ago.mid_id;
"
dwt_user_topic="
insert overwrite table ${APP}.dwt_user_topic partition(dt='$do_date')
select
id,
login_date_first, -- use the user creation date as the first login date
nvl(login_date_last,date_add('$do_date',-1)), -- if login history exists, take the last login date from it, otherwise fall back to a fixed date
nvl(login_last_1d_count,0),
nvl(login_last_1d_day_count,0),
nvl(login_last_7d_count,0),
nvl(login_last_7d_day_count,0),
nvl(login_last_30d_count,0),
nvl(login_last_30d_day_count,0),
nvl(login_count,0),
nvl(login_day_count,0),
order_date_first,
order_date_last,
nvl(order_last_1d_count,0),
nvl(order_activity_last_1d_count,0),
nvl(order_activity_reduce_last_1d_amount,0),
nvl(order_coupon_last_1d_count,0),
nvl(order_coupon_reduce_last_1d_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_activity_last_7d_count,0),
nvl(order_activity_reduce_last_7d_amount,0),
nvl(order_coupon_last_7d_count,0),
nvl(order_coupon_reduce_last_7d_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_activity_last_30d_count,0),
nvl(order_activity_reduce_last_30d_amount,0),
nvl(order_coupon_last_30d_count,0),
nvl(order_coupon_reduce_last_30d_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_activity_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_count,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
payment_date_first,
payment_date_last,
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_num,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_num,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_num,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_num,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_num,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_num,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(cart_last_1d_count,0),
nvl(cart_last_7d_count,0),
nvl(cart_last_30d_count,0),
nvl(cart_count,0),
nvl(favor_last_1d_count,0),
nvl(favor_last_7d_count,0),
nvl(favor_last_30d_count,0),
nvl(favor_count,0),
nvl(coupon_last_1d_get_count,0),
nvl(coupon_last_1d_using_count,0),
nvl(coupon_last_1d_used_count,0),
nvl(coupon_last_7d_get_count,0),
nvl(coupon_last_7d_using_count,0),
nvl(coupon_last_7d_used_count,0),
nvl(coupon_last_30d_get_count,0),
nvl(coupon_last_30d_using_count,0),
nvl(coupon_last_30d_used_count,0),
nvl(coupon_get_count,0),
nvl(coupon_using_count,0),
nvl(coupon_used_count,0),
nvl(appraise_last_1d_good_count,0),
nvl(appraise_last_1d_mid_count,0),
nvl(appraise_last_1d_bad_count,0),
nvl(appraise_last_1d_default_count,0),
nvl(appraise_last_7d_good_count,0),
nvl(appraise_last_7d_mid_count,0),
nvl(appraise_last_7d_bad_count,0),
nvl(appraise_last_7d_default_count,0),
nvl(appraise_last_30d_good_count,0),
nvl(appraise_last_30d_mid_count,0),
nvl(appraise_last_30d_bad_count,0),
nvl(appraise_last_30d_default_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0)
from
(
select
id,
date_format(create_time,'yyyy-MM-dd') login_date_first
from ${APP}.dim_user_info
where dt='9999-99-99'
)t1
left join
(
select
user_id user_id,
max(dt) login_date_last,
sum(if(dt='$do_date',login_count,0)) login_last_1d_count,
sum(if(dt='$do_date' and login_count>0,1,0)) login_last_1d_day_count,
sum(if(dt>=date_add('$do_date',-6),login_count,0)) login_last_7d_count,
sum(if(dt>=date_add('$do_date',-6) and login_count>0,1,0)) login_last_7d_day_count,
sum(if(dt>=date_add('$do_date',-29),login_count,0)) login_last_30d_count,
sum(if(dt>=date_add('$do_date',-29) and login_count>0,1,0)) login_last_30d_day_count,
sum(login_count) login_count,
sum(if(login_count>0,1,0)) login_day_count,
min(if(order_count>0,dt,null)) order_date_first,
max(if(order_count>0,dt,null)) order_date_last,
sum(if(dt='$do_date',order_count,0)) order_last_1d_count,
sum(if(dt='$do_date',order_activity_count,0)) order_activity_last_1d_count,
sum(if(dt='$do_date',order_activity_reduce_amount,0)) order_activity_reduce_last_1d_amount,
sum(if(dt='$do_date',order_coupon_count,0)) order_coupon_last_1d_count,
sum(if(dt='$do_date',order_coupon_reduce_amount,0)) order_coupon_reduce_last_1d_amount,
sum(if(dt='$do_date',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='$do_date',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('$do_date',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_activity_count,0)) order_activity_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_activity_reduce_amount,0)) order_activity_reduce_last_7d_amount,
sum(if(dt>=date_add('$do_date',-6),order_coupon_count,0)) order_coupon_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_coupon_reduce_amount,0)) order_coupon_reduce_last_7d_amount,
sum(if(dt>=date_add('$do_date',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('$do_date',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('$do_date',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_activity_count,0)) order_activity_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_activity_reduce_amount,0)) order_activity_reduce_last_30d_amount,
sum(if(dt>=date_add('$do_date',-29),order_coupon_count,0)) order_coupon_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_coupon_reduce_amount,0)) order_coupon_reduce_last_30d_amount,
sum(if(dt>=date_add('$do_date',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('$do_date',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_activity_count) order_activity_count,
sum(order_activity_reduce_amount) order_activity_reduce_amount,
sum(order_coupon_count) order_coupon_count,
sum(order_coupon_reduce_amount) order_coupon_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
min(if(payment_count>0,dt,null)) payment_date_first,
max(if(payment_count>0,dt,null)) payment_date_last,
sum(if(dt='$do_date',payment_count,0)) payment_last_1d_count,
sum(if(dt='$do_date',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_amount) payment_amount,
sum(if(dt='$do_date',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='$do_date',refund_order_num,0)) refund_order_last_1d_num,
sum(if(dt='$do_date',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_order_num,0)) refund_order_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_order_num,0)) refund_order_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_num) refund_order_num,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='$do_date',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='$do_date',refund_payment_num,0)) refund_payment_last_1d_num,
sum(if(dt='$do_date',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_payment_num,0)) refund_payment_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_payment_num,0)) refund_payment_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_num) refund_payment_num,
sum(refund_payment_amount) refund_payment_amount,
sum(if(dt='$do_date',cart_count,0)) cart_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),cart_count,0)) cart_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),cart_count,0)) cart_last_30d_count,
sum(cart_count) cart_count,
sum(if(dt='$do_date',favor_count,0)) favor_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),favor_count,0)) favor_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),favor_count,0)) favor_last_30d_count,
sum(favor_count) favor_count,
sum(if(dt='$do_date',coupon_get_count,0)) coupon_last_1d_get_count,
sum(if(dt='$do_date',coupon_using_count,0)) coupon_last_1d_using_count,
sum(if(dt='$do_date',coupon_used_count,0)) coupon_last_1d_used_count,
sum(if(dt>=date_add('$do_date',-6),coupon_get_count,0)) coupon_last_7d_get_count,
sum(if(dt>=date_add('$do_date',-6),coupon_using_count,0)) coupon_last_7d_using_count,
sum(if(dt>=date_add('$do_date',-6),coupon_used_count,0)) coupon_last_7d_used_count,
sum(if(dt>=date_add('$do_date',-29),coupon_get_count,0)) coupon_last_30d_get_count,
sum(if(dt>=date_add('$do_date',-29),coupon_using_count,0)) coupon_last_30d_using_count,
sum(if(dt>=date_add('$do_date',-29),coupon_used_count,0)) coupon_last_30d_used_count,
sum(coupon_get_count) coupon_get_count,
sum(coupon_using_count) coupon_using_count,
sum(coupon_used_count) coupon_used_count,
sum(if(dt='$do_date',appraise_good_count,0)) appraise_last_1d_good_count,
sum(if(dt='$do_date',appraise_mid_count,0)) appraise_last_1d_mid_count,
sum(if(dt='$do_date',appraise_bad_count,0)) appraise_last_1d_bad_count,
sum(if(dt='$do_date',appraise_default_count,0)) appraise_last_1d_default_count,
sum(if(dt>=date_add('$do_date',-6),appraise_good_count,0)) appraise_last_7d_good_count,
sum(if(dt>=date_add('$do_date',-6),appraise_mid_count,0)) appraise_last_7d_mid_count,
sum(if(dt>=date_add('$do_date',-6),appraise_bad_count,0)) appraise_last_7d_bad_count,
sum(if(dt>=date_add('$do_date',-6),appraise_default_count,0)) appraise_last_7d_default_count,
sum(if(dt>=date_add('$do_date',-29),appraise_good_count,0)) appraise_last_30d_good_count,
sum(if(dt>=date_add('$do_date',-29),appraise_mid_count,0)) appraise_last_30d_mid_count,
sum(if(dt>=date_add('$do_date',-29),appraise_bad_count,0)) appraise_last_30d_bad_count,
sum(if(dt>=date_add('$do_date',-29),appraise_default_count,0)) appraise_last_30d_default_count,
sum(appraise_good_count) appraise_good_count,
sum(appraise_mid_count) appraise_mid_count,
sum(appraise_bad_count) appraise_bad_count,
sum(appraise_default_count) appraise_default_count
from ${APP}.dws_user_action_daycount
group by user_id
)t2
on t1.id=t2.user_id;
"
dwt_sku_topic="
insert overwrite table ${APP}.dwt_sku_topic partition(dt='$do_date')
select
id,
nvl(order_last_1d_count,0),
nvl(order_last_1d_num,0),
nvl(order_activity_last_1d_count,0),
nvl(order_coupon_last_1d_count,0),
nvl(order_activity_reduce_last_1d_amount,0),
nvl(order_coupon_reduce_last_1d_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_num,0),
nvl(order_activity_last_7d_count,0),
nvl(order_coupon_last_7d_count,0),
nvl(order_activity_reduce_last_7d_amount,0),
nvl(order_coupon_reduce_last_7d_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_num,0),
nvl(order_activity_last_30d_count,0),
nvl(order_coupon_last_30d_count,0),
nvl(order_activity_reduce_last_30d_amount,0),
nvl(order_coupon_reduce_last_30d_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_num,0),
nvl(order_activity_count,0),
nvl(order_coupon_count,0),
nvl(order_activity_reduce_amount,0),
nvl(order_coupon_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_num,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_num,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_num,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_num,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_num,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_num,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_num,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_num,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_num,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_num,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_num,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_num,0),
nvl(refund_payment_amount,0),
nvl(cart_last_1d_count,0),
nvl(cart_last_7d_count,0),
nvl(cart_last_30d_count,0),
nvl(cart_count,0),
nvl(favor_last_1d_count,0),
nvl(favor_last_7d_count,0),
nvl(favor_last_30d_count,0),
nvl(favor_count,0),
nvl(appraise_last_1d_good_count,0),
nvl(appraise_last_1d_mid_count,0),
nvl(appraise_last_1d_bad_count,0),
nvl(appraise_last_1d_default_count,0),
nvl(appraise_last_7d_good_count,0),
nvl(appraise_last_7d_mid_count,0),
nvl(appraise_last_7d_bad_count,0),
nvl(appraise_last_7d_default_count,0),
nvl(appraise_last_30d_good_count,0),
nvl(appraise_last_30d_mid_count,0),
nvl(appraise_last_30d_bad_count,0),
nvl(appraise_last_30d_default_count,0),
nvl(appraise_good_count,0),
nvl(appraise_mid_count,0),
nvl(appraise_bad_count,0),
nvl(appraise_default_count,0)
from
(
select
id
from ${APP}.dim_sku_info
where dt='$do_date'
)t1
left join
(
select
sku_id,
sum(if(dt='$do_date',order_count,0)) order_last_1d_count,
sum(if(dt='$do_date',order_num,0)) order_last_1d_num,
sum(if(dt='$do_date',order_activity_count,0)) order_activity_last_1d_count,
sum(if(dt='$do_date',order_coupon_count,0)) order_coupon_last_1d_count,
sum(if(dt='$do_date',order_activity_reduce_amount,0)) order_activity_reduce_last_1d_amount,
sum(if(dt='$do_date',order_coupon_reduce_amount,0)) order_coupon_reduce_last_1d_amount,
sum(if(dt='$do_date',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='$do_date',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('$do_date',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_num,0)) order_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),order_activity_count,0)) order_activity_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_coupon_count,0)) order_coupon_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_activity_reduce_amount,0)) order_activity_reduce_last_7d_amount,
sum(if(dt>=date_add('$do_date',-6),order_coupon_reduce_amount,0)) order_coupon_reduce_last_7d_amount,
sum(if(dt>=date_add('$do_date',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('$do_date',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('$do_date',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_num,0)) order_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),order_activity_count,0)) order_activity_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_coupon_count,0)) order_coupon_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_activity_reduce_amount,0)) order_activity_reduce_last_30d_amount,
sum(if(dt>=date_add('$do_date',-29),order_coupon_reduce_amount,0)) order_coupon_reduce_last_30d_amount,
sum(if(dt>=date_add('$do_date',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('$do_date',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_num) order_num,
sum(order_activity_count) order_activity_count,
sum(order_coupon_count) order_coupon_count,
sum(order_activity_reduce_amount) order_activity_reduce_amount,
sum(order_coupon_reduce_amount) order_coupon_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='$do_date',payment_count,0)) payment_last_1d_count,
sum(if(dt='$do_date',payment_num,0)) payment_last_1d_num,
sum(if(dt='$do_date',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),payment_num,0)) payment_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),payment_num,0)) payment_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_num) payment_num,
sum(payment_amount) payment_amount,
sum(if(dt='$do_date',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='$do_date',refund_order_num,0)) refund_order_last_1d_num,
sum(if(dt='$do_date',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_order_num,0)) refund_order_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_order_num,0)) refund_order_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_num) refund_order_num,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='$do_date',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='$do_date',refund_payment_num,0)) refund_payment_last_1d_num,
sum(if(dt='$do_date',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_payment_num,0)) refund_payment_last_7d_num,
sum(if(dt>=date_add('$do_date',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_payment_num,0)) refund_payment_last_30d_num,
sum(if(dt>=date_add('$do_date',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_num) refund_payment_num,
sum(refund_payment_amount) refund_payment_amount,
sum(if(dt='$do_date',cart_count,0)) cart_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),cart_count,0)) cart_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),cart_count,0)) cart_last_30d_count,
sum(cart_count) cart_count,
sum(if(dt='$do_date',favor_count,0)) favor_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),favor_count,0)) favor_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),favor_count,0)) favor_last_30d_count,
sum(favor_count) favor_count,
sum(if(dt='$do_date',appraise_good_count,0)) appraise_last_1d_good_count,
sum(if(dt='$do_date',appraise_mid_count,0)) appraise_last_1d_mid_count,
sum(if(dt='$do_date',appraise_bad_count,0)) appraise_last_1d_bad_count,
sum(if(dt='$do_date',appraise_default_count,0)) appraise_last_1d_default_count,
sum(if(dt>=date_add('$do_date',-6),appraise_good_count,0)) appraise_last_7d_good_count,
sum(if(dt>=date_add('$do_date',-6),appraise_mid_count,0)) appraise_last_7d_mid_count,
sum(if(dt>=date_add('$do_date',-6),appraise_bad_count,0)) appraise_last_7d_bad_count,
sum(if(dt>=date_add('$do_date',-6),appraise_default_count,0)) appraise_last_7d_default_count,
sum(if(dt>=date_add('$do_date',-29),appraise_good_count,0)) appraise_last_30d_good_count,
sum(if(dt>=date_add('$do_date',-29),appraise_mid_count,0)) appraise_last_30d_mid_count,
sum(if(dt>=date_add('$do_date',-29),appraise_bad_count,0)) appraise_last_30d_bad_count,
sum(if(dt>=date_add('$do_date',-29),appraise_default_count,0)) appraise_last_30d_default_count,
sum(appraise_good_count) appraise_good_count,
sum(appraise_mid_count) appraise_mid_count,
sum(appraise_bad_count) appraise_bad_count,
sum(appraise_default_count) appraise_default_count
from ${APP}.dws_sku_action_daycount
group by sku_id
)t2
on t1.id=t2.sku_id;
"
dwt_coupon_topic="
insert overwrite table ${APP}.dwt_coupon_topic partition(dt='$do_date')
select
id,
nvl(get_last_1d_count,0),
nvl(get_last_7d_count,0),
nvl(get_last_30d_count,0),
nvl(get_count,0),
nvl(order_last_1d_count,0),
nvl(order_last_1d_reduce_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_reduce_amount,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_reduce_amount,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_reduce_amount,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_reduce_amount,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_reduce_amount,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_reduce_amount,0),
nvl(payment_amount,0),
nvl(expire_last_1d_count,0),
nvl(expire_last_7d_count,0),
nvl(expire_last_30d_count,0),
nvl(expire_count,0)
from
(
select
id
from ${APP}.dim_coupon_info
where dt='$do_date'
)t1
left join
(
select
coupon_id coupon_id,
sum(if(dt='$do_date',get_count,0)) get_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),get_count,0)) get_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),get_count,0)) get_last_30d_count,
sum(get_count) get_count,
sum(if(dt='$do_date',order_count,0)) order_last_1d_count,
sum(if(dt='$do_date',order_reduce_amount,0)) order_last_1d_reduce_amount,
sum(if(dt='$do_date',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='$do_date',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('$do_date',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_reduce_amount,0)) order_last_7d_reduce_amount,
sum(if(dt>=date_add('$do_date',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('$do_date',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('$do_date',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_reduce_amount,0)) order_last_30d_reduce_amount,
sum(if(dt>=date_add('$do_date',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('$do_date',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_reduce_amount) order_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='$do_date',payment_count,0)) payment_last_1d_count,
sum(if(dt='$do_date',payment_reduce_amount,0)) payment_last_1d_reduce_amount,
sum(if(dt='$do_date',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),payment_reduce_amount,0)) payment_last_7d_reduce_amount,
sum(if(dt>=date_add('$do_date',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),payment_reduce_amount,0)) payment_last_30d_reduce_amount,
sum(if(dt>=date_add('$do_date',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_reduce_amount) payment_reduce_amount,
sum(payment_amount) payment_amount,
sum(if(dt='$do_date',expire_count,0)) expire_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),expire_count,0)) expire_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),expire_count,0)) expire_last_30d_count,
sum(expire_count) expire_count
from ${APP}.dws_coupon_info_daycount
group by coupon_id
)t2
on t1.id=t2.coupon_id;
"
dwt_activity_topic="
insert overwrite table ${APP}.dwt_activity_topic partition(dt='$do_date')
select
t1.activity_rule_id,
t1.activity_id,
nvl(order_last_1d_count,0),
nvl(order_last_1d_reduce_amount,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_count,0),
nvl(order_reduce_amount,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_reduce_amount,0),
nvl(payment_last_1d_amount,0),
nvl(payment_count,0),
nvl(payment_reduce_amount,0),
nvl(payment_amount,0)
from
(
select
activity_rule_id,
activity_id
from ${APP}.dim_activity_rule_info
where dt='$do_date'
)t1
left join
(
select
activity_rule_id,
activity_id,
sum(if(dt='$do_date',order_count,0)) order_last_1d_count,
sum(if(dt='$do_date',order_reduce_amount,0)) order_last_1d_reduce_amount,
sum(if(dt='$do_date',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='$do_date',order_final_amount,0)) order_last_1d_final_amount,
sum(order_count) order_count,
sum(order_reduce_amount) order_reduce_amount,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='$do_date',payment_count,0)) payment_last_1d_count,
sum(if(dt='$do_date',payment_reduce_amount,0)) payment_last_1d_reduce_amount,
sum(if(dt='$do_date',payment_amount,0)) payment_last_1d_amount,
sum(payment_count) payment_count,
sum(payment_reduce_amount) payment_reduce_amount,
sum(payment_amount) payment_amount
from ${APP}.dws_activity_info_daycount
group by activity_rule_id,activity_id
)t2
on t1.activity_rule_id=t2.activity_rule_id
and t1.activity_id=t2.activity_id;
"
dwt_area_topic="
insert overwrite table ${APP}.dwt_area_topic partition(dt='$do_date')
select
id,
nvl(visit_last_1d_count,0),
nvl(login_last_1d_count,0),
nvl(visit_last_7d_count,0),
nvl(login_last_7d_count,0),
nvl(visit_last_30d_count,0),
nvl(login_last_30d_count,0),
nvl(visit_count,0),
nvl(login_count,0),
nvl(order_last_1d_count,0),
nvl(order_last_1d_original_amount,0),
nvl(order_last_1d_final_amount,0),
nvl(order_last_7d_count,0),
nvl(order_last_7d_original_amount,0),
nvl(order_last_7d_final_amount,0),
nvl(order_last_30d_count,0),
nvl(order_last_30d_original_amount,0),
nvl(order_last_30d_final_amount,0),
nvl(order_count,0),
nvl(order_original_amount,0),
nvl(order_final_amount,0),
nvl(payment_last_1d_count,0),
nvl(payment_last_1d_amount,0),
nvl(payment_last_7d_count,0),
nvl(payment_last_7d_amount,0),
nvl(payment_last_30d_count,0),
nvl(payment_last_30d_amount,0),
nvl(payment_count,0),
nvl(payment_amount,0),
nvl(refund_order_last_1d_count,0),
nvl(refund_order_last_1d_amount,0),
nvl(refund_order_last_7d_count,0),
nvl(refund_order_last_7d_amount,0),
nvl(refund_order_last_30d_count,0),
nvl(refund_order_last_30d_amount,0),
nvl(refund_order_count,0),
nvl(refund_order_amount,0),
nvl(refund_payment_last_1d_count,0),
nvl(refund_payment_last_1d_amount,0),
nvl(refund_payment_last_7d_count,0),
nvl(refund_payment_last_7d_amount,0),
nvl(refund_payment_last_30d_count,0),
nvl(refund_payment_last_30d_amount,0),
nvl(refund_payment_count,0),
nvl(refund_payment_amount,0)
from
(
select
id
from ${APP}.dim_base_province
)t1
left join
(
select
province_id province_id,
sum(if(dt='$do_date',visit_count,0)) visit_last_1d_count,
sum(if(dt='$do_date',login_count,0)) login_last_1d_count,
sum(if(dt>=date_add('$do_date',-6),visit_count,0)) visit_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),login_count,0)) login_last_7d_count,
sum(if(dt>=date_add('$do_date',-29),visit_count,0)) visit_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),login_count,0)) login_last_30d_count,
sum(visit_count) visit_count,
sum(login_count) login_count,
sum(if(dt='$do_date',order_count,0)) order_last_1d_count,
sum(if(dt='$do_date',order_original_amount,0)) order_last_1d_original_amount,
sum(if(dt='$do_date',order_final_amount,0)) order_last_1d_final_amount,
sum(if(dt>=date_add('$do_date',-6),order_count,0)) order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),order_original_amount,0)) order_last_7d_original_amount,
sum(if(dt>=date_add('$do_date',-6),order_final_amount,0)) order_last_7d_final_amount,
sum(if(dt>=date_add('$do_date',-29),order_count,0)) order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),order_original_amount,0)) order_last_30d_original_amount,
sum(if(dt>=date_add('$do_date',-29),order_final_amount,0)) order_last_30d_final_amount,
sum(order_count) order_count,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(if(dt='$do_date',payment_count,0)) payment_last_1d_count,
sum(if(dt='$do_date',payment_amount,0)) payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),payment_count,0)) payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),payment_amount,0)) payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),payment_count,0)) payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),payment_amount,0)) payment_last_30d_amount,
sum(payment_count) payment_count,
sum(payment_amount) payment_amount,
sum(if(dt='$do_date',refund_order_count,0)) refund_order_last_1d_count,
sum(if(dt='$do_date',refund_order_amount,0)) refund_order_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_order_count,0)) refund_order_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_order_amount,0)) refund_order_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_order_count,0)) refund_order_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_order_amount,0)) refund_order_last_30d_amount,
sum(refund_order_count) refund_order_count,
sum(refund_order_amount) refund_order_amount,
sum(if(dt='$do_date',refund_payment_count,0)) refund_payment_last_1d_count,
sum(if(dt='$do_date',refund_payment_amount,0)) refund_payment_last_1d_amount,
sum(if(dt>=date_add('$do_date',-6),refund_payment_count,0)) refund_payment_last_7d_count,
sum(if(dt>=date_add('$do_date',-6),refund_payment_amount,0)) refund_payment_last_7d_amount,
sum(if(dt>=date_add('$do_date',-29),refund_payment_count,0)) refund_payment_last_30d_count,
sum(if(dt>=date_add('$do_date',-29),refund_payment_amount,0)) refund_payment_last_30d_amount,
sum(refund_payment_count) refund_payment_count,
sum(refund_payment_amount) refund_payment_amount
from ${APP}.dws_area_stats_daycount
group by province_id
)t2
on t1.id=t2.province_id;
"
case $1 in
"dwt_visitor_topic" )
hive -e "$dwt_visitor_topic"
;;
"dwt_user_topic" )
hive -e "$dwt_user_topic"
;;
"dwt_sku_topic" )
hive -e "$dwt_sku_topic"
;;
"dwt_activity_topic" )
hive -e "$dwt_activity_topic"
;;
"dwt_coupon_topic" )
hive -e "$dwt_coupon_topic"
;;
"dwt_area_topic" )
hive -e "$dwt_area_topic"
;;
"all" )
hive -e "$dwt_visitor_topic$dwt_user_topic$dwt_sku_topic$dwt_activity_topic$dwt_coupon_topic$dwt_area_topic"
;;
esac
(2) Add execute permission
[atguigu@hadoop102 bin]$ chmod +x dws_to_dwt_init.sh
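The dwt_visitor_topic statement in the script above does not rescan 7 or 30 days of DWS data. It takes the running total from the previous day, adds the count for the current day, and subtracts the count for the day that has just left the window (date_add('$do_date',-7) or date_add('$do_date',-30)). The following is a minimal sketch of that arithmetic in plain bash. The daily counts and the function names window_sum and roll_window are illustrative assumptions and are not part of the warehouse scripts.

```shell
#!/bin/bash
# Hypothetical daily visit counts for one device, oldest first (index 0..8).
counts=(2 3 5 1 0 3 4 1 6)

# Full recompute: sum of the n entries ending at index "end" (0-based).
window_sum() {
  local end=$1 n=$2 sum=0 i
  for ((i = end - n + 1; i <= end; i++)); do
    if ((i >= 0)); then
      sum=$((sum + counts[i]))
    fi
  done
  echo "$sum"
}

# Incremental update: previous window + today's count - the count that left the window.
roll_window() {
  local old=$1 today=$2 expired=$3
  echo $((old + today - expired))
}

# 7-day total as of the previous day (index 7), playing the role of old.visit_last_7d_count.
old_7d=$(window_sum 7 7)
# Today is index 8; the expired day is index 8-7=1, matching date_add('$do_date',-7).
new_7d=$(roll_window "$old_7d" "${counts[8]}" "${counts[1]}")

echo "incremental=$new_7d full=$(window_sum 8 7)"
```

The two numbers agree, which is why the incremental form can stand in for a full recompute as long as the partition for the previous day is correct.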
2) Use the script
(1) Run the script
[atguigu@hadoop102 bin]$ dws_to_dwt_init.sh all 2020-06-14
(2) Check whether the data was imported successfully
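One possible way to carry out this check is to count the rows written to the target partition of each DWT table. The sketch below only builds and prints the queries. The variable names tables and check_sql are illustrative assumptions, and the commented hive -e call assumes the hive CLI is on the PATH, as in the scripts above.

```shell
#!/bin/bash
# Sketch only: build one row-count query per DWT table for a given partition.
APP=gmall
do_date=${1:-2020-06-14}   # defaults to the example date used in the text
tables="dwt_visitor_topic dwt_user_topic dwt_sku_topic dwt_activity_topic dwt_coupon_topic dwt_area_topic"

check_sql=""
for t in $tables; do
  check_sql+="select '$t' as table_name, count(*) as row_count from ${APP}.$t where dt='$do_date';"$'\n'
done

# Print the generated statements for review.
printf '%s' "$check_sql"

# To run them against the cluster, uncomment the next line:
# hive -e "$check_sql"
```

A row count of zero for a table that should have data suggests that the corresponding insert overwrite did not write the partition.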
5.8 DWT Layer Daily Data Load Script
1) Write the script
(1) Create the script dws_to_dwt.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim dws_to_dwt.sh
Add the following content to the script
#!/bin/bash
APP=gmall
# Use the date passed as the second argument if given; otherwise default to the day before the current date
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
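# Partition date two days before do_date; that dwt_visitor_topic partition is dropped after the new one is written
clear_date=`date -d "$do_date -2 day" +%F`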
dwt_visitor_topic="
insert overwrite table ${APP}.dwt_visitor_topic partition(dt='$do_date')
select
nvl(1d_ago.mid_id,old.mid_id),
nvl(1d_ago.brand,old.brand),
nvl(1d_ago.model,old.model),
nvl(1d_ago.channel,old.channel),
nvl(1d_ago.os,old.os),
nvl(1d_ago.area_code,old.area_code),
nvl(1d_ago.version_code,old.version_code),
case when old.mid_id is null and 1d_ago.is_new=1 then '$do_date'
when old.mid_id is null and 1d_ago.is_new=0 then '2020-06-13' -- the exact first visit date cannot be determined, so use a date before the warehouse go-live date
else old.visit_date_first end,
if(1d_ago.mid_id is not null,'$do_date',old.visit_date_last),
nvl(1d_ago.visit_count,0),
if(1d_ago.mid_id is null,0,1),
nvl(old.visit_last_7d_count,0)+nvl(1d_ago.visit_count,0)- nvl(7d_ago.visit_count,0),
nvl(old.visit_last_7d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(7d_ago.mid_id is null,0,1),
nvl(old.visit_last_30d_count,0)+nvl(1d_ago.visit_count,0)- nvl(30d_ago.visit_count,0),
nvl(old.visit_last_30d_day_count,0)+if(1d_ago.mid_id is null,0,1)- if(30d_ago.mid_id is null,0,1),
nvl(old.visit_count,0)+nvl(1d_ago.visit_count,0),
nvl(old.visit_day_count,0)+if(1d_ago.mid_id is null,0,1)
from
(
select
mid_id,
brand,
model,
channel,
os,
area_code,
version_code,
visit_date_first,
visit_date_last,
visit_last_1d_count,
visit_last_1d_day_count,
visit_last_7d_count,
visit_last_7d_day_count,
visit_last_30d_count,
visit_last_30d_day_count,
visit_count,
visit_day_count
from ${APP}.dwt_visitor_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt='$do_date'
)1d_ago
on old.mid_id=1d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.mid_id=7d_ago.mid_id
left join
(
select
mid_id,
brand,
model,
is_new,
channel,
os,
area_code,
version_code,
visit_count
from ${APP}.dws_visitor_action_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.mid_id=30d_ago.mid_id;
alter table ${APP}.dwt_visitor_topic drop partition(dt='$clear_date');
"
dwt_user_topic="
insert overwrite table ${APP}.dwt_user_topic partition(dt='$do_date')
select
nvl(1d_ago.user_id,old.user_id),
nvl(old.login_date_first,'$do_date'),
if(1d_ago.user_id is not null,'$do_date',old.login_date_last),
nvl(1d_ago.login_count,0),
if(1d_ago.user_id is not null,1,0),
nvl(old.login_last_7d_count,0)+nvl(1d_ago.login_count,0)- nvl(7d_ago.login_count,0),
nvl(old.login_last_7d_day_count,0)+if(1d_ago.user_id is null,0,1)- if(7d_ago.user_id is null,0,1),
nvl(old.login_last_30d_count,0)+nvl(1d_ago.login_count,0)- nvl(30d_ago.login_count,0),
nvl(old.login_last_30d_day_count,0)+if(1d_ago.user_id is null,0,1)- if(30d_ago.user_id is null,0,1),
nvl(old.login_count,0)+nvl(1d_ago.login_count,0),
nvl(old.login_day_count,0)+if(1d_ago.user_id is not null,1,0),
if(old.order_date_first is null and 1d_ago.order_count>0, '$do_date', old.order_date_first),
if(1d_ago.order_count>0,'$do_date',old.order_date_last),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_activity_count,0),
nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(1d_ago.order_coupon_count,0),
nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_activity_last_7d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(7d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(7d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_last_7d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(7d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(7d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_activity_last_30d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(30d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(30d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_last_30d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(30d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(30d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_activity_count,0)+nvl(1d_ago.order_activity_count,0),
nvl(old.order_activity_reduce_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_count,0)+nvl(1d_ago.order_coupon_count,0),
nvl(old.order_coupon_reduce_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
if(old.payment_date_first is null and 1d_ago.payment_count>0, '$do_date', old.payment_date_first),
if(1d_ago.payment_count>0,'$do_date',old.payment_date_last),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)-nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)-nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)-nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_num,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_num,0)+nvl(1d_ago.refund_order_num, 0)- nvl(7d_ago.refund_order_num,0),
nvl(old.refund_order_last_7d_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_num,0)+nvl(1d_ago.refund_order_num, 0)- nvl(30d_ago.refund_order_num,0),
nvl(old.refund_order_last_30d_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_num,0)+nvl(1d_ago.refund_order_num,0),
nvl(old.refund_order_amount,0.0)+ nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_num,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)-nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(7d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_7d_amount,0.0)+ nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)-nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(30d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_30d_amount,0.0)+ nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_num,0)+nvl(1d_ago.refund_payment_num,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0),
nvl(1d_ago.cart_count,0),
nvl(old.cart_last_7d_count,0)+nvl(1d_ago.cart_count,0)-nvl(7d_ago.cart_count,0),
nvl(old.cart_last_30d_count,0)+nvl(1d_ago.cart_count,0)-nvl(30d_ago.cart_count,0),
nvl(old.cart_count,0)+nvl(1d_ago.cart_count,0),
nvl(1d_ago.favor_count,0),
nvl(old.favor_last_7d_count,0)+nvl(1d_ago.favor_count,0)- nvl(7d_ago.favor_count,0),
nvl(old.favor_last_30d_count,0)+nvl(1d_ago.favor_count,0)- nvl(30d_ago.favor_count,0),
nvl(old.favor_count,0)+nvl(1d_ago.favor_count,0),
nvl(1d_ago.coupon_get_count,0),
nvl(1d_ago.coupon_using_count,0),
nvl(1d_ago.coupon_used_count,0),
nvl(old.coupon_last_7d_get_count,0)+nvl(1d_ago.coupon_get_count,0)- nvl(7d_ago.coupon_get_count,0),
nvl(old.coupon_last_7d_using_count,0)+nvl(1d_ago.coupon_using_count,0)- nvl(7d_ago.coupon_using_count,0),
nvl(old.coupon_last_7d_used_count,0)+ nvl(1d_ago.coupon_used_count,0)- nvl(7d_ago.coupon_used_count,0),
nvl(old.coupon_last_30d_get_count,0)+nvl(1d_ago.coupon_get_count,0)- nvl(30d_ago.coupon_get_count,0),
nvl(old.coupon_last_30d_using_count,0)+nvl(1d_ago.coupon_using_count,0)- nvl(30d_ago.coupon_using_count,0),
nvl(old.coupon_last_30d_used_count,0)+ nvl(1d_ago.coupon_used_count,0)- nvl(30d_ago.coupon_used_count,0),
nvl(old.coupon_get_count,0)+nvl(1d_ago.coupon_get_count,0),
nvl(old.coupon_using_count,0)+nvl(1d_ago.coupon_using_count,0),
nvl(old.coupon_used_count,0)+nvl(1d_ago.coupon_used_count,0),
nvl(1d_ago.appraise_good_count,0),
nvl(1d_ago.appraise_mid_count,0),
nvl(1d_ago.appraise_bad_count,0),
nvl(1d_ago.appraise_default_count,0),
nvl(old.appraise_last_7d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(7d_ago.appraise_good_count,0),
nvl(old.appraise_last_7d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)-nvl(7d_ago.appraise_mid_count,0),
nvl(old.appraise_last_7d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)-nvl(7d_ago.appraise_bad_count,0),
nvl(old.appraise_last_7d_default_count,0)+nvl(1d_ago.appraise_default_count,0)-nvl(7d_ago.appraise_default_count,0),
nvl(old.appraise_last_30d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(30d_ago.appraise_good_count,0),
nvl(old.appraise_last_30d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)-nvl(30d_ago.appraise_mid_count,0),
nvl(old.appraise_last_30d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)-nvl(30d_ago.appraise_bad_count,0),
nvl(old.appraise_last_30d_default_count,0)+nvl(1d_ago.appraise_default_count,0)-nvl(30d_ago.appraise_default_count,0),
nvl(old.appraise_good_count,0)+nvl(1d_ago.appraise_good_count,0),
nvl(old.appraise_mid_count,0)+nvl(1d_ago.appraise_mid_count, 0),
nvl(old.appraise_bad_count,0)+nvl(1d_ago.appraise_bad_count,0),
nvl(old.appraise_default_count,0)+nvl(1d_ago.appraise_default_count,0)
from
(
select
user_id,
login_date_first,
login_date_last,
login_date_1d_count,
login_last_1d_day_count,
login_last_7d_count,
login_last_7d_day_count,
login_last_30d_count,
login_last_30d_day_count,
login_count,
login_day_count,
order_date_first,
order_date_last,
order_last_1d_count,
order_activity_last_1d_count,
order_activity_reduce_last_1d_amount,
order_coupon_last_1d_count,
order_coupon_reduce_last_1d_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_activity_last_7d_count,
order_activity_reduce_last_7d_amount,
order_coupon_last_7d_count,
order_coupon_reduce_last_7d_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_activity_last_30d_count,
order_activity_reduce_last_30d_amount,
order_coupon_last_30d_count,
order_coupon_reduce_last_30d_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_date_first,
payment_date_last,
payment_last_1d_count,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_amount,
payment_count,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_num,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_num,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_num,
refund_order_last_30d_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_num,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_num,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_num,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_last_1d_count,
cart_last_7d_count,
cart_last_30d_count,
cart_count,
favor_last_1d_count,
favor_last_7d_count,
favor_last_30d_count,
favor_count,
coupon_last_1d_get_count,
coupon_last_1d_using_count,
coupon_last_1d_used_count,
coupon_last_7d_get_count,
coupon_last_7d_using_count,
coupon_last_7d_used_count,
coupon_last_30d_get_count,
coupon_last_30d_using_count,
coupon_last_30d_used_count,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_last_1d_good_count,
appraise_last_1d_mid_count,
appraise_last_1d_bad_count,
appraise_last_1d_default_count,
appraise_last_7d_good_count,
appraise_last_7d_mid_count,
appraise_last_7d_bad_count,
appraise_last_7d_default_count,
appraise_last_30d_good_count,
appraise_last_30d_mid_count,
appraise_last_30d_bad_count,
appraise_last_30d_default_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dwt_user_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_user_action_daycount
where dt='$do_date'
)1d_ago
on old.user_id=1d_ago.user_id
left join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_user_action_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.user_id=7d_ago.user_id
left join
(
select
user_id,
login_count,
cart_count,
favor_count,
order_count,
order_activity_count,
order_activity_reduce_amount,
order_coupon_count,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
coupon_get_count,
coupon_using_count,
coupon_used_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_user_action_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.user_id=30d_ago.user_id;
alter table ${APP}.dwt_user_topic drop partition(dt='$clear_date');
"
dwt_sku_topic="
insert overwrite table ${APP}.dwt_sku_topic partition(dt='$do_date')
select
nvl(1d_ago.sku_id,old.sku_id),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_num,0),
nvl(1d_ago.order_activity_count,0),
nvl(1d_ago.order_coupon_count,0),
nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_num,0)+nvl(1d_ago.order_num,0)- nvl(7d_ago.order_num,0),
nvl(old.order_activity_last_7d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(7d_ago.order_activity_count,0),
nvl(old.order_coupon_last_7d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(7d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(7d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_last_7d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(7d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_num,0)+nvl(1d_ago.order_num,0)- nvl(30d_ago.order_num,0),
nvl(old.order_activity_last_30d_count,0)+nvl(1d_ago.order_activity_count,0)- nvl(30d_ago.order_activity_count,0),
nvl(old.order_coupon_last_30d_count,0)+nvl(1d_ago.order_coupon_count,0)- nvl(30d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0)- nvl(30d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_last_30d_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0)- nvl(30d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_num,0)+nvl(1d_ago.order_num,0),
nvl(old.order_activity_count,0)+nvl(1d_ago.order_activity_count,0),
nvl(old.order_coupon_count,0)+nvl(1d_ago.order_coupon_count,0),
nvl(old.order_activity_reduce_amount,0.0)+nvl(1d_ago.order_activity_reduce_amount,0.0),
nvl(old.order_coupon_reduce_amount,0.0)+nvl(1d_ago.order_coupon_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_num,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_num,0)+nvl(1d_ago.payment_num,0)- nvl(7d_ago.payment_num,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_num,0)+nvl(1d_ago.payment_num,0)- nvl(30d_ago.payment_num,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_num,0)+nvl(1d_ago.payment_num,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_num,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_num,0)+nvl(1d_ago.refund_order_num,0)- nvl(7d_ago.refund_order_num,0),
nvl(old.refund_order_last_7d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_num,0)+nvl(1d_ago.refund_order_num,0)- nvl(30d_ago.refund_order_num,0),
nvl(old.refund_order_last_30d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_num,0)+nvl(1d_ago.refund_order_num,0),
nvl(old.refund_order_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_num,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(7d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_7d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_num,0)+nvl(1d_ago.refund_payment_num,0)- nvl(30d_ago.refund_payment_num,0),
nvl(old.refund_payment_last_30d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_num,0)+nvl(1d_ago.refund_payment_num,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0),
nvl(1d_ago.cart_count,0),
nvl(old.cart_last_7d_count,0)+nvl(1d_ago.cart_count,0)- nvl(7d_ago.cart_count,0),
nvl(old.cart_last_30d_count,0)+nvl(1d_ago.cart_count,0)- nvl(30d_ago.cart_count,0),
nvl(old.cart_count,0)+nvl(1d_ago.cart_count,0),
nvl(1d_ago.favor_count,0),
nvl(old.favor_last_7d_count,0)+nvl(1d_ago.favor_count,0)- nvl(7d_ago.favor_count,0),
nvl(old.favor_last_30d_count,0)+nvl(1d_ago.favor_count,0)- nvl(30d_ago.favor_count,0),
nvl(old.favor_count,0)+nvl(1d_ago.favor_count,0),
nvl(1d_ago.appraise_good_count,0),
nvl(1d_ago.appraise_mid_count,0),
nvl(1d_ago.appraise_bad_count,0),
nvl(1d_ago.appraise_default_count,0),
nvl(old.appraise_last_7d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(7d_ago.appraise_good_count,0),
nvl(old.appraise_last_7d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)- nvl(7d_ago.appraise_mid_count,0),
nvl(old.appraise_last_7d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)- nvl(7d_ago.appraise_bad_count,0),
nvl(old.appraise_last_7d_default_count,0)+nvl(1d_ago.appraise_default_count,0)- nvl(7d_ago.appraise_default_count,0),
nvl(old.appraise_last_30d_good_count,0)+nvl(1d_ago.appraise_good_count,0)- nvl(30d_ago.appraise_good_count,0),
nvl(old.appraise_last_30d_mid_count,0)+nvl(1d_ago.appraise_mid_count,0)- nvl(30d_ago.appraise_mid_count,0),
nvl(old.appraise_last_30d_bad_count,0)+nvl(1d_ago.appraise_bad_count,0)- nvl(30d_ago.appraise_bad_count,0),
nvl(old.appraise_last_30d_default_count,0)+nvl(1d_ago.appraise_default_count,0)- nvl(30d_ago.appraise_default_count,0),
nvl(old.appraise_good_count,0)+nvl(1d_ago.appraise_good_count,0),
nvl(old.appraise_mid_count,0)+nvl(1d_ago.appraise_mid_count,0),
nvl(old.appraise_bad_count,0)+nvl(1d_ago.appraise_bad_count,0),
nvl(old.appraise_default_count,0)+nvl(1d_ago.appraise_default_count,0)
from
(
select
sku_id,
order_last_1d_count,
order_last_1d_num,
order_activity_last_1d_count,
order_coupon_last_1d_count,
order_activity_reduce_last_1d_amount,
order_coupon_reduce_last_1d_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_num,
order_activity_last_7d_count,
order_coupon_last_7d_count,
order_activity_reduce_last_7d_amount,
order_coupon_reduce_last_7d_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_num,
order_activity_last_30d_count,
order_coupon_last_30d_count,
order_activity_reduce_last_30d_amount,
order_coupon_reduce_last_30d_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_num,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_num,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_num,
payment_last_30d_amount,
payment_count,
payment_num,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_num,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_num,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_num,
refund_order_last_30d_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_num,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_num,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_num,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_last_1d_count,
cart_last_7d_count,
cart_last_30d_count,
cart_count,
favor_last_1d_count,
favor_last_7d_count,
favor_last_30d_count,
favor_count,
appraise_last_1d_good_count,
appraise_last_1d_mid_count,
appraise_last_1d_bad_count,
appraise_last_1d_default_count,
appraise_last_7d_good_count,
appraise_last_7d_mid_count,
appraise_last_7d_bad_count,
appraise_last_7d_default_count,
appraise_last_30d_good_count,
appraise_last_30d_mid_count,
appraise_last_30d_bad_count,
appraise_last_30d_default_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dwt_sku_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_sku_action_daycount
where dt='$do_date'
)1d_ago
on old.sku_id=1d_ago.sku_id
left join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_sku_action_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.sku_id=7d_ago.sku_id
left join
(
select
sku_id,
order_count,
order_num,
order_activity_count,
order_coupon_count,
order_activity_reduce_amount,
order_coupon_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_num,
payment_amount,
refund_order_count,
refund_order_num,
refund_order_amount,
refund_payment_count,
refund_payment_num,
refund_payment_amount,
cart_count,
favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from ${APP}.dws_sku_action_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.sku_id=30d_ago.sku_id;
alter table ${APP}.dwt_sku_topic drop partition(dt='$clear_date');
"
dwt_activity_topic="
insert overwrite table ${APP}.dwt_activity_topic partition(dt='$do_date')
select
nvl(1d_ago.activity_rule_id,old.activity_rule_id),
nvl(1d_ago.activity_id,old.activity_id),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_reduce_amount,0.0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0)
from
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from ${APP}.dwt_activity_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
activity_rule_id,
activity_id,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount
from ${APP}.dws_activity_info_daycount
where dt='$do_date'
)1d_ago
on old.activity_rule_id=1d_ago.activity_rule_id;
alter table ${APP}.dwt_activity_topic drop partition(dt='$clear_date');
"
dwt_coupon_topic="
insert overwrite table ${APP}.dwt_coupon_topic partition(dt='$do_date')
select
nvl(1d_ago.coupon_id,old.coupon_id),
nvl(1d_ago.get_count,0),
nvl(old.get_last_7d_count,0)+nvl(1d_ago.get_count,0)- nvl(7d_ago.get_count,0),
nvl(old.get_last_30d_count,0)+nvl(1d_ago.get_count,0)- nvl(30d_ago.get_count,0),
nvl(old.get_count,0)+nvl(1d_ago.get_count,0),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_reduce_amount,0.0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0)- nvl(7d_ago.order_reduce_amount,0.0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0)- nvl(30d_ago.order_reduce_amount,0.0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_reduce_amount,0.0)+nvl(1d_ago.order_reduce_amount,0.0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_reduce_amount,0.0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0)- nvl(7d_ago.payment_reduce_amount,0.0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0)- nvl(30d_ago.payment_reduce_amount,0.0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_reduce_amount,0.0)+nvl(1d_ago.payment_reduce_amount,0.0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.expire_count,0),
nvl(old.expire_last_7d_count,0)+nvl(1d_ago.expire_count,0)- nvl(7d_ago.expire_count,0),
nvl(old.expire_last_30d_count,0)+nvl(1d_ago.expire_count,0)- nvl(30d_ago.expire_count,0),
nvl(old.expire_count,0)+nvl(1d_ago.expire_count,0)
from
(
select
coupon_id,
get_last_1d_count,
get_last_7d_count,
get_last_30d_count,
get_count,
order_last_1d_count,
order_last_1d_reduce_amount,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_reduce_amount,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_reduce_amount,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_reduce_amount,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_reduce_amount,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_reduce_amount,
payment_last_30d_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_last_1d_count,
expire_last_7d_count,
expire_last_30d_count,
expire_count
from ${APP}.dwt_coupon_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from ${APP}.dws_coupon_info_daycount
where dt='$do_date'
)1d_ago
on old.coupon_id=1d_ago.coupon_id
left join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from ${APP}.dws_coupon_info_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.coupon_id=7d_ago.coupon_id
left join
(
select
coupon_id,
get_count,
order_count,
order_reduce_amount,
order_original_amount,
order_final_amount,
payment_count,
payment_reduce_amount,
payment_amount,
expire_count
from ${APP}.dws_coupon_info_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.coupon_id=30d_ago.coupon_id;
alter table ${APP}.dwt_coupon_topic drop partition(dt='$clear_date');
"
dwt_area_topic="
insert overwrite table ${APP}.dwt_area_topic partition(dt='$do_date')
select
nvl(old.province_id, 1d_ago.province_id),
nvl(1d_ago.visit_count,0),
nvl(1d_ago.login_count,0),
nvl(old.visit_last_7d_count,0)+nvl(1d_ago.visit_count,0)- nvl(7d_ago.visit_count,0),
nvl(old.login_last_7d_count,0)+nvl(1d_ago.login_count,0)- nvl(7d_ago.login_count,0),
nvl(old.visit_last_30d_count,0)+nvl(1d_ago.visit_count,0)- nvl(30d_ago.visit_count,0),
nvl(old.login_last_30d_count,0)+nvl(1d_ago.login_count,0)- nvl(30d_ago.login_count,0),
nvl(old.visit_count,0)+nvl(1d_ago.visit_count,0),
nvl(old.login_count,0)+nvl(1d_ago.login_count,0),
nvl(1d_ago.order_count,0),
nvl(1d_ago.order_original_amount,0.0),
nvl(1d_ago.order_final_amount,0.0),
nvl(old.order_last_7d_count,0)+nvl(1d_ago.order_count,0)- nvl(7d_ago.order_count,0),
nvl(old.order_last_7d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(7d_ago.order_original_amount,0.0),
nvl(old.order_last_7d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(7d_ago.order_final_amount,0.0),
nvl(old.order_last_30d_count,0)+nvl(1d_ago.order_count,0)- nvl(30d_ago.order_count,0),
nvl(old.order_last_30d_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0)- nvl(30d_ago.order_original_amount,0.0),
nvl(old.order_last_30d_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0)- nvl(30d_ago.order_final_amount,0.0),
nvl(old.order_count,0)+nvl(1d_ago.order_count,0),
nvl(old.order_original_amount,0.0)+nvl(1d_ago.order_original_amount,0.0),
nvl(old.order_final_amount,0.0)+nvl(1d_ago.order_final_amount,0.0),
nvl(1d_ago.payment_count,0),
nvl(1d_ago.payment_amount,0.0),
nvl(old.payment_last_7d_count,0)+nvl(1d_ago.payment_count,0)- nvl(7d_ago.payment_count,0),
nvl(old.payment_last_7d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(7d_ago.payment_amount,0.0),
nvl(old.payment_last_30d_count,0)+nvl(1d_ago.payment_count,0)- nvl(30d_ago.payment_count,0),
nvl(old.payment_last_30d_amount,0.0)+nvl(1d_ago.payment_amount,0.0)- nvl(30d_ago.payment_amount,0.0),
nvl(old.payment_count,0)+nvl(1d_ago.payment_count,0),
nvl(old.payment_amount,0.0)+nvl(1d_ago.payment_amount,0.0),
nvl(1d_ago.refund_order_count,0),
nvl(1d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_7d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(7d_ago.refund_order_count,0),
nvl(old.refund_order_last_7d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(7d_ago.refund_order_amount,0.0),
nvl(old.refund_order_last_30d_count,0)+nvl(1d_ago.refund_order_count,0)- nvl(30d_ago.refund_order_count,0),
nvl(old.refund_order_last_30d_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0)- nvl(30d_ago.refund_order_amount,0.0),
nvl(old.refund_order_count,0)+nvl(1d_ago.refund_order_count,0),
nvl(old.refund_order_amount,0.0)+nvl(1d_ago.refund_order_amount,0.0),
nvl(1d_ago.refund_payment_count,0),
nvl(1d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_7d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(7d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_7d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(7d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_last_30d_count,0)+nvl(1d_ago.refund_payment_count,0)- nvl(30d_ago.refund_payment_count,0),
nvl(old.refund_payment_last_30d_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)- nvl(30d_ago.refund_payment_amount,0.0),
nvl(old.refund_payment_count,0)+nvl(1d_ago.refund_payment_count,0),
nvl(old.refund_payment_amount,0.0)+nvl(1d_ago.refund_payment_amount,0.0)
from
(
select
province_id,
visit_last_1d_count,
login_last_1d_count,
visit_last_7d_count,
login_last_7d_count,
visit_last_30d_count,
login_last_30d_count,
visit_count,
login_count,
order_last_1d_count,
order_last_1d_original_amount,
order_last_1d_final_amount,
order_last_7d_count,
order_last_7d_original_amount,
order_last_7d_final_amount,
order_last_30d_count,
order_last_30d_original_amount,
order_last_30d_final_amount,
order_count,
order_original_amount,
order_final_amount,
payment_last_1d_count,
payment_last_1d_amount,
payment_last_7d_count,
payment_last_7d_amount,
payment_last_30d_count,
payment_last_30d_amount,
payment_count,
payment_amount,
refund_order_last_1d_count,
refund_order_last_1d_amount,
refund_order_last_7d_count,
refund_order_last_7d_amount,
refund_order_last_30d_count,
refund_order_last_30d_amount,
refund_order_count,
refund_order_amount,
refund_payment_last_1d_count,
refund_payment_last_1d_amount,
refund_payment_last_7d_count,
refund_payment_last_7d_amount,
refund_payment_last_30d_count,
refund_payment_last_30d_amount,
refund_payment_count,
refund_payment_amount
from ${APP}.dwt_area_topic
where dt=date_add('$do_date',-1)
)old
full outer join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from ${APP}.dws_area_stats_daycount
where dt='$do_date'
)1d_ago
on old.province_id=1d_ago.province_id
left join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from ${APP}.dws_area_stats_daycount
where dt=date_add('$do_date',-7)
)7d_ago
on old.province_id= 7d_ago.province_id
left join
(
select
province_id,
visit_count,
login_count,
order_count,
order_original_amount,
order_final_amount,
payment_count,
payment_amount,
refund_order_count,
refund_order_amount,
refund_payment_count,
refund_payment_amount
from ${APP}.dws_area_stats_daycount
where dt=date_add('$do_date',-30)
)30d_ago
on old.province_id= 30d_ago.province_id;
alter table ${APP}.dwt_area_topic drop partition(dt='$clear_date');
"
case $1 in
"dwt_visitor_topic" )
hive -e "$dwt_visitor_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_visitor_topic/dt=$clear_date
;;
"dwt_user_topic" )
hive -e "$dwt_user_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_user_topic/dt=$clear_date
;;
"dwt_sku_topic" )
hive -e "$dwt_sku_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_sku_topic/dt=$clear_date
;;
"dwt_activity_topic" )
hive -e "$dwt_activity_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_activity_topic/dt=$clear_date
;;
"dwt_coupon_topic" )
hive -e "$dwt_coupon_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_coupon_topic/dt=$clear_date
;;
"dwt_area_topic" )
hive -e "$dwt_area_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_area_topic/dt=$clear_date
;;
"all" )
hive -e "$dwt_visitor_topic$dwt_user_topic$dwt_sku_topic$dwt_activity_topic$dwt_coupon_topic$dwt_area_topic"
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_visitor_topic/dt=$clear_date
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_user_topic/dt=$clear_date
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_sku_topic/dt=$clear_date
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_activity_topic/dt=$clear_date
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_coupon_topic/dt=$clear_date
hadoop fs -rm -r -f /warehouse/gmall/dwt/dwt_area_topic/dt=$clear_date
;;
esac
(2) Grant execute permission to the script
[atguigu@hadoop102 bin]$ chmod 777 dws_to_dwt.sh
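2) Using the script
(1) Run the script. The first argument selects the table to load (or all), matching the case $1 dispatch above; the second argument is the business date.
[atguigu@hadoop102 bin]$ dws_to_dwt.sh all 2020-06-14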
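The case $1 dispatch at the end of dws_to_dwt.sh picks the target by the first argument, so a bare date in that position matches no branch. Below is a minimal sketch, not the original script, of how the two arguments can be interpreted. It assumes the date is passed as the second argument and that GNU date is available; the helper names resolve_date and is_known_target are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: use the given date, or fall back to yesterday (GNU date).
resolve_date() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        date -d "-1 day" +%F
    fi
}

# Hypothetical helper: succeed only if the argument names a branch of the case statement.
is_known_target() {
    case "$1" in
        dwt_visitor_topic|dwt_user_topic|dwt_sku_topic|dwt_activity_topic|dwt_coupon_topic|dwt_area_topic|all)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example: first argument is the target, second is the date.
if is_known_target all; then
    echo "target ok, date $(resolve_date 2020-06-14)"
fi
```

Checking the target before calling hive makes a mistyped invocation fail visibly instead of silently doing nothing.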
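(2) Check the imported data
Chapter 6 Building the Data Warehouse: ADS Layer
Notes on table creation
The ADS layer involves no modeling; tables are created according to the specific requirements.
6.1 Visitor Topic
6.1.1 Visitor Statistics
Source tables:
dwd_page_log: page log table
dwt_visitor_topic: device topic wide table
This requirement is a combined set of visitor statistics made up of several metrics, each explained below.
| Metric | Description | Field |
|---|---|---|
| Visitors | Number of distinct visitors | uv_count |
| Page dwell time | Total duration of all page view records, in seconds | duration_sec |
| Average page dwell time | Average dwell time per session, in seconds | avg_duration_sec |
| Total page views | Total number of page view records | page_count |
| Average page views | Average number of pages viewed per session | avg_page_count |
| Sessions | Total number of sessions | sv_count |
| Bounces | Number of sessions that viewed only one page | bounce_count |
| Bounce rate | Share of sessions that viewed only one page, as a percentage | bounce_rate |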
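As a small worked example of the session-level metrics, the sketch below takes one page count per session and derives the session total, page view total, bounce count, and bounce rate in percent. The function name session_stats and the sample numbers are made up for illustration; they are not taken from the warehouse.

```shell
#!/bin/bash
# Reads one page count per line (one line per session) and prints:
# sessions, total page views, bounces, bounce rate in percent.
session_stats() {
    awk '
        { sv++; pages += $1; if ($1 == 1) bounce++ }
        END {
            if (sv == 0) { print "0 0 0 0.00"; exit }
            printf "%d %d %d %.2f\n", sv, pages, bounce, bounce / sv * 100
        }'
}

# Four sessions with 1, 3, 1 and 5 page views: two of them are bounces.
printf '1\n3\n1\n5\n' | session_stats
```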
1. Table creation statement
DROP TABLE IF EXISTS ads_visit_stats;
CREATE EXTERNAL TABLE ads_visit_stats (
`dt` STRING COMMENT '統計日期',
`is_new` STRING COMMENT '新老標識,1:新,0:老',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`channel` STRING COMMENT '渠道',
`uv_count` BIGINT COMMENT '日活(訪問人數)',
`duration_sec` BIGINT COMMENT '頁面停留總時長',
`avg_duration_sec` BIGINT COMMENT '一次會話,頁面停留平均時長,單位為秒',
`page_count` BIGINT COMMENT '頁面總瀏覽數',
`avg_page_count` BIGINT COMMENT '一次會話,頁面平均瀏覽數',
`sv_count` BIGINT COMMENT '會話次數',
`bounce_count` BIGINT COMMENT '跳出數',
`bounce_rate` DECIMAL(16,2) COMMENT '跳出率'
) COMMENT '訪客統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_visit_stats/';
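2. Data loading
Approach: the key to this requirement is splitting page views into sessions. The overall implementation takes three steps:
Step 1: split all page view records into sessions.
Step 2: compute the browsing duration and the number of pages viewed for each session.
Step 3: compute the metrics listed above.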
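Step 1 can be illustrated outside Hive. The sketch below assumes rows of (mid_id, last_page_id, ts) already sorted by mid_id and ts, with the literal NULL standing for a missing last_page_id. Every row receives the timestamp of the most recent session-opening row of the same device, which is the effect of last_value(if(last_page_id is null,ts,null),true) in the loading query. The function name assign_sessions and the sample rows are hypothetical.

```shell
#!/bin/bash
# Input columns: mid_id last_page_id ts (sorted by mid_id, then ts).
# Output: one session_id per input row, built as mid_id-<ts of session start>.
assign_sessions() {
    awk '
        $1 != mid    { mid = $1; start = "" }  # new device: forget the previous session start
        $2 == "NULL" { start = $3 }            # a page without a referrer opens a new session
                     { print $1 "-" start }
    '
}

printf 'm1 NULL 100\nm1 home 110\nm1 NULL 200\nm2 NULL 150\nm2 home 160\n' | assign_sessions
```

Grouping the rows by the resulting session_id then gives the per-session page count and duration needed for step 2.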
insert overwrite table ads_visit_stats
select * from ads_visit_stats
union
select
'2020-06-14' dt,
is_new,
recent_days,
channel,
count(distinct(mid_id)) uv_count,
cast(sum(duration)/1000 as bigint) duration_sec,
cast(avg(duration)/1000 as bigint) avg_duration_sec,
sum(page_count) page_count,
cast(avg(page_count) as bigint) avg_page_count,
count(*) sv_count,
sum(if(page_count=1,1,0)) bounce_count,
cast(sum(if(page_count=1,1,0))/count(*)*100 as decimal(16,2)) bounce_rate
from
(
select
session_id,
mid_id,
is_new,
recent_days,
channel,
count(*) page_count,
sum(during_time) duration
from
(
select
mid_id,
channel,
recent_days,
is_new,
last_page_id,
page_id,
during_time,
concat(mid_id,'-',last_value(if(last_page_id is null,ts,null),true) over (partition by recent_days,mid_id order by ts)) session_id
from
(
select
mid_id,
channel,
last_page_id,
page_id,
during_time,
ts,
recent_days,
if(visit_date_first>=date_add('2020-06-14',-recent_days+1),'1','0') is_new
from
(
select
t1.mid_id,
t1.channel,
t1.last_page_id,
t1.page_id,
t1.during_time,
t1.dt,
t1.ts,
t2.visit_date_first
from
(
select
mid_id,
channel,
last_page_id,
page_id,
during_time,
dt,
ts
from dwd_page_log
where dt>=date_add('2020-06-14',-30)
)t1
left join
(
select
mid_id,
visit_date_first
from dwt_visitor_topic
where dt='2020-06-14'
)t2
on t1.mid_id=t2.mid_id
)t3 lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('2020-06-14',-recent_days+1)
)t4
)t5
group by session_id,mid_id,is_new,recent_days,channel
)t6
group by is_new,recent_days,channel;
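6.1.2 Path Analysis
User path analysis, as the name suggests, examines the paths users take through an app or website. Access paths are analyzed regularly to measure the effect of site optimization or marketing campaigns and to understand user behavior preferences.
User access paths are usually visualized with a Sankey diagram. As shown in the figure below, the diagram faithfully reproduces users' access paths, including page jumps and the order in which pages are visited.
A Sankey diagram requires the count of each kind of page jump. Each jump is described by a source/target pair, where source is the page the jump starts from and target is the page it lands on.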

1. Table creation statement
DROP TABLE IF EXISTS ads_page_path;
CREATE EXTERNAL TABLE ads_page_path
(
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`source` STRING COMMENT '跳轉起始頁面ID',
`target` STRING COMMENT '跳轉終到頁面ID',
`path_count` BIGINT COMMENT '跳轉次數'
) COMMENT '頁面瀏覽路徑'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_page_path/';
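2. Data loading
Approach: the requirement is to count each kind of jump, so in principle grouping by source/target and applying count() is enough. Two points need attention:
Point 1: in a Sankey diagram, source must not be null, but target may be null.
Point 2: the flow shown in a Sankey diagram must not contain cycles, which is why the query prefixes every page with its step number within the session.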
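The effect of the step prefix can be seen on a single session. The sketch below assumes the page ids of one session arrive one per line in visit order and prints each jump as a source/target pair, writing the literal NULL for the missing target of the last page. The function name session_edges and the sample pages are hypothetical.

```shell
#!/bin/bash
# Input: page ids of one session, one per line, ordered by time.
# Output: tab-separated "step-N:source" and "step-(N+1):target" pairs.
session_edges() {
    awk '
        NR > 1 { printf "step-%d:%s\tstep-%d:%s\n", NR - 1, prev, NR, $1 }
               { prev = $1 }
        END    { if (NR > 0) printf "step-%d:%s\tNULL\n", NR, prev }
    '
}

# home -> good_detail -> home: the second visit to home becomes step-3:home,
# a different node from step-1:home, so the graph has no cycle.
printf 'home\ngood_detail\nhome\n' | session_edges
```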
insert overwrite table ads_page_path
select * from ads_page_path
union
select
'2020-06-14',
recent_days,
source,
target,
count(*)
from
(
select
recent_days,
concat('step-',step,':',source) source,
concat('step-',step+1,':',target) target
from
(
select
recent_days,
page_id source,
-- window functions: next page in the session and position within the session
lead(page_id,1,null) over (partition by recent_days,session_id order by ts) target,
row_number() over (partition by recent_days,session_id order by ts) step
from
(
select
recent_days,
last_page_id,
page_id,
ts,
concat(mid_id,'-',last_value(if(last_page_id is null,ts,null),true) over (partition by mid_id,recent_days order by ts)) session_id
from dwd_page_log lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('2020-06-14',-30)
and dt>=date_add('2020-06-14',-recent_days+1)
)t2
)t3
)t4
group by recent_days,source,target;
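6.2 User Topic
6.2.1 User Statistics
This requirement is a combined set of user statistics made up of several metrics, each explained below.
| Metric | Description | Field |
|---|---|---|
| New users | Number of newly registered users | new_user_count |
| New ordering users | Number of users who placed their first order | new_order_user_count |
| Total order amount | Total amount of all orders | order_final_amount |
| Ordering users | Total number of users who placed an order | order_user_count |
| Non-ordering users | Number of users who were active but placed no order | no_order_user_count |
1. Table creation statement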
DROP TABLE IF EXISTS ads_user_total;
CREATE EXTERNAL TABLE `ads_user_total` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,0:累積值,1:最近1天,7:最近7天,30:最近30天',
`new_user_count` BIGINT COMMENT '新注冊用戶數',
`new_order_user_count` BIGINT COMMENT '新增下單用戶數',
`order_final_amount` DECIMAL(16,2) COMMENT '下單總金額',
`order_user_count` BIGINT COMMENT '下單用戶數',
`no_order_user_count` BIGINT COMMENT '未下單用戶數(具體指活躍用戶中未下單用戶)'
) COMMENT '用戶統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_user_total/';
2. Data loading
insert overwrite table ads_user_total
select * from ads_user_total
union
select
'2020-06-14',
recent_days,
sum(if(login_date_first>=recent_days_ago,1,0)) new_user_count,
sum(if(order_date_first>=recent_days_ago,1,0)) new_order_user_count,
sum(order_final_amount) order_final_amount,
sum(if(order_final_amount>0,1,0)) order_user_count,
sum(if(login_date_last>=recent_days_ago and order_final_amount=0,1,0)) no_order_user_count
from
(
select
recent_days,
user_id,
login_date_first,
login_date_last,
order_date_first,
case when recent_days=0 then order_final_amount
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_final_amount,
if(recent_days=0,'1970-01-01',date_add('2020-06-14',-recent_days+1)) recent_days_ago
from dwt_user_topic lateral view explode(Array(0,1,7,30)) tmp as recent_days
where dt='2020-06-14'
)t1
group by recent_days;
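6.2.2 User Change Statistics
This requirement covers two metrics, churned users and returning users, explained below.
| Metric | Description | Field |
|---|---|---|
| Churned users | Users who were active before but have not been active recently are called churned users. Here the count covers users who were active 7 days ago (on that day only) but have not been active in the last 7 days. | user_churn_count |
| Returning users | Users who were active before, then inactive for a while (churned), and are active again today are called returning users. Here the count covers all returning users. | user_back_count |
1. Table creation statement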
DROP TABLE IF EXISTS ads_user_change;
CREATE EXTERNAL TABLE `ads_user_change` (
`dt` STRING COMMENT '統計日期',
`user_churn_count` BIGINT COMMENT '流失用戶數',
`user_back_count` BIGINT COMMENT '回流用戶數'
) COMMENT '用戶變動統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_user_change/';
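2. Data loading
Approach:
- Churned users: users whose last active date is exactly 7 days ago.
- Returning users: users whose last active date is today and whose previous active date was at least 8 days earlier.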
insert overwrite table ads_user_change
select * from ads_user_change
union
select
churn.dt,
user_churn_count,
user_back_count
from
(
select
'2020-06-14' dt,
count(*) user_churn_count
from dwt_user_topic
where dt='2020-06-14'
and login_date_last=date_add('2020-06-14',-7)
)churn
join
(
select
'2020-06-14' dt,
count(*) user_back_count
from
(
select
user_id,
login_date_last
from dwt_user_topic
where dt='2020-06-14'
and login_date_last='2020-06-14'
)t1
join
(
select
user_id,
login_date_last login_date_previous
from dwt_user_topic
where dt=date_add('2020-06-14',-1)
)t2
on t1.user_id=t2.user_id
where datediff(login_date_last,login_date_previous)>=8
)back
on churn.dt=back.dt;
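6.2.3 User Behavior Funnel Analysis
Funnel analysis is a data analysis model that shows how users convert at each stage of a business process from start to finish. Because every stage is laid out, it is easy to see at a glance which stage has a problem.
User behavior funnel analysis is also described in terms of conversion rates; which conversion rate to compute depends on the requirement. For example, the paying-user conversion rate is the share of a day's active users who end up placing an order: paying-user conversion rate = daily paying users / daily active users.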
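The conversion between consecutive funnel stages follows the same formula. The sketch below takes the user counts of the stages in order and prints each stage-to-stage conversion rate in percent. The function name funnel_rates and the counts are invented for illustration only.

```shell
#!/bin/bash
# Arguments: user counts of consecutive funnel stages.
# Output: conversion rate in percent from each stage to the next, one per line.
funnel_rates() {
    prev=""
    for n in "$@"; do
        if [ -n "$prev" ]; then
            awk -v a="$n" -v b="$prev" \
                'BEGIN { if (b == 0) print "NA"; else printf "%.2f\n", a / b * 100 }'
        fi
        prev=$n
    done
}

# Made-up counts for home, good_detail, cart, order, payment.
funnel_rates 1000 600 300 150 120
```

A sharp drop between two adjacent stages points at the step of the flow that deserves attention.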

This requirement asks for the number of users at each stage of a complete shopping flow.
1. Table creation statement
DROP TABLE IF EXISTS ads_user_action;
CREATE EXTERNAL TABLE `ads_user_action` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`home_count` BIGINT COMMENT '瀏覽首頁人數',
`good_detail_count` BIGINT COMMENT '瀏覽商品詳情頁人數',
`cart_count` BIGINT COMMENT '加入購物車人數',
`order_count` BIGINT COMMENT '下單人數',
`payment_count` BIGINT COMMENT '支付人數'
) COMMENT '漏斗分析'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_user_action/';
2. Data loading
with
tmp_page as
(
select
'2020-06-14' dt,
recent_days,
sum(if(array_contains(pages,'home'),1,0)) home_count,
sum(if(array_contains(pages,'good_detail'),1,0)) good_detail_count
from
(
select
recent_days,
mid_id,
collect_set(page_id) pages
from
(
select
dt,
mid_id,
page.page_id
from dws_visitor_action_daycount lateral view explode(page_stats) tmp as page
where dt>=date_add('2020-06-14',-29)
and page.page_id in('home','good_detail')
)t1 lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('2020-06-14',-recent_days+1)
group by recent_days,mid_id
)t2
group by recent_days
),
tmp_cop as
(
select
'2020-06-14' dt,
recent_days,
sum(if(cart_count>0,1,0)) cart_count,
sum(if(order_count>0,1,0)) order_count,
sum(if(payment_count>0,1,0)) payment_count
from
(
select
recent_days,
user_id,
case
when recent_days=1 then cart_last_1d_count
when recent_days=7 then cart_last_7d_count
when recent_days=30 then cart_last_30d_count
end cart_count,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then payment_last_1d_count
when recent_days=7 then payment_last_7d_count
when recent_days=30 then payment_last_30d_count
end payment_count
from dwt_user_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='2020-06-14'
)t1
group by recent_days
)
insert overwrite table ads_user_action
select * from ads_user_action
union
select
tmp_page.dt,
tmp_page.recent_days,
home_count,
good_detail_count,
cart_count,
order_count,
payment_count
from tmp_page
join tmp_cop
on tmp_page.recent_days=tmp_cop.recent_days;
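6.2.4 User Retention Rate
Retention analysis usually covers new-user retention and active-user retention.
New-user retention looks at how many of the users added on a given day are active later. Active-user retention looks at how many of the users active on a given day are active later.
Retention is an important indicator of how much value a product delivers to its users.
Here the requirement is the new-user retention rate, i.e. the ratio of retained users to new users. For example, if 100 users were added on 2020-06-14 and 80 of them were active one day later (2020-06-15), then the 1-day retention count for 2020-06-14 is 80 and the 1-day retention rate is 80%.
The 1-day through 7-day retention rates are required for each day, as shown in the figure below.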
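The per-cohort calculation can be sketched on a few rows. The example below assumes one line per user with the first and last login dates, groups users by first login date, and counts as retained those whose last login equals the statistics date. It leaves out the 7-day lower bound that the loading query applies; the function name retention_by_cohort and the sample rows are hypothetical.

```shell
#!/bin/bash
# Argument: statistics date (yyyy-MM-dd).
# Input columns: login_date_first login_date_last, one user per line.
# Output: create_date retention_count new_user_count retention_rate(%).
retention_by_cohort() {
    awk -v d="$1" '
        $1 < d { total[$1]++; if ($2 == d) kept[$1]++ }
        END {
            for (c in total)
                printf "%s %d %d %.2f\n", c, kept[c], total[c], kept[c] / total[c] * 100
        }
    ' | sort
}

printf '2020-06-13 2020-06-14\n2020-06-13 2020-06-13\n2020-06-12 2020-06-14\n2020-06-14 2020-06-14\n' |
    retention_by_cohort 2020-06-14
```

Users first seen on the statistics date itself are skipped, because they have had no later day on which to return.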

1. Table creation statement
DROP TABLE IF EXISTS ads_user_retention;
CREATE EXTERNAL TABLE ads_user_retention (
`dt` STRING COMMENT '統計日期',
`create_date` STRING COMMENT '用戶新增日期',
`retention_day` BIGINT COMMENT '截至當前日期留存天數',
`retention_count` BIGINT COMMENT '留存用戶數量',
`new_user_count` BIGINT COMMENT '新增用戶數量',
`retention_rate` DECIMAL(16,2) COMMENT '留存率'
) COMMENT '用戶留存率'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_user_retention/';
2. Data loading
insert overwrite table ads_user_retention
select * from ads_user_retention
union
select
'2020-06-14', -- statistics date
login_date_first create_date, -- date the user was added
datediff('2020-06-14',login_date_first) retention_day,
sum(if(login_date_last='2020-06-14',1,0)) retention_count,
count(*) new_user_count,
cast(sum(if(login_date_last='2020-06-14',1,0))/count(*)*100 as decimal(16,2)) retention_rate
from dwt_user_topic
where dt='2020-06-14'
and login_date_first>=date_add('2020-06-14',-7)
and login_date_first<'2020-06-14'
group by login_date_first;
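6.3 Product Topic
6.3.1 Product Statistics
This metric is a combined product statistic covering, for each SPU, the total number of times it was ordered and the total ordered amount.
1. Table creation statement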
DROP TABLE IF EXISTS ads_order_spu_stats;
CREATE EXTERNAL TABLE `ads_order_spu_stats` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`spu_id` STRING COMMENT '商品ID',
`spu_name` STRING COMMENT '商品名稱',
`tm_id` STRING COMMENT '品牌ID',
`tm_name` STRING COMMENT '品牌名稱',
`category3_id` STRING COMMENT '三級品類ID',
`category3_name` STRING COMMENT '三級品類名稱',
`category2_id` STRING COMMENT '二級品類ID',
`category2_name` STRING COMMENT '二級品類名稱',
`category1_id` STRING COMMENT '一級品類ID',
`category1_name` STRING COMMENT '一級品類名稱',
`order_count` BIGINT COMMENT '訂單數',
`order_amount` DECIMAL(16,2) COMMENT '訂單金額'
) COMMENT '商品銷售統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_order_spu_stats/';
2. Data loading
insert overwrite table ads_order_spu_stats
select * from ads_order_spu_stats
union
select
'2020-06-14' dt,
recent_days,
spu_id,
spu_name,
tm_id,
tm_name,
category3_id,
category3_name,
category2_id,
category2_name,
category1_id,
category1_name,
sum(order_count),
sum(order_amount)
from
(
select
recent_days,
sku_id,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_amount
from dwt_sku_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='2020-06-14'
)t1
left join
(
select
id,
spu_id,
spu_name,
tm_id,
tm_name,
category3_id,
category3_name,
category2_id,
category2_name,
category1_id,
category1_name
from dim_sku_info
where dt='2020-06-14'
)t2
on t1.sku_id=t2.id
group by recent_days,spu_id,spu_name,tm_id,tm_name,category3_id,category3_name,category2_id,category2_name,category1_id,category1_name;
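6.3.2 Brand Repurchase Rate
The brand repurchase rate is the ratio of the number of users who bought a brand repeatedly within a period to the number of users who bought that brand at all. Repeated purchase means at least 2 purchases; having bought means at least 1 purchase.
The requirement is the repurchase rate of each brand over the last 1, 7, and 30 days.
1. Table creation statement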
DROP TABLE IF EXISTS ads_repeat_purchase;
CREATE EXTERNAL TABLE `ads_repeat_purchase` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`tm_id` STRING COMMENT '品牌ID',
`tm_name` STRING COMMENT '品牌名稱',
`order_repeat_rate` DECIMAL(16,2) COMMENT '復購率'
) COMMENT '品牌復購率'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_repeat_purchase/';
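2. Data loading
Approach: the requirement can be implemented in two steps:
Step 1: count how many times each user bought each brand.
Step 2: count the users with at least 1 purchase and the users with at least 2 purchases.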
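Step 2 can be checked by hand on a few rows. The sketch below assumes the output of step 1 as lines of (user_id, brand, purchase count) and prints, per brand, the share of buyers with at least 2 purchases in percent. The function name repeat_rate and the sample rows are invented for illustration.

```shell
#!/bin/bash
# Input columns: user_id brand order_count (one line per user and brand).
# Output: brand and repurchase rate in percent, sorted by brand.
repeat_rate() {
    awk '
        $3 >= 1 { buyers[$2]++ }
        $3 >= 2 { repeaters[$2]++ }
        END {
            for (b in buyers)
                printf "%s %.2f\n", b, repeaters[b] / buyers[b] * 100
        }
    ' | sort
}

# apple: 3 buyers, 2 of them repeat; mi: 1 buyer, no repeat.
printf 'u1 apple 3\nu2 apple 1\nu3 apple 2\nu1 mi 1\n' | repeat_rate
```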
insert overwrite table ads_repeat_purchase
select * from ads_repeat_purchase
union
select
'2020-06-14' dt,
recent_days,
tm_id,
tm_name,
cast(sum(if(order_count>=2,1,0))/sum(if(order_count>=1,1,0))*100 as decimal(16,2))
from
(
select
recent_days,
user_id,
tm_id,
tm_name,
sum(order_count) order_count
from
(
select
recent_days,
user_id,
sku_id,
count(*) order_count
from dwd_order_detail lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('2020-06-14',-29)
and dt>=date_add('2020-06-14',-recent_days+1)
group by recent_days, user_id,sku_id
)t1
left join
(
select
id,
tm_id,
tm_name
from dim_sku_info
where dt='2020-06-14'
)t2
on t1.sku_id=t2.id
group by recent_days,user_id,tm_id,tm_name
)t3
group by recent_days,tm_id,tm_name;
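6.4 Order Topic
6.4.1 Order Statistics
This requirement covers the total number of orders, the total order amount, and the total number of ordering users.
1. Table creation statement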
DROP TABLE IF EXISTS ads_order_total;
CREATE EXTERNAL TABLE `ads_order_total` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`order_count` BIGINT COMMENT '訂單數',
`order_amount` DECIMAL(16,2) COMMENT '訂單金額',
`order_user_count` BIGINT COMMENT '下單人數'
) COMMENT '訂單統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_order_total/';
2. Data loading
insert overwrite table ads_order_total
select * from ads_order_total
union
select
'2020-06-14',
recent_days,
sum(order_count),
sum(order_final_amount) order_final_amount,
sum(if(order_final_amount>0,1,0)) order_user_count
from
(
select
recent_days,
user_id,
case when recent_days=0 then order_count
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case when recent_days=0 then order_final_amount
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_final_amount
from dwt_user_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='2020-06-14'
)t1
group by recent_days;
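6.4.2 Order Statistics by Region
This requirement covers the total number of orders and the total order amount for each province.
1. Table creation statement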
DROP TABLE IF EXISTS ads_order_by_province;
CREATE EXTERNAL TABLE `ads_order_by_province` (
`dt` STRING COMMENT '統計日期',
`recent_days` BIGINT COMMENT '最近天數,1:最近1天,7:最近7天,30:最近30天',
`province_id` STRING COMMENT '省份ID',
`province_name` STRING COMMENT '省份名稱',
`area_code` STRING COMMENT '地區編碼',
`iso_code` STRING COMMENT '國際標准地區編碼',
`iso_code_3166_2` STRING COMMENT '國際標准地區編碼',
`order_count` BIGINT COMMENT '訂單數',
`order_amount` DECIMAL(16,2) COMMENT '訂單金額'
) COMMENT '各地區訂單統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_order_by_province/';
2. Data loading
insert overwrite table ads_order_by_province
select * from ads_order_by_province
union
select
dt,
recent_days,
province_id,
province_name,
area_code,
iso_code,
iso_3166_2,
order_count,
order_amount
from
(
select
'2020-06-14' dt,
recent_days,
province_id,
sum(order_count) order_count,
sum(order_amount) order_amount
from
(
select
recent_days,
province_id,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_amount
from dwt_area_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='2020-06-14'
)t1
group by recent_days,province_id
)t2
join dim_base_province t3
on t2.province_id=t3.id;
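6.5 Coupon Topic
6.5.1 Coupon Statistics
This requirement asks for the usage and the subsidy rate of all coupons released in the last 30 days. The subsidy rate is the ratio of the discount amount to the original amount of the orders that used the coupon.
1. Table creation statement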
DROP TABLE IF EXISTS ads_coupon_stats;
CREATE EXTERNAL TABLE ads_coupon_stats (
`dt` STRING COMMENT '統計日期',
`coupon_id` STRING COMMENT '優惠券ID',
`coupon_name` STRING COMMENT '優惠券名稱',
`start_date` STRING COMMENT '發布日期',
`rule_name` STRING COMMENT '優惠規則,例如滿100元減10元',
`get_count` BIGINT COMMENT '領取次數',
`order_count` BIGINT COMMENT '使用(下單)次數',
`expire_count` BIGINT COMMENT '過期次數',
`order_original_amount` DECIMAL(16,2) COMMENT '使用優惠券訂單原始金額',
`order_final_amount` DECIMAL(16,2) COMMENT '使用優惠券訂單最終金額',
`reduce_amount` DECIMAL(16,2) COMMENT '優惠金額',
`reduce_rate` DECIMAL(16,2) COMMENT '補貼率'
) COMMENT '優惠券統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_coupon_stats/';
2. Data loading
insert overwrite table ads_coupon_stats
select * from ads_coupon_stats
union
select
'2020-06-14' dt,
t1.id,
coupon_name,
start_date,
rule_name,
get_count,
order_count,
expire_count,
order_original_amount,
order_final_amount,
reduce_amount,
reduce_rate
from
(
select
id,
coupon_name,
date_format(start_time,'yyyy-MM-dd') start_date,
case
when coupon_type='3201' then concat('滿',condition_amount,'元減',benefit_amount,'元')
when coupon_type='3202' then concat('滿',condition_num,'件打', (1-benefit_discount)*10,'折')
when coupon_type='3203' then concat('減',benefit_amount,'元')
end rule_name
from dim_coupon_info
where dt='2020-06-14'
and date_format(start_time,'yyyy-MM-dd')>=date_add('2020-06-14',-29)
)t1
left join
(
select
coupon_id,
get_count,
order_count,
expire_count,
order_original_amount,
order_final_amount,
order_reduce_amount reduce_amount,
cast(order_reduce_amount/order_original_amount*100 as decimal(16,2)) reduce_rate
from dwt_coupon_topic
where dt='2020-06-14'
)t2
on t1.id=t2.coupon_id;
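6.6 Activity Topic
6.6.1 Activity Statistics
This requirement asks for the participation and the subsidy rate of all activities released in the last 30 days. The subsidy rate is the ratio of the discount amount to the original amount of the orders that took part in the activity.
1. Table creation statement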
DROP TABLE IF EXISTS ads_activity_stats;
CREATE EXTERNAL TABLE `ads_activity_stats` (
`dt` STRING COMMENT '統計日期',
`activity_id` STRING COMMENT '活動ID',
`activity_name` STRING COMMENT '活動名稱',
`start_date` STRING COMMENT '活動開始日期',
`order_count` BIGINT COMMENT '參與活動訂單數',
`order_original_amount` DECIMAL(16,2) COMMENT '參與活動訂單原始金額',
`order_final_amount` DECIMAL(16,2) COMMENT '參與活動訂單最終金額',
`reduce_amount` DECIMAL(16,2) COMMENT '優惠金額',
`reduce_rate` DECIMAL(16,2) COMMENT '補貼率'
) COMMENT '活動統計'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/warehouse/gmall/ads/ads_activity_stats/';
2. Data loading
insert overwrite table ads_activity_stats
select * from ads_activity_stats
union
select
'2020-06-14' dt,
t4.activity_id,
activity_name,
start_date,
order_count,
order_original_amount,
order_final_amount,
reduce_amount,
reduce_rate
from
(
select
activity_id,
activity_name,
date_format(start_time,'yyyy-MM-dd') start_date
from dim_activity_rule_info
where dt='2020-06-14'
and date_format(start_time,'yyyy-MM-dd')>=date_add('2020-06-14',-29)
group by activity_id,activity_name,start_time
)t4
left join
(
select
activity_id,
sum(order_count) order_count,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(order_reduce_amount) reduce_amount,
cast(sum(order_reduce_amount)/sum(order_original_amount)*100 as decimal(16,2)) reduce_rate
from dwt_activity_topic
where dt='2020-06-14'
group by activity_id
)t5
on t4.activity_id=t5.activity_id;
6.7 ADS Layer Business Data Import Script
1) Write the script
(1) Create the script dwt_to_ads.sh in the /home/atguigu/bin directory
[atguigu@hadoop102 bin]$ vim dwt_to_ads.sh
Put the following content in the script
#!/bin/bash
APP=gmall
# Use the date given as the second argument if present; otherwise default to the day before today
if [ -n "$2" ] ;then
do_date=$2
else
do_date=`date -d "-1 day" +%F`
fi
ads_activity_stats="
insert overwrite table ${APP}.ads_activity_stats
select * from ${APP}.ads_activity_stats
union
select
'$do_date' dt,
t4.activity_id,
activity_name,
start_date,
order_count,
order_original_amount,
order_final_amount,
reduce_amount,
reduce_rate
from
(
select
activity_id,
activity_name,
date_format(start_time,'yyyy-MM-dd') start_date
from ${APP}.dim_activity_rule_info
where dt='$do_date'
and date_format(start_time,'yyyy-MM-dd')>=date_add('$do_date',-29)
group by activity_id,activity_name,start_time
)t4
left join
(
select
activity_id,
sum(order_count) order_count,
sum(order_original_amount) order_original_amount,
sum(order_final_amount) order_final_amount,
sum(order_reduce_amount) reduce_amount,
cast(sum(order_reduce_amount)/sum(order_original_amount)*100 as decimal(16,2)) reduce_rate
from ${APP}.dwt_activity_topic
where dt='$do_date'
group by activity_id
)t5
on t4.activity_id=t5.activity_id;
"
ads_coupon_stats="
insert overwrite table ${APP}.ads_coupon_stats
select * from ${APP}.ads_coupon_stats
union
select
'$do_date' dt,
t1.id,
coupon_name,
start_date,
rule_name,
get_count,
order_count,
expire_count,
order_original_amount,
order_final_amount,
reduce_amount,
reduce_rate
from
(
select
id,
coupon_name,
date_format(start_time,'yyyy-MM-dd') start_date,
case
when coupon_type='3201' then concat('滿',condition_amount,'元減',benefit_amount,'元')
when coupon_type='3202' then concat('滿',condition_num,'件打', (1-benefit_discount)*10,'折')
when coupon_type='3203' then concat('減',benefit_amount,'元')
end rule_name
from ${APP}.dim_coupon_info
where dt='$do_date'
and date_format(start_time,'yyyy-MM-dd')>=date_add('$do_date',-29)
)t1
left join
(
select
coupon_id,
get_count,
order_count,
expire_count,
order_original_amount,
order_final_amount,
order_reduce_amount reduce_amount,
cast(order_reduce_amount/order_original_amount*100 as decimal(16,2)) reduce_rate
from ${APP}.dwt_coupon_topic
where dt='$do_date'
)t2
on t1.id=t2.coupon_id;
"
ads_order_by_province="
insert overwrite table ${APP}.ads_order_by_province
select * from ${APP}.ads_order_by_province
union
select
dt,
recent_days,
province_id,
province_name,
area_code,
iso_code,
iso_3166_2,
order_count,
order_amount
from
(
select
'$do_date' dt,
recent_days,
province_id,
sum(order_count) order_count,
sum(order_amount) order_amount
from
(
select
recent_days,
province_id,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_amount
from ${APP}.dwt_area_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='$do_date'
)t1
group by recent_days,province_id
)t2
join ${APP}.dim_base_province t3
on t2.province_id=t3.id;
"
ads_order_spu_stats="
insert overwrite table ${APP}.ads_order_spu_stats
select * from ${APP}.ads_order_spu_stats
union
select
'$do_date' dt,
recent_days,
spu_id,
spu_name,
tm_id,
tm_name,
category3_id,
category3_name,
category2_id,
category2_name,
category1_id,
category1_name,
sum(order_count),
sum(order_amount)
from
(
select
recent_days,
sku_id,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_amount
from ${APP}.dwt_sku_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='$do_date'
)t1
left join
(
select
id,
spu_id,
spu_name,
tm_id,
tm_name,
category3_id,
category3_name,
category2_id,
category2_name,
category1_id,
category1_name
from ${APP}.dim_sku_info
where dt='$do_date'
)t2
on t1.sku_id=t2.id
group by recent_days,spu_id,spu_name,tm_id,tm_name,category3_id,category3_name,category2_id,category2_name,category1_id,category1_name;
"
ads_order_total="
insert overwrite table ${APP}.ads_order_total
select * from ${APP}.ads_order_total
union
select
'$do_date',
recent_days,
sum(order_count),
sum(order_final_amount) order_final_amount,
sum(if(order_final_amount>0,1,0)) order_user_count
from
(
select
recent_days,
user_id,
case when recent_days=0 then order_count
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case when recent_days=0 then order_final_amount
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_final_amount
from ${APP}.dwt_user_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='$do_date'
)t1
group by recent_days;
"
ads_page_path="
insert overwrite table ${APP}.ads_page_path
select * from ${APP}.ads_page_path
union
select
'$do_date',
recent_days,
source,
target,
count(*)
from
(
select
recent_days,
concat('step-',step,':',source) source,
concat('step-',step+1,':',target) target
from
(
select
recent_days,
page_id source,
lead(page_id,1,null) over (partition by recent_days,session_id order by ts) target,
row_number() over (partition by recent_days,session_id order by ts) step
from
(
select
recent_days,
last_page_id,
page_id,
ts,
concat(mid_id,'-',last_value(if(last_page_id is null,ts,null),true) over (partition by mid_id,recent_days order by ts)) session_id
from ${APP}.dwd_page_log lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('$do_date',-29)
and dt>=date_add('$do_date',-recent_days+1)
)t2
)t3
)t4
group by recent_days,source,target;
"
ads_repeat_purchase="
insert overwrite table ${APP}.ads_repeat_purchase
select * from ${APP}.ads_repeat_purchase
union
select
'$do_date' dt,
recent_days,
tm_id,
tm_name,
cast(sum(if(order_count>=2,1,0))/sum(if(order_count>=1,1,0))*100 as decimal(16,2))
from
(
select
recent_days,
user_id,
tm_id,
tm_name,
sum(order_count) order_count
from
(
select
recent_days,
user_id,
sku_id,
count(*) order_count
from ${APP}.dwd_order_detail lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('$do_date',-29)
and dt>=date_add('$do_date',-recent_days+1)
group by recent_days, user_id,sku_id
)t1
left join
(
select
id,
tm_id,
tm_name
from ${APP}.dim_sku_info
where dt='$do_date'
)t2
on t1.sku_id=t2.id
group by recent_days,user_id,tm_id,tm_name
)t3
group by recent_days,tm_id,tm_name;
"
ads_user_action="
with
tmp_page as
(
select
'$do_date' dt,
recent_days,
sum(if(array_contains(pages,'home'),1,0)) home_count,
sum(if(array_contains(pages,'good_detail'),1,0)) good_detail_count
from
(
select
recent_days,
mid_id,
collect_set(page_id) pages
from
(
select
dt,
mid_id,
page.page_id
from ${APP}.dws_visitor_action_daycount lateral view explode(page_stats) tmp as page
where dt>=date_add('$do_date',-29)
and page.page_id in('home','good_detail')
)t1 lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('$do_date',-recent_days+1)
group by recent_days,mid_id
)t2
group by recent_days
),
tmp_cop as
(
select
'$do_date' dt,
recent_days,
sum(if(cart_count>0,1,0)) cart_count,
sum(if(order_count>0,1,0)) order_count,
sum(if(payment_count>0,1,0)) payment_count
from
(
select
recent_days,
user_id,
case
when recent_days=1 then cart_last_1d_count
when recent_days=7 then cart_last_7d_count
when recent_days=30 then cart_last_30d_count
end cart_count,
case
when recent_days=1 then order_last_1d_count
when recent_days=7 then order_last_7d_count
when recent_days=30 then order_last_30d_count
end order_count,
case
when recent_days=1 then payment_last_1d_count
when recent_days=7 then payment_last_7d_count
when recent_days=30 then payment_last_30d_count
end payment_count
from ${APP}.dwt_user_topic lateral view explode(Array(1,7,30)) tmp as recent_days
where dt='$do_date'
)t1
group by recent_days
)
insert overwrite table ${APP}.ads_user_action
select * from ${APP}.ads_user_action
union
select
tmp_page.dt,
tmp_page.recent_days,
home_count,
good_detail_count,
cart_count,
order_count,
payment_count
from tmp_page
join tmp_cop
on tmp_page.recent_days=tmp_cop.recent_days;
"
ads_user_change="
insert overwrite table ${APP}.ads_user_change
select * from ${APP}.ads_user_change
union
select
churn.dt,
user_churn_count,
user_back_count
from
(
select
'$do_date' dt,
count(*) user_churn_count
from ${APP}.dwt_user_topic
where dt='$do_date'
and login_date_last=date_add('$do_date',-7)
)churn
join
(
select
'$do_date' dt,
count(*) user_back_count
from
(
select
user_id,
login_date_last
from ${APP}.dwt_user_topic
where dt='$do_date'
and login_date_last='$do_date'
)t1
join
(
select
user_id,
login_date_last login_date_previous
from ${APP}.dwt_user_topic
where dt=date_add('$do_date',-1)
)t2
on t1.user_id=t2.user_id
where datediff(login_date_last,login_date_previous)>=8
)back
on churn.dt=back.dt;
"
ads_user_retention="
insert overwrite table ${APP}.ads_user_retention
select * from ${APP}.ads_user_retention
union
select
'$do_date',
login_date_first create_date,
datediff('$do_date',login_date_first) retention_day,
sum(if(login_date_last='$do_date',1,0)) retention_count,
count(*) new_user_count,
cast(sum(if(login_date_last='$do_date',1,0))/count(*)*100 as decimal(16,2)) retention_rate
from ${APP}.dwt_user_topic
where dt='$do_date'
and login_date_first>=date_add('$do_date',-7)
and login_date_first<'$do_date'
group by login_date_first;
"
ads_user_total="
insert overwrite table ${APP}.ads_user_total
select * from ${APP}.ads_user_total
union
select
'$do_date',
recent_days,
sum(if(login_date_first>=recent_days_ago,1,0)) new_user_count,
sum(if(order_date_first>=recent_days_ago,1,0)) new_order_user_count,
sum(order_final_amount) order_final_amount,
sum(if(order_final_amount>0,1,0)) order_user_count,
sum(if(login_date_last>=recent_days_ago and order_final_amount=0,1,0)) no_order_user_count
from
(
select
recent_days,
user_id,
login_date_first,
login_date_last,
order_date_first,
case when recent_days=0 then order_final_amount
when recent_days=1 then order_last_1d_final_amount
when recent_days=7 then order_last_7d_final_amount
when recent_days=30 then order_last_30d_final_amount
end order_final_amount,
if(recent_days=0,'1970-01-01',date_add('$do_date',-recent_days+1)) recent_days_ago
from ${APP}.dwt_user_topic lateral view explode(Array(0,1,7,30)) tmp as recent_days
where dt='$do_date'
)t1
group by recent_days;
"
ads_visit_stats="
insert overwrite table ${APP}.ads_visit_stats
select * from ${APP}.ads_visit_stats
union
select
'$do_date' dt,
is_new,
recent_days,
channel,
count(distinct(mid_id)) uv_count,
cast(sum(duration)/1000 as bigint) duration_sec,
cast(avg(duration)/1000 as bigint) avg_duration_sec,
sum(page_count) page_count,
cast(avg(page_count) as bigint) avg_page_count,
count(*) sv_count,
sum(if(page_count=1,1,0)) bounce_count,
cast(sum(if(page_count=1,1,0))/count(*)*100 as decimal(16,2)) bounce_rate
from
(
select
session_id,
mid_id,
is_new,
recent_days,
channel,
count(*) page_count,
sum(during_time) duration
from
(
select
mid_id,
channel,
recent_days,
is_new,
last_page_id,
page_id,
during_time,
concat(mid_id,'-',last_value(if(last_page_id is null,ts,null),true) over (partition by recent_days,mid_id order by ts)) session_id
from
(
select
mid_id,
channel,
last_page_id,
page_id,
during_time,
ts,
recent_days,
if(visit_date_first>=date_add('$do_date',-recent_days+1),'1','0') is_new
from
(
select
t1.mid_id,
t1.channel,
t1.last_page_id,
t1.page_id,
t1.during_time,
t1.dt,
t1.ts,
t2.visit_date_first
from
(
select
mid_id,
channel,
last_page_id,
page_id,
during_time,
dt,
ts
from ${APP}.dwd_page_log
where dt>=date_add('$do_date',-29)
)t1
left join
(
select
mid_id,
visit_date_first
from ${APP}.dwt_visitor_topic
where dt='$do_date'
)t2
on t1.mid_id=t2.mid_id
)t3 lateral view explode(Array(1,7,30)) tmp as recent_days
where dt>=date_add('$do_date',-recent_days+1)
)t4
)t5
group by session_id,mid_id,is_new,recent_days,channel
)t6
group by is_new,recent_days,channel;
"
case $1 in
"ads_activity_stats" )
hive -e "$ads_activity_stats"
;;
"ads_coupon_stats" )
hive -e "$ads_coupon_stats"
;;
"ads_order_by_province" )
hive -e "$ads_order_by_province"
;;
"ads_order_spu_stats" )
hive -e "$ads_order_spu_stats"
;;
"ads_order_total" )
hive -e "$ads_order_total"
;;
"ads_page_path" )
hive -e "$ads_page_path"
;;
"ads_repeat_purchase" )
hive -e "$ads_repeat_purchase"
;;
"ads_user_action" )
hive -e "$ads_user_action"
;;
"ads_user_change" )
hive -e "$ads_user_change"
;;
"ads_user_retention" )
hive -e "$ads_user_retention"
;;
"ads_user_total" )
hive -e "$ads_user_total"
;;
"ads_visit_stats" )
hive -e "$ads_visit_stats"
;;
"all" )
hive -e "$ads_activity_stats$ads_coupon_stats$ads_order_by_province$ads_order_spu_stats$ads_order_total$ads_page_path$ads_repeat_purchase$ads_user_action$ads_user_change$ads_user_retention$ads_user_total$ads_visit_stats"
;;
* )
echo "Unknown table name: $1" >&2
exit 1
;;
esac
(2) Grant execute permission on the script
[atguigu@hadoop102 bin]$ chmod 777 dwt_to_ads.sh
2) Using the script
(1) Run the script
[atguigu@hadoop102 bin]$ dwt_to_ads.sh all 2020-06-14
(2) Check whether the data was loaded
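One quick way to run this check is to count the rows that each ADS table holds for the run date. The sketch below is an illustration only and is not part of the course scripts. The helper name build_check_sql is hypothetical. It assumes that the tables live in the gmall database and that every ADS table has a dt column holding the statistics date, which matches the first column written by each insert in dwt_to_ads.sh. The hive call is skipped when hive is not on the PATH.

```shell
#!/bin/bash
# Sketch only: build_check_sql is a hypothetical helper, not part of the course scripts.
# Assumption: every ADS table has a dt column holding the statistics date.

APP=gmall
# Hypothetical default so the sketch runs without arguments; pass the run date as $1.
do_date=${1:-2020-06-14}

# Print one row-count query for the given database, date and table.
build_check_sql() {
    local app=$1 dt=$2 table=$3
    echo "select '$table' tbl, count(*) cnt from $app.$table where dt='$dt';"
}

tables="ads_activity_stats ads_coupon_stats ads_order_by_province ads_order_spu_stats ads_order_total ads_page_path ads_repeat_purchase ads_user_action ads_user_change ads_user_retention ads_user_total ads_visit_stats"

# Concatenate the per-table queries so that hive starts only once.
check_sql=""
for table in $tables; do
    check_sql="$check_sql$(build_check_sql "$APP" "$do_date" "$table")"
done

if command -v hive >/dev/null 2>&1; then
    hive -e "$check_sql"
else
    echo "hive not found on PATH; the generated queries are:"
    echo "$check_sql"
fi
```

A count of 0 means the table has no rows for that date. This can be legitimate for some tables (for example ads_user_retention when no users registered in the preceding seven days), so treat it as a prompt to look closer rather than as proof of failure.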
