Big Data
Linux operations
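1. Create a user
- Switch to root: su
- Enter the root password
- Create the new user: useradd zhang
- Set the new user's password: passwd zhang
- Force the new user to change the password at the next login: chage -d 0 zhang
- Reboot: reboot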
2. Create a group
- Switch to root: su
- Enter the root password
- Create the group: groupadd san
- Check that the group was created: tail -5 /etc/group
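3. Add the user to the new group
- Add the user to the group: usermod -aG san zhang
-a appends the group; with -G alone, usermod replaces the user's existing supplementary groups
- Check that the user joined the group: tail -5 /etc/group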
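Reading /etc/group by eye works, but the membership check can also be scripted. The following is a minimal sketch: in_group is a hypothetical helper name, not something from the original steps, and it relies only on the standard id command.

```shell
#!/bin/bash
# in_group USER GROUP: succeeds when USER belongs to GROUP.
# (Hypothetical helper name, introduced only for this illustration.)
in_group() {
    # id -nG prints the user's group names separated by spaces;
    # grep -qxF looks for one line that equals the group name exactly.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

if in_group zhang san; then
    echo "zhang is in san"
else
    echo "zhang is not in san (or the user does not exist)"
fi
```

The check needs no root privileges, so it can be run as zhang right after logging in again.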
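4. Create a directory named "share" in "/home" and change its group to san, so that members of san can create files and directories in "share", can delete and modify what they created themselves, but can only read what others created
- Create the directory: mkdir /home/share
- Change the directory's group: chgrp san /home/share
chgrp changes the group that owns a file; an ordinary user can use it only on files they own, and only to switch to a group they belong to
- Change the directory's permissions: chmod 1777 /home/share
The leading 1 sets the sticky bit, which stops users from deleting or renaming files they do not own. Mode 1777 also gives write access to users outside san; chmod 1770 /home/share limits the directory to its owner and the group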
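The mode bits can be inspected without root on a scratch directory. The sketch below assumes GNU stat (the -c option) and uses a temporary directory in place of /home/share; it prints mode 1777 next to the group-only variant 1770 for comparison.

```shell
#!/bin/bash
# Scratch directory stands in for /home/share (assumption for the demo).
share=$(mktemp -d)

# 1777: sticky bit plus rwx for owner, group and others.
chmod 1777 "$share"
open_mode=$(stat -c '%a %A' "$share")

# 1770: sticky bit plus rwx for owner and group only.
chmod 1770 "$share"
group_mode=$(stat -c '%a %A' "$share")

echo "$open_mode"    # 1777 drwxrwxrwt
echo "$group_mode"   # 1770 drwxrwx--T
```

The trailing t (or T when others have no execute permission) in the symbolic mode is how ls -ld and stat display the sticky bit.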
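5. As zhang, create a script named "mytime.sh" in "/home/zhang/" that gets the current system time, displays it on screen, and saves it to mytime.txt in the current directory. Make the script executable. Add the location of "mytime.sh" to the "PATH" environment variable and test running "mytime" from any path.
- Create mytime.sh: vi mytime.sh
- Copy the following into mytime.sh
#!/bin/bash
# Get the current system time
DATE=$(date)
if [ -e mytime.txt ]; then
    echo "File already exists!"
else
    touch mytime.txt
    echo "File created successfully!"
fi
# Show the time on screen, then save it to mytime.txt
echo "$DATE"
echo "$DATE" > mytime.txt
- Make the script executable: chmod +x mytime.sh
- Open .bashrc: vi .bashrc
- Add this line to it: export PATH=$PATH:/home/zhang/
- Reload .bashrc so the new PATH takes effect: source ~/.bashrc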
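The whole task can be rehearsed in a throwaway directory before touching /home/zhang or .bashrc. This is a sketch under stated assumptions: a temporary directory stands in for /home/zhang, the script body is a shortened version, and the symlink named mytime is an optional addition so the command can be typed without the .sh extension.

```shell
#!/bin/bash
# Scratch directory stands in for /home/zhang (assumption for the demo).
demo_home=$(mktemp -d)

# Write a shortened version of the script.
cat > "$demo_home/mytime.sh" <<'EOF'
#!/bin/bash
now=$(date)
echo "$now"
echo "$now" > mytime.txt
EOF

# Make it executable, as the task requires.
chmod +x "$demo_home/mytime.sh"
# Optional: a symlink without the extension allows typing just "mytime".
ln -s "$demo_home/mytime.sh" "$demo_home/mytime"

# Append the directory to PATH for this shell only.
export PATH="$PATH:$demo_home"

# Run from a different directory to show the lookup goes through PATH.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
shown=$(mytime)
saved=$(cat mytime.txt)
echo "shown: $shown"
echo "saved: $saved"
```

Because the script writes to a relative path, mytime.txt lands in whichever directory the command is run from, not next to the script.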
6. (1) Create the directory "khdir" in "/home/zhang/01/". Copy "mytime.sh" and "mytime.txt" into "khdir". Pack and compress the "khdir" directory into an archive named "mytimes.tar.gz" and place it in "/home/zhang".
- Create the khdir directory: mkdir -p /home/zhang/01/khdir
- Copy mytime.sh to the target directory: cp mytime.sh /home/zhang/01/khdir
- Copy mytime.txt to the target directory: cp mytime.txt /home/zhang/01/khdir
- Pack and compress the "khdir" directory: tar -czvf /home/zhang/01/mytimes.tar.gz -C /home/zhang/01 khdir
- Move "mytimes.tar.gz" to "/home/zhang": mv /home/zhang/01/mytimes.tar.gz /home/zhang/
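The -C option of tar decides which path prefix ends up inside the archive, which is easy to get wrong. The sketch below rebuilds the directory layout in a temporary directory (an assumption standing in for /home/zhang) and lists the archive to show that entries start at khdir/ and not at the full path.

```shell
#!/bin/bash
# Scratch directory stands in for /home/zhang (assumption for the demo).
base=$(mktemp -d)
mkdir -p "$base/01/khdir"
printf '#!/bin/bash\ndate\n' > "$base/01/khdir/mytime.sh"
echo "demo" > "$base/01/khdir/mytime.txt"

# -C changes into $base/01 first, so only the relative name khdir is stored.
tar -czf "$base/mytimes.tar.gz" -C "$base/01" khdir

# List the archive contents without extracting.
listing=$(tar -tzf "$base/mytimes.tar.gz")
echo "$listing"
```

Writing the archive straight to its final location, as done here, also makes a separate mv step unnecessary.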
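7. Grant zhang root privileges
- Check the permissions of the sudoers file: ll /etc/sudoers
- Make /etc/sudoers writable: chmod u+w /etc/sudoers
- Open /etc/sudoers and edit it: vi /etc/sudoers
Find the line that grants root its privileges (it starts with "root ALL…"), add a copy of it below, and change root in the copy to your user name (zhang here)
- Test that the privileges work: sudo useradd usertest1
- Check that the test succeeded: tail -5 /etc/passwd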
8. Use SSH to upload "eclipse-jee-2021-09-R-linux-gtk-x86_64.tar.gz" to the system, install it under /usr/local, and run eclipse once
- Connect with sftp, which runs over the SSH protocol: sftp zhang@192.168.160.11
- Upload the file: put E:/桌面/學習/eclipse-jee-2021-09-R-linux-gtk-x86_64.tar.gz /home/zhang
- Leave sftp: exit
- Extract the archive into the target directory: sudo tar -zxvf eclipse-jee-2021-09-R-linux-gtk-x86_64.tar.gz -C /usr/local
- Go to the install directory: cd /usr/local/eclipse
- Start the program: ./eclipse
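Before extracting into a system directory it helps to know which top-level directory the archive will create. The sketch below is a dry run under stated assumptions: a tiny stand-in archive replaces the real eclipse download, and temporary directories replace the download location and /usr/local, so no sudo is needed.

```shell
#!/bin/bash
# Scratch directories stand in for the download location and /usr/local.
src=$(mktemp -d)
prefix=$(mktemp -d)

# Build a tiny stand-in archive with the same top-level layout (eclipse/eclipse).
mkdir "$src/eclipse"
printf '#!/bin/bash\necho "eclipse stub"\n' > "$src/eclipse/eclipse"
chmod +x "$src/eclipse/eclipse"
tar -czf "$src/demo.tar.gz" -C "$src" eclipse

# Peek at the first entry to learn which directory the archive will create.
top_entry=$(tar -tzf "$src/demo.tar.gz" | head -n 1)
echo "top-level entry: $top_entry"

# Extract under the prefix, then run the stub from the extracted tree.
tar -xzf "$src/demo.tar.gz" -C "$prefix"
stub_output=$("$prefix/eclipse/eclipse")
echo "$stub_output"
```

With the real archive, the same tar -tzf listing shows the directory name to cd into after extraction.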
Big data analysis
Top 20 most frequent tags
select tag ,count(*) num from bigdata_tags group by tag order by num desc limit 20;
In Netflix queue 131
atmospheric 36
superhero 24
thought-provoking 24
funny 23
Disney 23
surreal 23
religion 22
dark comedy 21
sci-fi 21
quirky 21
psychology 21
suspense 20
crime 19
twist ending 19
visually appealing 19
politics 18
mental illness 16
music 16
time travel 16
Number of ratings at each star level
select rating, count(*) num from bigdata_ratings group by rating order by num desc;
4 35369
3 33183
5 13211
2 13101
1 4602
0 1370
Number of five-star ratings per year for movies in the Adventure genre
select year(r.rat_time) , count(*) num from bigdata_movies m join bigdata_ratings r on m.movieId=r.movieId where r.rating=5 and m.genres like concat('%','Adventure','%') group by year(r.rat_time) order by year(r.rat_time) desc;
2018 179
2017 257
2016 194
2015 158
2014 22
2013 44
2012 63
2011 34
2010 39
2009 54
2008 71
2007 70
2006 63
2005 74
2004 24
2003 70
2002 150
2001 131
2000 265
1999 107
1998 16
1997 109
1996 226
Movies whose IMDb id is greater than 50000, with a rating above 4, a tag containing "In Netflix queue", and 1996 in the title, grouped by title and sorted by count
select m.title,count(*) num from bigdata_links l join bigdata_movies m on l.movieId=m.movieId join bigdata_ratings r on m.movieId=r.movieId join bigdata_tags t on m.movieId=t.movieId where l.imdbId > 50000 and r.rating>4 and t.tag like concat('%','In Netflix queue','%') and m.title like concat('%','1996','%') group by m.title order by num desc;
Lone Star (1996) 8
Secrets & Lies (1996) 6
When We Were Kings (1996) 3
Kolya (Kolja) (1996) 2
Paradise Lost: The Child Murders at Robin Hood Hills (1996) 1
Movies with a tag containing "In Netflix queue", the same movie id in all three tables, and the Adventure genre, grouped by title and rating and sorted by count
select m.title, r.rating, count(*) num from bigdata_movies m join bigdata_ratings r on m.movieId = r.movieId join bigdata_tags t on r.userId=t.userId where t.tag like concat('%','In Netflix queue','%') and m.movieId=r.movieId and m.movieId = t.movieId and m.genres like concat('%','Adventure','%') group by m.title, r.rating order by num desc;
Tokyo Godfathers (2003) 4 1
Howl's Moving Castle (Hauru no ugoku shiro) (2004) 4 1
Porco Rosso (Crimson Pig) (Kurenai no buta) (1992) 3 1
Duma (2005) 3 1
Top 20 movies rated five stars where the rating and the tag share the same user id and movie id, grouped by title and sorted by count
select m.title,count(*) num from bigdata_movies m join bigdata_ratings r on m.movieId=r.movieId join bigdata_tags t on r.userId=t.userId where r.userId = t.userId and r.movieId = t.movieId and r.rating = 5 group by m.title order by num desc limit 20 ;
Pulp Fiction (1994) 176
Fight Club (1999) 49
2001: A Space Odyssey (1968) 39
Léon: The Professional (a.k.a. The Professional) (Léon) (1994) 32
"Big Lebowski 31
Eternal Sunshine of the Spotless Mind (2004) 24
Eraserhead (1977) 16
Mary and Max (2009) 13
Inception (2010) 13
"Talented Mr. Ripley 12
Django Unchained (2012) 11
Battle Royale (Batoru rowaiaru) (2000) 10
Star Wars: Episode V - The Empire Strikes Back (1980) 10
"Lord of the Rings: The Return of the King 10
Margin Call (2011) 9
The Hateful Eight (2015) 9
"Sixth Sense 9
There Will Be Blood (2007) 8
"South Park: Bigger 8
In Bruges (2008) 8
Dates on which users rated Hercules (1997), newest first
select r.rat_time from bigdata_movies m join bigdata_ratings r on m.movieId=r.movieId where m.title='Hercules (1997)' order by r.rat_time desc;
2018-02-15
2017-12-26
2017-11-12
2017-05-02
2017-02-25
2016-10-15
2016-04-05
2015-09-10
2015-08-27
2015-07-04
2015-06-29
2015-05-19
2008-11-09
2008-11-01
2008-07-13
2007-11-25
2005-05-30
2005-04-22
2003-10-21
2003-05-27
2003-04-26
2002-09-28
2001-10-30
2001-01-03
2000-08-19
2000-08-08
2000-07-04
2000-02-17
1999-12-12
1999-02-28
1997-07-01