Everything below is translated from the official documentation (incomplete).
Installing seaborn
pip3 install seaborn
Loading a dataset with seaborn
import seaborn as sb
df = sb.load_dataset('tips')
print(type(df))
print(df.head())
The output shows that df is of type <class 'pandas.core.frame.DataFrame'>, so pandas must be installed before using seaborn.
Listing the available datasets
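Because seaborn works on ordinary pandas DataFrames, the bundled datasets are only a convenience. The sketch below builds a small table by hand and passes it to a seaborn plotting function. The column names mimic the tips dataset, and the values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window with plt.show()
import pandas as pd
import seaborn as sb

# Hypothetical values; only the column names follow the 'tips' dataset
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68],
    "tip": [1.01, 1.66, 3.50, 3.31],
})
print(type(df))   # a pandas DataFrame, the same type load_dataset() returns
print(df.head())  # head() is a method, so it needs the parentheses

# Any DataFrame can be plotted by naming its columns
ax = sb.scatterplot(x="total_bill", y="tip", data=df)
```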
import seaborn as sb
print(sb.get_dataset_names())
Without bs4 installed, this call raises an error
pip3 install bs4
Plotting with matplotlib
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
sinplot()
plt.show()
sb.set()
Switches the plot to seaborn's default style
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set()
sinplot()
plt.show()
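What sb.set() does under the hood is overwrite matplotlib's global rcParams. A minimal sketch of that effect, using the axes.grid setting as the example (the choice of this one parameter is just for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
from matplotlib import pyplot as plt
import seaborn as sb

matplotlib.rcdefaults()              # start from matplotlib's own defaults
before = plt.rcParams["axes.grid"]   # plain matplotlib draws no grid
sb.set()                             # apply seaborn's default theme
after = plt.rcParams["axes.grid"]    # the default seaborn style turns the grid on
print(before, after)
```

Because the change is global, every figure created afterwards picks up the seaborn look, which is why the plotting function itself did not have to change.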
set_style()
Sets the plot style. The available options are:
- white
- dark
- whitegrid
- darkgrid
- ticks
For example:
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style('whitegrid')
sinplot()
plt.show()
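set_style() changes the style globally. When only one figure should use a different style, sb.axes_style() can be used as a context manager instead. A small sketch, assuming the global style is "white":

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sb

sb.set_style("white")                      # global style: no grid
x = np.linspace(0, 14, 100)

with sb.axes_style("whitegrid"):           # style applies only inside this block
    grid_inside = plt.rcParams["axes.grid"]
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))

grid_outside = plt.rcParams["axes.grid"]   # restored to the "white" style
print(grid_inside, grid_outside)
```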
sb.despine()
Under the white and ticks themes, this function removes the top and right border lines (spines) of the plot
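A short sketch of despine() on a single Axes; the plotted numbers are arbitrary. The loop prints which of the four spines are still visible afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
from matplotlib import pyplot as plt
import seaborn as sb

sb.set_style("ticks")
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
sb.despine(ax=ax)   # by default hides the top and right spines only

for side in ("top", "right", "left", "bottom"):
    print(side, ax.spines[side].get_visible())
```

Passing left=True or bottom=True to despine() removes those spines as well.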
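Customizing a style
sb.set_style()
Call sb.axes_style() to see which elements a style contains, such as the grid color and line style; pass a dictionary of overrides as the second argument of sb.set_style() to change them
A demo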
sb.set_style("darkgrid", {'axes.axisbelow': False})
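To check which keys can be overridden, sb.axes_style() returns the parameters of a style as a dictionary. The sketch below inspects the darkgrid style and then confirms that the override from the demo reaches matplotlib's rcParams.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
from matplotlib import pyplot as plt
import seaborn as sb

style = sb.axes_style("darkgrid")   # dict of the parameters this style defines
print(len(style))                   # number of elements in the style
for key in sorted(style):
    print(key, "=", style[key])

# Override one element while applying the style
sb.set_style("darkgrid", {"axes.axisbelow": False})
print(plt.rcParams["axes.axisbelow"])
```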
Histogram
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')  # the iris flower dataset
sb.distplot(df['petal_length'],kde=False)
plt.show()
Setting kde to False draws only the histogram; setting it to True draws a density curve on top of the histogram
Scatter plot
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df)
plt.show()
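Hexbin plot
When the data is sparse in density, bivariate analysis can use hexagonal binning (a hexbin plot). It is useful when the data is very scattered and hard to analyze with a scatter plot.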
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df,kind='hex')
plt.show()
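Kernel density estimation
Kernel density estimation is a non-parametric method for estimating the distribution of a variable. In seaborn, we can use jointplot()
with the kind parameter set to kde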
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df,kind='kde')
plt.show()
Kernel density estimation (single variable)
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.distplot(df['petal_length'],hist=False)
plt.show()
Setting hist to False produces the kernel density estimate plot alone
Visualizing pairwise relationships
seaborn.pairplot(data,…)
Parameters
parameter | description |
---|---|
data | DataFrame |
hue | Variable in data to map plot aspects to different colors. |
palette | Set of colors for mapping the hue variable |
kind | Kind of plot for the non-identity relationships. {‘scatter’, ‘reg’} |
diag_kind | Kind of plot for the diagonal subplots. {‘hist’, ‘kde’} |
The descriptions above are copied directly from the official documentation
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.set_style("ticks")
sb.pairplot(df,hue='species',diag_kind="kde",kind="scatter",palette="husl")
plt.show()
Plotting categorical data
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.stripplot(x="species", y="petal_length", data=df)
plt.show()
swarmplot()
Another style, which spreads the points out so that they do not overlap
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.swarmplot(x="species", y="petal_length", data=df)
plt.show()
Bar plot
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('titanic')
sb.barplot(x="sex", y="survived", hue="class", data=df)
plt.show()
Several other kinds of bar plots are omitted here
Linear relationships
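By default the height of each bar in barplot() is the mean of y within that group, so for a 0/1 column such as survived it reads as a survival rate. The sketch below computes the same numbers with a pandas groupby on a tiny hand-made table (the six rows are invented, only the column names follow the titanic dataset).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import pandas as pd
import seaborn as sb

# Hypothetical passengers, not real titanic records
df = pd.DataFrame({
    "sex":      ["male", "male", "male", "female", "female", "female"],
    "class":    ["First", "Third", "Third", "First", "First", "Third"],
    "survived": [1, 0, 1, 1, 1, 0],
})

# Mean of 'survived' per (sex, class) group: the value each bar represents
rates = df.groupby(["sex", "class"])["survived"].mean()
print(rates)

ax = sb.barplot(x="sex", y="survived", hue="class", data=df)
```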
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('tips')
sb.regplot(x="total_bill", y="tip", data=df)
sb.lmplot(x="total_bill", y="tip", data=df)
plt.show()
regplot vs lmplot
regplot | lmplot |
---|---|
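Accepts the x and y variables in a variety of formats, including simple numpy arrays, pandas Series objects, or references to variables in a pandas DataFrame | Requires data as a parameter, and the x and y variables must be specified as strings. This data format is called "long-form" data |
We can also fit a linear regression when one of the variables takes discrete values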
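The difference in the table can be seen directly in code. In this sketch regplot() is given two bare numpy arrays, while lmplot() gets a DataFrame plus column names. The data is randomly generated and only borrows the column names of the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import numpy as np
import pandas as pd
from matplotlib.axes import Axes
import seaborn as sb

rng = np.random.default_rng(0)
x = rng.uniform(5, 50, size=80)              # made-up bill amounts
y = 0.15 * x + rng.normal(0, 1, size=80)     # made-up tips, roughly 15% of the bill

# regplot: arrays are accepted directly; it draws on an Axes and returns it
ax = sb.regplot(x=x, y=y)

# lmplot: needs a DataFrame and column names; it creates its own figure
df = pd.DataFrame({"total_bill": x, "tip": y})
g = sb.lmplot(x="total_bill", y="tip", data=df)

print(type(ax).__name__, type(g).__name__)
```

Another practical difference: regplot() draws into a single matplotlib Axes, whereas lmplot() returns a FacetGrid and can therefore split the data into several panels.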
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('tips')
sb.lmplot(x="size", y="tip", data=df)
plt.show()
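Heatmap
import numpy as np
import seaborn as sb
from matplotlib import pyplot as plt
np.random.seed(0)  # fixed seed so the random data is reproducible
sb.set()
uniform_data = np.random.rand(10, 12)  # 10 rows x 12 columns of values in [0, 1)
ax = sb.heatmap(uniform_data)
plt.show()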