Everything below is translated from the official documentation (incomplete).
Installing seaborn
pip3 install seaborn
Loading a dataset with seaborn
import seaborn as sb
df = sb.load_dataset('tips')
print(type(df))
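print(df.head())
The output shows that df is of type <class 'pandas.core.frame.DataFrame'>, so pandas must be installed before seaborn can be used (pip normally pulls it in as a seaborn dependency).
Listing the available datasets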
import seaborn as sb
print(sb.get_dataset_names())
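If this call raises an error about a missing bs4 module (as it did with the seaborn version used here), install bs4: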
pip3 install bs4

Plotting with matplotlib
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
sinplot()
plt.show()

sb.set()
Switches the plots to seaborn's default style:
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set()
sinplot()
plt.show()
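
To see what sb.set() actually changes, one option is to compare a few matplotlib rcParams before and after the call. This is only an illustrative sketch and not part of the official tutorial; the two keys inspected are an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
from matplotlib import pyplot as plt
import seaborn as sb

KEYS = ["axes.grid", "axes.facecolor"]  # keys picked for illustration only

sb.reset_orig()  # start from matplotlib's original settings
before = {key: plt.rcParams[key] for key in KEYS}

sb.set()  # apply seaborn's default theme
after = {key: plt.rcParams[key] for key in KEYS}

for key in KEYS:
    print(f"{key}: {before[key]!r} -> {after[key]!r}")

sb.reset_orig()  # undo the theme change again
restored = {key: plt.rcParams[key] for key in KEYS}
```

sb.reset_orig() is handy when a later plot should go back to plain matplotlib styling.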

set_style()
Sets the plot style. The available options are:
- white
- dark
- whitegrid
- darkgrid
- ticks
For example:
import numpy as np
from matplotlib import pyplot as plt
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
import seaborn as sb
sb.set_style('whitegrid')
sinplot()
plt.show()
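
To compare the five styles side by side, a possible approach is to use sb.axes_style() as a context manager, so that each style only applies to the axes created inside its block. This is a rough sketch; the figure size and the single sine curve are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sb

STYLES = ["white", "dark", "whitegrid", "darkgrid", "ticks"]

x = np.linspace(0, 14, 100)
fig = plt.figure(figsize=(15, 3))
for idx, style in enumerate(STYLES, start=1):
    # The style is picked up when the axes are created inside the block.
    with sb.axes_style(style):
        ax = fig.add_subplot(1, len(STYLES), idx)
    ax.plot(x, np.sin(x))
    ax.set_title(style)
```

Unlike sb.set_style(), the context manager leaves the global settings untouched once the block ends.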

sb.despine()
With the white and ticks themes, this function removes the top and right axes spines (the border lines of the plot).
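
A minimal sketch of despine() in use, checking which spines remain visible afterwards. The sine curve is just placeholder data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sb

sb.set_style("ticks")
fig, ax = plt.subplots()
x = np.linspace(0, 14, 100)
ax.plot(x, np.sin(x))

sb.despine(ax=ax)  # by default only the top and right spines are removed

visible = {side: ax.spines[side].get_visible()
           for side in ("top", "right", "left", "bottom")}
print(visible)
```

Passing left=True or bottom=True to despine() removes those spines as well.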
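Customizing a style
sb.axes_style()
Returns the elements that make up a style, such as line thickness. To override any of them, pass a dictionary of elements to sb.set_style().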

A demo:
sb.set_style("darkgrid", {'axes.axisbelow': False})
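
The following sketch looks at a style's elements with sb.axes_style() and then confirms that the override from the demo reaches matplotlib's rcParams. Printing only the first five keys is an arbitrary choice to keep the output short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
from matplotlib import pyplot as plt
import seaborn as sb

# axes_style() returns a dict-like object describing a named style.
style = sb.axes_style("darkgrid")
for key in sorted(style)[:5]:
    print(key, "=", style[key])

# Override one element while applying the style globally.
sb.set_style("darkgrid", {"axes.axisbelow": False})
overridden = plt.rcParams["axes.axisbelow"]
print("axes.axisbelow after override:", overridden)
```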
Histogram
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')  # iris flower dataset
sb.distplot(df['petal_length'],kde=False)
plt.show()
Setting kde=False draws only the histogram; setting kde=True overlays the kernel density estimate curve on top of the histogram.


Scatter plot
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df)
plt.show()

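Hexbin plot
Hexagonal binning is used in bivariate data analysis when the data is sparse in density, that is, when the points are very scattered and hard to analyze with a scatter plot.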
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df,kind='hex')
plt.show()

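Kernel density estimation
Kernel density estimation is a non-parametric way to estimate the distribution of a variable. In seaborn, we can use jointplot() with the kind parameter set to 'kde'.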
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.jointplot(x='petal_length',y='petal_width',data=df,kind='kde')
plt.show()
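
To make the idea of kernel density estimation concrete, here is a hand-written one-dimensional version in plain NumPy: every sample contributes a small Gaussian bump, and the estimate is the average of those bumps. This is a teaching sketch, not how seaborn computes it internally; the function name and the bandwidth value are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
samples = rng.normal(size=200)          # synthetic data
grid = np.linspace(-8.0, 8.0, 801)      # points where the density is evaluated
density = gaussian_kde_1d(samples, grid, bandwidth=0.4)

# A valid density should integrate to roughly 1 (simple Riemann sum here).
area = density.sum() * (grid[1] - grid[0])
print(f"area under the estimated density: {area:.3f}")
```

A smaller bandwidth gives a bumpier curve and a larger one gives a smoother curve.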

Kernel density estimation (single variable)
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.distplot(df['petal_length'],hist=False)
plt.show()
Setting hist=False hides the histogram and draws only the kernel density estimate.

Visualizing pairwise relationships
seaborn.pairplot(data,…)
Parameters
| parameter | description |
|---|---|
| data | DataFrame |
| hue | Variable in data to map plot aspects to different colors. |
| palette | Set of colors for mapping the hue variable |
| kind | Kind of plot for the non-identity relationships. {‘scatter’, ‘reg’} |
| diag_kind | Kind of plot for the diagonal subplots. {‘hist’, ‘kde’} |
The descriptions above are copied directly from the official documentation.
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.set_style("ticks")
sb.pairplot(df,hue='species',diag_kind="kde",kind="scatter",palette="husl")
plt.show()

Plotting categorical data
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.stripplot(x="species", y="petal_length", data=df)
plt.show()

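swarmplot()
An alternative to stripplot() that adjusts the points along the categorical axis so that they do not overlap: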
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.swarmplot(x="species", y="petal_length", data=df)
plt.show()
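
Putting the two functions next to each other makes the difference easier to see. The sketch uses synthetic data with invented column names (group, value) so it runs without downloading a dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sb

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 30),
    "value": rng.normal(size=90),
})

fig, (ax_strip, ax_swarm) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
sb.stripplot(x="group", y="value", data=df, ax=ax_strip)  # points may overlap
sb.swarmplot(x="group", y="value", data=df, ax=ax_swarm)  # points pushed apart
ax_strip.set_title("stripplot")
ax_swarm.set_title("swarmplot")
```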

Bar plot
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('titanic')
sb.barplot(x="sex", y="survived", hue="class", data=df)
plt.show()
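
By default, barplot() draws the mean of y within each category, so with a 0/1 column such as survived the bar height is a survival rate. The sketch below checks this against a pandas groupby on a tiny hand-made table; the six rows are invented and are not taken from the titanic dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
import pandas as pd
import seaborn as sb

# Made-up rows for illustration only.
df = pd.DataFrame({
    "sex": ["male", "male", "female", "female", "female", "male"],
    "survived": [0, 1, 1, 1, 0, 0],
})

group_means = df.groupby("sex")["survived"].mean()
print(group_means)

ax = sb.barplot(x="sex", y="survived", data=df)
bar_heights = [patch.get_height() for patch in ax.patches]
print(bar_heights)
```

The vertical lines drawn on top of the bars are bootstrap confidence intervals for those means.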

Several other kinds of bar plots are omitted here.
Linear relationships
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('tips')
sb.regplot(x="total_bill", y="tip", data=df)
sb.lmplot(x="total_bill", y="tip", data=df)
plt.show()
regplot vs lmplot
| regplot | lmplot |
|---|---|
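| Accepts the x and y variables in a variety of formats, including simple numpy arrays, pandas Series objects, or references to variables in a pandas DataFrame | Takes data as a required parameter, and the x and y variables must be specified as strings. This data format is called "long-form" data |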
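
The difference in accepted inputs can be sketched as follows: regplot() is given two plain NumPy arrays, while lmplot() is given a DataFrame plus column names. The data is synthetic, and the column names total and tip are invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display figures
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sb

rng = np.random.default_rng(0)
x = rng.uniform(0, 50, size=80)
y = 0.15 * x + rng.normal(scale=1.0, size=80)  # roughly linear relationship

# regplot: plain arrays are fine, and it draws onto an existing Axes.
fig, ax = plt.subplots()
sb.regplot(x=x, y=y, ax=ax)

# lmplot: needs a DataFrame and string column names, and creates its own figure.
df = pd.DataFrame({"total": x, "tip": y})
grid = sb.lmplot(x="total", y="tip", data=df)
```

Because lmplot() returns a FacetGrid rather than an Axes, it is the one to reach for when the regression should be split across subplots.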

We can also fit a linear regression when one of the variables takes discrete values:
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('tips')
sb.lmplot(x="size", y="tip", data=df)
plt.show()

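Heatmap
import numpy as np
import seaborn as sb
from matplotlib import pyplot as plt
np.random.seed(0)  # fixed seed so the random matrix is reproducible
sb.set()
uniform_data = np.random.rand(10, 12)  # 10 x 12 matrix of values in [0, 1)
ax = sb.heatmap(uniform_data)
plt.show()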

I also played around with Power BI's embed code.
