Pairwise Plot (Matrix Plot)
https://datawhalechina.github.io/pms50/#/chapter9/chapter9
Import the required libraries
import numpy as np                 # numpy
import pandas as pd                # pandas
import matplotlib as mpl           # matplotlib
import matplotlib.pyplot as plt
import seaborn as sns              # seaborn
# Show figures inline in the Jupyter notebook
%matplotlib inline
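Set the figure properties
large = 22; med = 16; small = 12
params = {'axes.titlesize': large,      # font size of subplot titles
          'legend.fontsize': med,       # font size of the legend
          'figure.figsize': (16, 10),   # default figure size
          'axes.labelsize': med,        # font size of axis labels
          'xtick.labelsize': med,       # font size of x-axis tick labels
          'ytick.labelsize': med,       # font size of y-axis tick labels
          'figure.titlesize': large}    # font size of the figure-level title
#plt.rcParams.update(params)            # update the default properties (left commented out here)
# Overall style; on matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
sns.set_style("white")                  # overall background style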
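Code
# Step 1: load the data
df = sns.load_dataset('iris')
# Step 2: draw the pairwise plot
# Canvas (note: pairplot creates its own figure, so this call does not size the grid)
plt.figure(figsize = (12, 10),   # figure size (12, 10)
           dpi = 80)             # resolution 80
# Pairwise plot
sns.pairplot(df,                                   # data to plot
             kind = 'scatter',                     # plot type: scatter
             hue = 'species',                      # category column; each category gets a different color
             plot_kws = dict(s = 50,               # marker size
                             edgecolor = 'white',  # marker edge color
                             linewidth = 2.5))     # marker edge width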
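# Step 1: load the data
df = sns.load_dataset('iris')
# Step 2: draw the pairwise plot
# Canvas (note: pairplot creates its own figure, so this call does not size the grid)
plt.figure(figsize = (12, 10),   # figure size (12, 10)
           dpi = 80)             # resolution 80
# Pairwise plot (scatter plots with fitted regression lines)
sns.pairplot(df,                 # data to plot
             kind = 'reg',       # plot type: reg
             hue = 'species')    # category column; each category gets a different color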
Summary
seaborn.pairplot
seaborn.pairplot(data, hue=None, hue_order=None,
palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter',
diag_kind='auto', markers=None, height=2.5, aspect=1,
dropna=True, plot_kws=None, diag_kws=None, grid_kws=None, size=None)
Plot pairwise relationships in a dataset.
By default, this function will create a grid of Axes such that each variable in data will be shared in the y-axis across a single row and in the x-axis across a single column. The diagonal Axes are treated differently, drawing a plot to show the univariate distribution of the data for the variable in that column.
It is also possible to show a subset of variables or plot different variables on the rows and columns.
This is a high-level interface for PairGrid that is intended to make it easy to draw a few common styles. You should use PairGrid directly if you need more flexibility.
Parameters:
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and each row is an observation.
hue : string (variable name), optional
    Variable in data to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette.
palette : dict or seaborn color palette
    Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.
vars : list of variable names, optional
    Variables within data to use, otherwise use every column with a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within data to use separately for the rows and columns of the figure; i.e. to make a non-square plot.
kind : {‘scatter’, ‘reg’}, optional
    Kind of plot for the non-identity relationships.
diag_kind : {‘auto’, ‘hist’, ‘kde’}, optional
    Kind of plot for the diagonal subplots. The default depends on whether hue is used or not.
markers : single matplotlib marker code or list, optional
    Either the marker to use for all datapoints or a list of markers with a length the same as the number of levels in the hue variable so that differently colored points will also have different scatterplot markers.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
dropna : boolean, optional
    Drop missing values from the data before plotting.
{plot, diag, grid}_kws : dicts, optional
    Dictionaries of keyword arguments. plot_kws are passed to the bivariate plotting function, diag_kws to the univariate plotting function, and grid_kws to the PairGrid constructor.
Returns:
grid : PairGrid
    Returns the underlying PairGrid instance for further tweaking.
seaborn.load_dataset
seaborn.load_dataset(name, cache=True, data_home=None, **kws)
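Load a dataset from the online repository (requires internet access).
Parameters:
name : string
    Name of the dataset (name.csv on https://github.com/mwaskom/seaborn-data). You can list the available datasets with get_dataset_names().
cache : boolean, optional
    If True, cache the data locally and use the cache on subsequent calls.
data_home : string, optional
    Directory in which to store the cached data. By default, ~/seaborn-data/ is used.
kws : dict, optional
    Passed to pandas.read_csv.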
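Because the extra keyword arguments go to pandas.read_csv, options such as usecols or dtype should behave as they do there. The sketch below illustrates this offline by reading a three-row in-memory sample laid out like iris.csv (the sample text and the name read_csv_kws are assumptions for illustration); the equivalent load_dataset call is shown as a comment because it needs network access.

```python
import io
import pandas as pd

# Three-row sample with the same column layout as iris.csv (illustrative only)
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# Options of the kind that load_dataset forwards to pandas.read_csv
read_csv_kws = dict(
    usecols=["sepal_length", "species"],  # read only these columns
    dtype={"species": "category"},        # store the label column as a categorical
)

local = pd.read_csv(io.StringIO(csv_text), **read_csv_kws)

# With network access, the corresponding call would be:
# df = sns.load_dataset("iris", cache=True, **read_csv_kws)
```

Some datasets receive extra post-processing inside load_dataset, so restricting columns may not suit every dataset; iris is assumed here to be read as a plain CSV.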