Data Visualization Examples (11): Pair Plot (matplotlib, pandas)

Reposted article. Posted 2020-05-15 22:01. Category: data analysis and plotting examples.

Pair plot (scatterplot matrix)

https://datawhalechina.github.io/pms50/#/chapter9/chapter9

Import the required libraries

import numpy as np              # NumPy
import pandas as pd             # pandas
import matplotlib as mpl        # matplotlib
import matplotlib.pyplot as plt
import seaborn as sns           # seaborn
# Show figures inline in Jupyter Notebook.
# The magic stays on its own line: a trailing comment would be parsed as an argument.
%matplotlib inline

Set the figure properties

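large = 22; med = 16; small = 12

params = {'axes.titlesize': large,      # font size of subplot titles
          'legend.fontsize': med,       # font size of legends
          'figure.figsize': (16, 10),   # default figure size
          'axes.labelsize': med,        # font size of axis labels
          'xtick.labelsize': med,       # font size of x-axis tick labels
          'ytick.labelsize': med,       # font size of y-axis tick labels
          'figure.titlesize': large}    # font size of the figure title
#plt.rcParams.update(params)            # update the defaults (left commented out in the original)
plt.style.use('seaborn-whitegrid')      # overall style; named 'seaborn-v0_8-whitegrid' from matplotlib 3.6 on
sns.set_style("white")                  # overall background style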
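The `params` dictionary above is defined but never applied, because the `plt.rcParams.update(params)` line is commented out. As a minimal sketch of one alternative, the settings can be applied temporarily with `plt.rc_context`, which restores the previous values when the block ends. The `Agg` backend and the shortened `params` dictionary are assumptions made so the sketch runs as a plain script.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside Jupyter
import matplotlib.pyplot as plt

large, med = 22, 16
params = {
    'axes.titlesize': large,     # font size of subplot titles
    'axes.labelsize': med,       # font size of axis labels
    'figure.figsize': (16, 10),  # default figure size
}

default_size = plt.rcParams['axes.titlesize']

# Settings apply only inside the with-block and are restored afterwards
with plt.rc_context(params):
    inside_size = plt.rcParams['axes.titlesize']
    fig, ax = plt.subplots()
    inside_figsize = tuple(fig.get_size_inches())
    plt.close(fig)

after_size = plt.rcParams['axes.titlesize']
```

Calling `plt.rcParams.update(params)` instead would change the defaults for the rest of the session.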

Code

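# step 1: load the data

df = sns.load_dataset('iris')

# step 2: draw the pair plot

# Canvas. Note: sns.pairplot creates its own figure, so this call
# does not control the size of the pair plot itself.
plt.figure(figsize = (12, 10),    # figure size (12, 10)
           dpi = 80)              # resolution 80
# Pair plot
sns.pairplot(df,                                     # data to plot
             kind = 'scatter',                       # plot type: scatter
             hue = 'species',                        # category column; each category gets a different color
             plot_kws = dict(s = 50,                 # marker size
                             edgecolor = 'white',    # marker edge color
                             linewidth = 2.5))       # edge line width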

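# step 1: load the data

df = sns.load_dataset('iris')

# step 2: draw the pair plot

# Canvas. Note: sns.pairplot creates its own figure, so this call
# does not control the size of the pair plot itself.
plt.figure(figsize = (12, 10),    # figure size (12, 10)
           dpi = 80)              # resolution 80
# Pair plot (scatter plots with fitted regression lines)
sns.pairplot(df,                                     # data to plot
             kind = 'reg',                           # plot type: reg
             hue = 'species')                        # category column; each category gets a different color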
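With `kind = 'reg'` and a `hue` column, each category gets its own fitted line in every off-diagonal panel. The sketch below illustrates that idea with a plain per-group least-squares fit using `np.polyfit`. It is not seaborn's internal code, and the `df_demo` frame, its column names, and the slopes are synthetic assumptions rather than the iris data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): two groups with known slopes
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
df_demo = pd.DataFrame({
    'x': np.concatenate([x, x]),
    'y': np.concatenate([2.0 * x + 1.0, -0.5 * x + 4.0]) + rng.normal(0, 0.1, 100),
    'group': ['a'] * 50 + ['b'] * 50,
})

# Fit one straight line per group, as a per-category regression would
fits = {}
for name, part in df_demo.groupby('group'):
    slope, intercept = np.polyfit(part['x'], part['y'], deg=1)
    fits[name] = (slope, intercept)

for name, (slope, intercept) in fits.items():
    print(f"group {name}: y = {slope:.2f} * x + {intercept:.2f}")
```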

Summary

seaborn.pairplot

seaborn.pairplot(data, hue=None, hue_order=None, palette=None, vars=None,
                 x_vars=None, y_vars=None, kind='scatter', diag_kind='auto',
                 markers=None, height=2.5, aspect=1, dropna=True,
                 plot_kws=None, diag_kws=None, grid_kws=None, size=None)

Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each variable in data will be shared in the y-axis across a single row and in the x-axis across a single column.

The diagonal Axes are treated differently, drawing a plot to show the univariate distribution of the data for the variable in that column.

It is also possible to show a subset of variables or plot different variables on the rows and columns.

This is a high-level interface for PairGrid that is intended to make it easy to draw a few common styles. You should use PairGrid directly if you need more flexibility.
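As a rough illustration of using PairGrid directly, the sketch below maps one function onto the diagonal panels and another onto the off-diagonal panels. The `df_demo` frame and its column names are synthetic assumptions, and the `Agg` backend is chosen only so the code runs as a plain script.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption): three numeric columns and one label column
rng = np.random.default_rng(1)
df_demo = pd.DataFrame({
    'a': rng.normal(size=60),
    'b': rng.normal(size=60),
    'c': rng.normal(size=60),
    'label': ['u', 'v'] * 30,
})

g = sns.PairGrid(df_demo, hue='label')
g.map_diag(plt.hist)        # univariate histograms on the diagonal
g.map_offdiag(plt.scatter)  # scatter plots everywhere else
g.add_legend()

n_rows, n_cols = g.axes.shape  # one row and one column per numeric variable
plt.close('all')
```

Because the diagonal and off-diagonal functions are chosen separately, any plotting function that accepts the same arguments can be swapped in for either one.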

Parameters: data: DataFrame

Tidy (long-form) dataframe where each column is a variable and each row is an observation.

hue: string (variable name), optional

Variable in data to map plot aspects to different colors.

hue_order: list of strings

Order for the levels of the hue variable in the palette.

palette: dict or seaborn color palette

Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.

vars: list of variable names, optional

Variables within data to use, otherwise use every column with a numeric datatype.

{x, y}_vars: lists of variable names, optional

Variables within data to use separately for the rows and columns of the figure; i.e. to make a non-square plot.
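A minimal sketch of a non-square grid follows, with two variables on the columns and one on the rows. The `df_demo` frame and its column names are synthetic assumptions, and the `Agg` backend is chosen only so the code runs as a plain script.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption)
rng = np.random.default_rng(2)
df_demo = pd.DataFrame({
    'a': rng.normal(size=40),
    'b': rng.normal(size=40),
    'c': rng.normal(size=40),
})

# Columns show 'a' and 'b'; the single row shows 'c'
g = sns.pairplot(df_demo, x_vars=['a', 'b'], y_vars=['c'])

shape = g.axes.shape                                  # (rows, columns)
x_labels = [ax.get_xlabel() for ax in g.axes[-1, :]]  # labels along the bottom row
y_label = g.axes[0, 0].get_ylabel()                   # label of the first column
plt.close('all')
```

Since the row and column variables differ, there is no diagonal, so no univariate panels are drawn.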

kind: {'scatter', 'reg'}, optional

Kind of plot for the non-identity relationships.

diag_kind: {'auto', 'hist', 'kde'}, optional

Kind of plot for the diagonal subplots. The default depends on whether "hue" is used or not.

markers: single matplotlib marker code or list, optional

Either the marker to use for all datapoints or a list of markers with a length the same as the number of levels in the hue variable so that differently colored points will also have different scatterplot markers.

height: scalar, optional

Height (in inches) of each facet.

aspect: scalar, optional

Aspect * height gives the width (in inches) of each facet.

dropna: boolean, optional

Drop missing values from the data before plotting.

{plot, diag, grid}_kws: dicts, optional

Dictionaries of keyword arguments.

Returns: grid: PairGrid

Returns the underlying PairGrid instance for further tweaking.

seaborn.load_dataset

seaborn.load_dataset(name, cache=True, data_home=None, **kws)

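Load a dataset from the online repository (requires an internet connection).

Parameters: name: string

Name of the dataset (name.csv on https://github.com/mwaskom/seaborn-data). You can obtain the available datasets with get_dataset_names().

cache: boolean, optional

If True, cache the data locally and use the cache on subsequent calls.

data_home: string, optional

Directory in which to store cached data. By default, ~/seaborn-data/ is used.

kws: dict, optional

Passed through to pandas.read_csv.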
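Since the extra keyword arguments are forwarded to `pandas.read_csv`, any option that function accepts can shape the loaded frame. The sketch below demonstrates the forwarding pattern offline with an in-memory CSV. The `load_csv` helper is hypothetical and the two data rows are illustrative only. With a network connection, the corresponding call would presumably be `sns.load_dataset('iris', usecols=[...])`.

```python
import io

import pandas as pd

# Small in-memory CSV used instead of a download (illustrative rows only)
csv_text = (
    "sepal_length,sepal_width,species\n"
    "5.1,3.5,setosa\n"
    "7.0,3.2,versicolor\n"
)

def load_csv(source, **kws):
    """Hypothetical helper: forwards extra keyword arguments to pandas.read_csv."""
    return pd.read_csv(source, **kws)

# usecols is a pandas.read_csv option that keeps only the listed columns
subset = load_csv(io.StringIO(csv_text), usecols=['sepal_length', 'species'])
print(subset)
```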
