Data Visualization Examples (11): Pairwise Plot (matplotlib, pandas)


Pairwise Plot

https://datawhalechina.github.io/pms50/#/chapter9/chapter9

Import the required libraries

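import numpy as np              # import numpy
import pandas as pd             # import pandas
import matplotlib as mpl        # import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns           # import seaborn
# Show figures inline in Jupyter Notebook. Keep the magic on its own line:
# a trailing comment can be parsed as part of the magic's arguments.
%matplotlib inline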

Set the plot properties

large = 22; med = 16; small = 12

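params = {'axes.titlesize': large,      # font size of the axes (subplot) title
          'legend.fontsize': med,       # font size of the legend
          'figure.figsize': (16, 10),   # figure size
          'axes.labelsize': med,        # font size of the axis labels
          'xtick.labelsize': med,       # font size of the x-axis tick labels
          'ytick.labelsize': med,       # font size of the y-axis tick labels
          'figure.titlesize': large}    # font size of the figure title
#plt.rcParams.update(params)            # update the default properties
plt.style.use('seaborn-whitegrid')      # set the overall style (named 'seaborn-v0_8-whitegrid' in matplotlib 3.6 and later)
sns.set_style("white")                  # set the overall background style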
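Because the rcParams.update call in the listing is commented out, the params dictionary is defined but never applied. The following is a minimal sketch of one way to apply it, assuming a temporary change is wanted: plt.rc_context applies the settings only inside the with block and restores the previous values afterwards. The Agg backend and the variable names size_inside, title_size_inside, default_figsize and restored_figsize are choices made for this sketch, not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs outside a notebook
import matplotlib.pyplot as plt

large, med = 22, 16

# Same keys as the params dictionary in the post
params = {
    "axes.titlesize": large,
    "legend.fontsize": med,
    "figure.figsize": (16, 10),
    "axes.labelsize": med,
    "xtick.labelsize": med,
    "ytick.labelsize": med,
    "figure.titlesize": large,
}

# Remember the current default so the restore can be checked afterwards
default_figsize = tuple(plt.rcParams["figure.figsize"])

# Inside the block the settings are active; figures created here pick them up
with plt.rc_context(params):
    fig, ax = plt.subplots()
    size_inside = tuple(float(v) for v in fig.get_size_inches())
    title_size_inside = plt.rcParams["axes.titlesize"]
plt.close(fig)

# After the block the previous defaults are back in place
restored_figsize = tuple(plt.rcParams["figure.figsize"])
print(size_inside, title_size_inside, restored_figsize == default_figsize)
```

To change the defaults for the whole session instead, uncommenting plt.rcParams.update(params) has that effect.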

Code

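# step1: load the data

df = sns.load_dataset('iris')

# step2: draw the pairwise plot

# Figure
# Note: sns.pairplot creates its own figure, so this call only opens an extra empty one;
# the size of the pairwise plot itself is set through pairplot's height and aspect.
plt.figure(figsize = (12, 10),    # figure size (12, 10)
           dpi = 80)              # resolution 80
# Pairwise plot
sns.pairplot(df,                                     # data to plot
             kind = 'scatter',                       # kind of plot: scatter
             hue = 'species',                        # category column; each category gets a different color
             plot_kws = dict(s = 50,                 # marker size
                             edgecolor = 'white',    # marker edge color
                             linewidth = 2.5))       # marker edge line width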

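# step1: load the data

df = sns.load_dataset('iris')

# step2: draw the pairwise plot

# Figure
# Note: sns.pairplot creates its own figure, so this call only opens an extra empty one.
plt.figure(figsize = (12, 10),    # figure size (12, 10)
           dpi = 80)              # resolution 80
# Pairwise plot (scatter plots with fitted regression lines)
sns.pairplot(df,                                     # data to plot
             kind = 'reg',                           # kind of plot: reg
             hue = 'species')                        # category column; each category gets a different color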

Summary

seaborn.pairplot

seaborn.pairplot(data, hue=None, hue_order=None,
palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter',
diag_kind='auto', markers=None, height=2.5, aspect=1,
dropna=True, plot_kws=None, diag_kws=None, grid_kws=None, size=None)

Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each variable in data will be shared in the y-axis across a single row and in the x-axis across a single column.

The diagonal Axes are treated differently, drawing a plot to show the univariate distribution of the data for the variable in that column.

It is also possible to show a subset of variables or plot different variables on the rows and columns.

This is a high-level interface for PairGrid that is intended to make it easy to draw a few common styles. You should use PairGrid directly if you need more flexibility.

Parameters: data: DataFrame

Tidy (long-form) dataframe where each column is a variable and each row is an observation.

hue:string (variable name), optional

Variable in data to map plot aspects to different colors.

hue_order:list of strings

Order for the levels of the hue variable in the palette

palette:dict or seaborn color palette

Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.

vars:list of variable names, optional

Variables within data to use, otherwise use every column with a numeric datatype.

{x, y}_vars:lists of variable names, optional

Variables within data to use separately for the rows and columns of the figure; i.e. to make a non-square plot.

kind:{‘scatter’, ‘reg’}, optional

Kind of plot for the non-identity relationships.

diag_kind:{‘auto’, ‘hist’, ‘kde’}, optional

Kind of plot for the diagonal subplots. The default depends on whether "hue" is used or not.

markers:single matplotlib marker code or list, optional

Either the marker to use for all datapoints or a list of markers with a length the same as the number of levels in the hue variable so that differently colored points will also have different scatterplot markers.

height:scalar, optional

Height (in inches) of each facet.

aspect:scalar, optional

Aspect * height gives the width (in inches) of each facet.

dropna:boolean, optional

Drop missing values from the data before plotting.

{plot, diag, grid}_kws:dicts, optional

Dictionaries of keyword arguments.

Returns: grid: PairGrid

Returns the underlying PairGrid instance for further tweaking.

seaborn.load_dataset

seaborn.load_dataset(name, cache=True, data_home=None, **kws)

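Load a dataset from the online repository (requires an internet connection).

Parameters: name: string

Name of the dataset ({name}.csv on https://github.com/mwaskom/seaborn-data). The available datasets can be listed with get_dataset_names().

cache: boolean, optional

If True, cache the data locally and use the cache in subsequent calls.

data_home: string, optional

Directory in which to store the cached data. By default ~/seaborn-data/ is used.

kws: dict, optional

Passed on to pandas.read_csv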

