Seaborn Distribution Data Visualization: Scatter Distribution Plots

Reposted article, 2022-01-07 12:38. Tag: Seaborn

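Scatter distribution plots

A scatter distribution plot combines a scatter plot with histograms of the distribution of each variable.

jointplot()

Draws a plot of two variables with bivariate and univariate graphs; it is built on top of JointGrid().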

sns.jointplot(
    x,
    y,
    data=None,
    kind='scatter',
    stat_func=None,
    color=None,
    height=6,
    ratio=5,
    space=0.2,
    dropna=True,
    xlim=None,
    ylim=None,
    joint_kws=None,
    marginal_kws=None,
    annot_kws=None,
    **kwargs,
)
Docstring:
Draw a plot of two variables with bivariate and univariate graphs.

This function provides a convenient interface to the :class:`JointGrid`
class, with several canned plot kinds. This is intended to be a fairly
lightweight wrapper; if you need more flexibility, you should use
:class:`JointGrid` directly.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }, optional
    Kind of plot to draw.
stat_func : callable or None, optional
    *Deprecated*
color : matplotlib color, optional
    Color used for the plot elements.
height : numeric, optional
    Size of the figure (it will be square).
ratio : numeric, optional
    Ratio of joint axes height to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from ``x`` and ``y``.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.
{joint, marginal, annot}_kws : dicts, optional
    Additional keyword arguments for the plot components.
kwargs : key, value pairings
    Additional keyword arguments are passed to the function used to
    draw the plot on the joint Axes, superseding items in the
    ``joint_kws`` dictionary.

Returns
-------
grid : :class:`JointGrid`
    :class:`JointGrid` object with the plot on it.

See Also
--------
JointGrid : The Grid class used for drawing this plot. Use it directly if
            you need more flexibility.

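# Combined scatter distribution plot: jointplot()
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats as sci

# Create the DataFrame
rs = np.random.RandomState(3)
df = pd.DataFrame(rs.randn(200, 2), columns=['A', 'B'])

# Draw the combined scatter distribution plot with jointplot()
sns.jointplot(x=df['A'], y=df['B'],     # data for the x and y axes
              data=df,                  # source DataFrame
              color='k',
              s=50, edgecolor='w', linewidth=1,  # marker size, edge color and edge width (scatter only)
              kind='scatter',                    # default is "scatter"; others: "reg", "resid", "kde", "hex"
              space=0.2,                         # gap between the scatter plot and the marginal plots
              height=8,                          # figure size (always square)
              ratio=5,                           # height ratio of the scatter plot to the marginal plots
              stat_func=sci.pearsonr,            # Pearson correlation coefficient (marked deprecated in the docstring above)
              marginal_kws=dict(bins=15, rug=True))    # arguments for the marginal plots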

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='reg',             # "reg" adds a linear regression line
              height=8,
              ratio=5,
              stat_func= sci.pearsonr, 
              marginal_kws=dict(bins=15, rug=True))

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='resid',             # "resid" plots the residuals of a linear regression
              height=8,
              ratio=5, 
              marginal_kws=dict(bins=15, rug=True))

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='kde',             # "kde" draws a kernel density plot
              height=8,
              ratio=5)

sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='hex',             # "hex" draws a honeycomb plot of hexagonal bins
              height=8,
              ratio=5)

g = sns.jointplot(x=df['A'], y=df['B'],
              data=df,
              color='k',
              kind='kde',             # "kde" draws a kernel density plot
              height=8,
              ratio=5,
              shade_lowest=False)

# Overlay a scatter plot (c is the color, s is the marker size)
g.plot_joint(plt.scatter, c='w', s=10, linewidth=1, marker='+')

JointGrid()

Creates a figure grid for drawing a bivariate plot with univariate marginal plots. It serves the same purpose as jointplot() but is more flexible.

sns.JointGrid(
    x,
    y,
    data=None,
    height=6,
    ratio=5,
    space=0.2,
    dropna=True,
    xlim=None,
    ylim=None,
    size=None,
)
Docstring:      Grid for drawing a bivariate plot with marginal univariate plots.
Init docstring:
Set up the grid of subplots.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
height : numeric
    Size of each side of the figure in inches (it will be square).
ratio : numeric
    Ratio of joint axes size to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from `x` and `y`.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.

See Also
--------
jointplot : High-level interface for drawing bivariate plots with
            several different default plot kinds.
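
To make the height, ratio and space parameters more concrete, here is a small sketch that builds an empty JointGrid and reads back the figure size and the relative heights of the axes. The data and the parameter values are assumptions for illustration. The measured ratio may differ from the ratio argument, because space inserts gaps between the axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumed for illustration)
rs = np.random.RandomState(0)
df = pd.DataFrame(rs.randn(100, 2), columns=["A", "B"])

# height: side length of the square figure in inches
# ratio:  how much taller the joint axes are than the marginal axes
# space:  gap between the joint axes and the marginal axes
g = sns.JointGrid(x="A", y="B", data=df, height=5, ratio=4, space=0.1)

fig = g.ax_joint.get_figure()
fig_w, fig_h = fig.get_size_inches()

# Axes positions are expressed as fractions of the figure size
joint_h = g.ax_joint.get_position().height
marg_x_h = g.ax_marg_x.get_position().height
print(fig_w, fig_h, joint_h / marg_x_h)
```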

# Set the style
sns.set_style('white')
# Load the data
tip_datas = sns.load_dataset('tips', data_home='seaborn-data')

# Create the plotting grid, which has three parts: one main (joint) area and two marginal areas
g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)

# Main area: scatter plot
g.plot_joint(plt.scatter, color='m', edgecolor='w', alpha=.3)

# Marginal areas: x axis and y axis
g.ax_marg_x.hist(tip_datas['total_bill'], color='b', alpha=.3)
g.ax_marg_y.hist(tip_datas['tip'], color='r', alpha=.3,
                 orientation='horizontal')

# Correlation coefficient label
from scipy import stats
g.annotate(stats.pearsonr)

# Draw grid lines
plt.grid(linestyle='--')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)
g = g.plot_joint(plt.scatter, color='g', s=40, edgecolor='white')
plt.grid(linestyle='--')
# Draw both marginal plots with one function so that they share a style
g.plot_marginals(sns.distplot, kde=True, color='g')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)
# Main area: density plot
g = g.plot_joint(sns.kdeplot, cmap='Reds_r')
plt.grid(linestyle='--')
# Draw both marginal plots with one function so that they share a style
g.plot_marginals(sns.distplot, kde=True, color='g')

pairplot()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix; it is built on top of PairGrid().

sns.pairplot(
    data,
    hue=None,
    hue_order=None,
    palette=None,
    vars=None,
    x_vars=None,
    y_vars=None,
    kind='scatter',
    diag_kind='auto',
    markers=None,
    height=2.5,
    aspect=1,
    dropna=True,
    plot_kws=None,
    diag_kws=None,
    grid_kws=None,
    size=None,
)
Docstring:
Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each
variable in ``data`` will be shared in the y-axis across a single row and
in the x-axis across a single column. The diagonal Axes are treated
differently, drawing a plot to show the univariate distribution of the data
for the variable in that column.

It is also possible to show a subset of variables or plot different
variables on the rows and columns.

This is a high-level interface for :class:`PairGrid` that is intended to
make it easy to draw a few common styles. You should use :class:`PairGrid`
directly if you need more flexibility.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values  in the ``hue`` variable.
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
kind : {'scatter', 'reg'}, optional
    Kind of plot for the non-identity relationships.
diag_kind : {'auto', 'hist', 'kde'}, optional
    Kind of plot for the diagonal subplots. The default depends on whether
    ``"hue"`` is used or not.
markers : single matplotlib marker code or list, optional
    Either the marker to use for all datapoints or a list of markers with
    a length the same as the number of levels in the hue variable so that
    differently colored points will also have different scatterplot
    markers.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
dropna : boolean, optional
    Drop missing values from the data before plotting.
{plot, diag, grid}_kws : dicts, optional
    Dictionaries of keyword arguments.

Returns
-------
grid : PairGrid
    Returns the underlying ``PairGrid`` instance for further tweaking.

See Also
--------
PairGrid : Subplot grid for more flexible plotting of pairwise
           relationships.

# Load the iris dataset
i_datas = sns.load_dataset('iris', data_home='seaborn-data')
i_datas

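# Scatter-plot matrix
sns.pairplot(i_datas,
             kind='scatter',                 # plot type (scatter plot: scatter, regression plot: reg)
             diag_kind='hist',               # plot type on the diagonal (histogram: hist, density plot: kde)
             hue='species',                  # color the points by this column
             palette='husl',                 # color palette
             markers=['o','s','D'],          # marker styles, one per hue level
             height=2)                       # height of each facet in inches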

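# Regression-plot matrix
sns.pairplot(i_datas,
             kind='reg',                     # plot type (scatter plot: scatter, regression plot: reg)
             diag_kind='kde',                # plot type on the diagonal (histogram: hist, density plot: kde)
             hue='species',                  # color the points by this column
             palette='husl',                 # color palette
             markers=['o','s','D'],          # marker styles, one per hue level
             height=2)                       # height of each facet in inches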

# Select a subset of variables with vars
g = sns.pairplot(i_datas, vars=['sepal_width', 'sepal_length'],
                 kind='reg', diag_kind='kde',
                 hue='species', palette='husl')
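
The docstring also mentions x_vars and y_vars for non-square grids, which the examples here do not cover. A minimal sketch, assuming a synthetic frame with columns a, b and c, could look like this: the rows follow y_vars and the columns follow x_vars.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumed for illustration)
rs = np.random.RandomState(4)
df = pd.DataFrame(rs.randn(60, 3), columns=["a", "b", "c"])

# Two variables on the columns, one on the rows: a 1 x 2 grid of scatter plots
g = sns.pairplot(df, x_vars=["a", "b"], y_vars=["c"], kind="scatter")

grid_shape = g.axes.shape
print(grid_shape)
```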

# Combining several parameters
sns.pairplot(i_datas, diag_kind='kde', markers='+', hue='species',
             # arguments for the scatter plots
             plot_kws=dict(s=50, edgecolor='b', linewidth=1),
             # arguments for the diagonal plots
             diag_kws=dict(shade=True))

PairGrid()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix. It is more flexible than pairplot().

sns.PairGrid(
    data,
    hue=None,
    hue_order=None,
    palette=None,
    hue_kws=None,
    vars=None,
    x_vars=None,
    y_vars=None,
    diag_sharey=True,
    height=2.5,
    aspect=1,
    despine=True,
    dropna=True,
    size=None,
)
Docstring:     
Subplot grid for plotting pairwise relationships in a dataset.

This class maps each variable in a dataset onto a column and row in a
grid of multiple axes. Different axes-level plotting functions can be
used to draw bivariate plots in the upper and lower triangles, and
the marginal distribution of each variable can be shown on the diagonal.

It can also represent an additional level of conditionalization with the
``hue`` parameter, which plots different subsets of data in different
colors. This uses color to resolve elements on a third dimension, but
only draws subsets on top of each other and will not tailor the ``hue``
parameter for the specific visualization the way that axes-level functions
that accept ``hue`` will.

See the :ref:`tutorial <grid_tutorial>` for more information.
Init docstring:
Initialize the plot figure and PairGrid object.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values  in the ``hue`` variable.
hue_kws : dictionary of param -> list of values mapping
    Other keyword arguments to insert into the plotting call to let
    other plot attributes vary across levels of the hue variable (e.g.
    the markers in a scatterplot).
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
despine : boolean, optional
    Remove the top and right spines from the plots.
dropna : boolean, optional
    Drop missing values from the data before plotting.

See Also
--------
pairplot : Easily drawing common uses of :class:`PairGrid`.
FacetGrid : Subplot grid for plotting conditional relationships.
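
The hue_kws parameter is described above but not used in the examples. As a hedged sketch, the snippet below gives each hue level its own marker through hue_kws; the synthetic data, the column names and the marker choices are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumed for illustration): two numeric columns, two groups
rs = np.random.RandomState(2)
df = pd.DataFrame(rs.randn(80, 2), columns=["a", "b"])
df["group"] = np.repeat(["g1", "g2"], 40)

# hue_kws maps a plotting keyword to one value per hue level
g = sns.PairGrid(df, hue="group", vars=["a", "b"],
                 hue_kws={"marker": ["o", "s"]})

# The scatter function is called once per hue level on every off-diagonal axes
g.map_offdiag(plt.scatter, s=20)
g.add_legend()

# Count the scatter collections drawn on the upper-right axes
n_collections = len(g.axes[0, 1].collections)
print(n_collections)
```

Each list passed in hue_kws needs one entry per level of the hue variable, in the order of the hue levels.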

# Create a grid of subplots for the four variables listed in vars
g = sns.PairGrid(i_datas, hue='species', palette='hls',
                 vars=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'])

# Draw the plots on the diagonal
g.map_diag(plt.hist,
           histtype='step',             # options: 'bar', 'barstacked', 'step', 'stepfilled'
           linewidth=1)

# Draw the off-diagonal plots
g.map_offdiag(plt.scatter, s=40, linewidth=1)

# Add a legend
g.add_legend()

g = sns.PairGrid(i_datas)

# Main diagonal
g.map_diag(sns.kdeplot)

# Upper triangle
g.map_upper(plt.scatter)

# Lower triangle
g.map_lower(sns.kdeplot, cmap='Blues_d')
