Using seaborn (plotting functions)



Visualizing dataset distributions
Plotting categorical data
Visualizing linear relationships


I. Visualizing dataset distributions

  1. distplot
  2. kdeplot
  3. rugplot

1.distplot()

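Flexibly plots the distribution of a single variable; pass in a one-dimensional array of data.
By default kde is True: the y-axis shows probability density rather than counts, and the curve is the estimated probability density function, which integrates to 1 over its range.
With kde set to False, the y-axis shows the number of values that fall into each bin on the x-axis.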

seaborn.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None)

Parameters:

  • a: the one-dimensional input data
  • bins: controls the number of bars (bins) in the histogram
  • hist: whether to draw the histogram
  • kde: whether to draw a Gaussian kernel density estimate curve
  • rug: whether to draw a rug plot along the support axis
  • fit: an object with a fit method, returning a tuple that can be passed to a pdf method as positional arguments following a grid of values to evaluate the pdf on
  • color: the color of the plot elements
  • vertical: if True, the observed values are on the y-axis
  • norm_hist: if True, the histogram shows a density rather than a count
  • axlabel: name for the support axis label
  • label: legend label for the relevant component of the plot
  • ax: if provided, plot on this axis

Returns
ax: matplotlib Axes. The Axes object with the plot, for further tweaking.
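
The difference between a count histogram and a density histogram (the norm_hist and kde behavior described above) can be checked with NumPy alone. This is only a sketch of the idea, not seaborn's own code: the sample, the seed, and the variable names are assumptions made for illustration. Note also that distplot has been deprecated since seaborn 0.11, so the sketch avoids calling it directly.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=500)  # one-dimensional sample, playing the role of `a`

# Count histogram: bar height = number of values in each bin.
counts, edges = np.histogram(a, bins=20)

# Density histogram: bar heights are rescaled so the total bar area is 1.
density, _ = np.histogram(a, bins=20, density=True)
widths = np.diff(edges)

total_count = int(counts.sum())               # every value lands in one bin
total_area = float(np.sum(density * widths))  # area under the density bars

print(total_count, total_area)
```

The two arrays differ only by a scale factor: each density value is the bin count divided by (number of samples × bin width).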

2.kdeplot()

Fits and plots a univariate or bivariate kernel density estimate.

seaborn.kdeplot(data, data2=None, shade=False, vertical=False, kernel='gau', bw='scott', gridsize=100, cut=3, clip=None, legend=True, cumulative=False, shade_lowest=True, cbar=False, cbar_ax=None, cbar_kws=None, ax=None, **kwargs)

Parameters

  • data:Input data
  • data2:Second input data. If present, a bivariate KDE will be estimated.
  • shade:If True, shade in the area under the KDE curve (or draw with filled contours when data is bivariate)
  • vertical:If True, density is on x-axis
  • kernel:{‘gau’ | ‘cos’ | ‘biw’ | ‘epa’ | ‘tri’ | ‘triw’ } optional.
    Code for shape of kernel to fit with. Bivariate KDE can only use gaussian kernel.
  • bw:{‘scott’ | ‘silverman’ | scalar | pair of scalars }, optional
    Name of reference method to determine kernel size, scalar factor, or scalar for each dimension of the bivariate plot.
  • gridsize:int, optional. Number of discrete points in the evaluation grid.
  • cut:scalar, optional. Draw the estimate to cut * bw from the extreme data points.
  • clip:Lower and upper bounds for datapoints used to fit KDE. Can provide a pair of (low, high) bounds for bivariate plots.
  • legend:If True, add a legend or label the axes when possible.
  • cumulative:If True, draw the cumulative distribution estimated by the kde.
  • shade_lowest:If True, shade the lowest contour of a bivariate KDE plot. Not relevant when drawing a univariate plot or when shade=False. Setting this to False can be useful when you want multiple densities on the same Axes.
  • cbar:If True and drawing a bivariate KDE plot, add a colorbar.
  • cbar_ax:Existing axes to draw the colorbar onto, otherwise space is taken from the main axes.
  • cbar_kws:Keyword arguments for fig.colorbar().
  • ax:Axes to plot on, otherwise uses current axes.
  • kwargs:Other keyword arguments are passed to plt.plot() or plt.contour{f} depending on whether a univariate or bivariate plot is being drawn.

Returns
ax:Axes with plot
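
To make bw, gridsize, cut and cumulative more concrete, here is a minimal hand-written Gaussian KDE in NumPy. It is a sketch under stated assumptions, not seaborn's implementation: gaussian_kde_1d is a hypothetical helper, and the bandwidth uses one common form of Scott's rule (sample standard deviation times n^(-1/5)), which may differ from the value seaborn computes for bw='scott'.

```python
import numpy as np

def gaussian_kde_1d(data, bw, gridsize=100, cut=3):
    """Hypothetical helper: evaluate a Gaussian KDE on a regular grid.

    The grid extends `cut * bw` beyond the smallest and largest data point
    and contains `gridsize` evenly spaced evaluation points.
    """
    data = np.asarray(data, dtype=float)
    grid = np.linspace(data.min() - cut * bw, data.max() + cut * bw, gridsize)
    # One Gaussian bump per observation, averaged over all observations.
    z = (grid[:, None] - data[None, :]) / bw
    density = np.exp(-0.5 * z**2).sum(axis=1) / (data.size * bw * np.sqrt(2 * np.pi))
    return grid, density

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.5, size=300)

# Assumed bandwidth rule (a common form of Scott's rule for 1-D data).
bw = data.std(ddof=1) * data.size ** (-1 / 5)

grid, density = gaussian_kde_1d(data, bw, gridsize=200, cut=3)

# Trapezoid rule: total area, and the running integral that a
# cumulative plot would show.
segments = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
area = float(segments.sum())
cdf = np.concatenate([[0.0], np.cumsum(segments)])

print(area, cdf[-1])
```

A larger bw gives a smoother curve, a larger gridsize gives a finer curve, and a larger cut extends the tails further past the data.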


3.regplot()

Plots the data as a scatterplot and can optionally fit a linear regression model to it.

seaborn.regplot(x, y, data=None, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=False, dropna=True, x_jitter=None, y_jitter=None, label=None, color=None, marker='o', scatter_kws=None, line_kws=None, ax=None)

Parameters

  • x, y: string, series, or vector array
    Input variables. If strings, these should correspond with column names in data. When pandas objects are used, axes will be labeled with the series name.
  • data : DataFrame
    Tidy (“long-form”) dataframe where each column is a variable and each row is an observation.
  • x_estimator : callable that maps vector -> scalar, optional
    Apply this function to each unique value of x and plot the resulting estimate. This is useful when x is a discrete variable. If x_ci is given, this estimate will be bootstrapped and a confidence interval will be drawn.
  • x_bins : int or vector, optional
    Bin the x variable into discrete bins and then estimate the central tendency and a confidence interval. This binning only influences how the scatterplot is drawn; the regression is still fit to the original data. This parameter is interpreted either as the number of evenly-sized (not necessarily spaced) bins or the positions of the bin centers. When this parameter is used, it implies that the default of x_estimator is numpy.mean.
  • x_ci : “ci”, “sd”, int in [0, 100] or None, optional
    Size of the confidence interval used when plotting a central tendency for discrete values of x. If "ci", defer to the value of the ci parameter. If "sd", skip bootstrapping and show the standard deviation of the observations in each bin.
  • scatter : bool, optional
    If True, draw a scatterplot with the underlying observations (or the x_estimator values).
  • fit_reg : bool, optional
    If True, estimate and plot a regression model relating the x and y variables.
  • ci : int in [0, 100] or None, optional
    Size of the confidence interval for the regression estimate. This will be drawn using translucent bands around the regression line. The confidence interval is estimated using a bootstrap; for large datasets, it may be advisable to avoid that computation by setting this parameter to None.
  • n_boot : int, optional
    Number of bootstrap resamples used to estimate the ci. The default value attempts to balance time and stability; you may want to increase this value for “final” versions of plots.
  • units : variable name in data, optional
    If the x and y observations are nested within sampling units, those can be specified here. This will be taken into account when computing the confidence intervals by performing a multilevel bootstrap that resamples both units and observations (within unit). This does not otherwise influence how the regression is estimated or drawn.
  • order : int, optional
    If order is greater than 1, use numpy.polyfit to estimate a polynomial regression.
  • logistic : bool, optional
    If True, assume that y is a binary variable and use statsmodels to estimate a logistic regression model. Note that this is substantially more computationally intensive than linear regression, so you may wish to decrease the number of bootstrap resamples (n_boot) or set ci to None.
  • lowess : bool, optional
    If True, use statsmodels to estimate a nonparametric lowess model (locally weighted linear regression). Note that confidence intervals cannot currently be drawn for this kind of model.
  • robust : bool, optional
    If True, use statsmodels to estimate a robust regression. This will de-weight outliers. Note that this is substantially more computationally intensive than standard linear regression, so you may wish to decrease the number of bootstrap resamples (n_boot) or set ci to None.
  • logx : bool, optional
    If True, estimate a linear regression of the form y ~ log(x), but plot the scatterplot and regression model in the input space. Note that x must be positive for this to work.
  • {x,y}_partial : strings in data or matrices
    Confounding variables to regress out of the x or y variables before plotting.
  • truncate : bool, optional
    By default, the regression line is drawn to fill the x axis limits after the scatterplot is drawn. If truncate is True, it will instead be bounded by the data limits.
  • {x,y}_jitter : floats, optional
    Add uniform random noise of this size to either the x or y variables. The noise is added to a copy of the data after fitting the regression, and only influences the look of the scatterplot. This can be helpful when plotting variables that take discrete values.
  • label : string
    Label to apply to either the scatterplot or regression line (if scatter is False) for use in a legend.
  • color : matplotlib color
    Color to apply to all plot elements; will be superseded by colors passed in scatter_kws or line_kws.
  • marker : matplotlib marker code
    Marker to use for the scatterplot glyphs.
  • {scatter,line}_kws : dictionaries
    Additional keyword arguments to pass to plt.scatter and plt.plot.
  • ax : matplotlib Axes, optional
    Axes object to draw the plot onto, otherwise uses the current Axes.
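
The interplay of order, n_boot and ci can be sketched with NumPy alone: fit a polynomial with numpy.polyfit, refit it on resampled data n_boot times, and take percentiles of the refitted curves as the confidence band. This is an illustrative approximation of the idea described above, not seaborn's exact procedure; the synthetic data and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=80)
y = 0.5 * x**2 - x + rng.normal(scale=2.0, size=x.size)  # synthetic data

order, n_boot, ci = 2, 200, 95           # mirror the regplot parameter names
grid = np.linspace(x.min(), x.max(), 50)  # x values where the curve is drawn

# Point estimate: polynomial regression of degree `order`.
coefs = np.polyfit(x, y, order)
y_hat = np.polyval(coefs, grid)

# Bootstrap: resample (x, y) pairs with replacement and refit each time.
boot_curves = np.empty((n_boot, grid.size))
for i in range(n_boot):
    idx = rng.integers(0, x.size, size=x.size)
    boot_curves[i] = np.polyval(np.polyfit(x[idx], y[idx], order), grid)

# Percentile interval at each grid point gives the translucent band.
tail = (100 - ci) / 2
lower, upper = np.percentile(boot_curves, [tail, 100 - tail], axis=0)

print(coefs)
```

Each bootstrap iteration repeats the whole fit, which is why the cost grows with n_boot and why setting ci to None skips that work entirely.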

Returns
ax : matplotlib Axes

The Axes object containing the plot
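
A short usage sketch of regplot itself follows. It assumes seaborn, pandas and matplotlib are installed; the DataFrame and its column names (dose, response) are made up for illustration. x and y are passed as keywords because newer seaborn releases no longer accept them positionally.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: response grows roughly with log(dose).
rng = np.random.default_rng(3)
df = pd.DataFrame({"dose": rng.uniform(1, 10, size=60)})
df["response"] = 3.0 * np.log(df["dose"]) + rng.normal(scale=0.5, size=len(df))

fig, ax = plt.subplots()
returned = sns.regplot(
    x="dose", y="response", data=df,
    logx=True,                      # fit y ~ log(x); x must be positive
    ci=None,                        # skip the bootstrap confidence band
    scatter_kws={"s": 20},          # forwarded to the scatter call
    line_kws={"color": "black"},    # forwarded to the line call
    ax=ax,
)

print(returned is ax, ax.get_xlabel(), ax.get_ylabel())
```

Because the function returns the Axes it drew on, further tweaks such as ax.set_title(...) can be applied to the returned object afterwards.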

