top and freq in pandas describe()

Reposted article. 2022-01-03 13:48. Category: pandas/numpy

unique, top and freq are reported for string (object) and categorical columns; they are not computed for numeric columns.

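top works like the mode: it is the most frequent value, and freq is the number of times that value occurs. When several values tie for the highest count (for example, when every value occurs exactly once), the official documentation says top and freq are chosen arbitrarily among the tied values, so the result should not be assumed to follow Unicode order. In the example below, top is f for the categorical column but a for the object column.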
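To make the link between top, freq and the mode concrete, here is a minimal sketch on a small made-up Series (the data and variable names are illustrative and not part of the original article). The value 'b' is the only mode, so no tie is involved.

```python
import pandas as pd

# Hypothetical sample data: 'b' occurs three times and is the only mode.
s = pd.Series(['a', 'b', 'b', 'c', 'b', 'a'])

desc = s.describe()        # count, unique, top, freq for a string column
counts = s.value_counts()  # occurrences per distinct value, most frequent first

top = desc['top']                # most frequent value reported by describe()
freq = desc['freq']              # how many times that value occurs
mode_value = s.mode().iloc[0]    # the mode computed directly
mode_count = counts.iloc[0]      # count of the most frequent value

print(top, freq, mode_value, mode_count)
```

When the highest count is shared by several values, Series.mode() returns all of them, while describe() reports only one, so the two can differ in that case.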

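Statistics that do not apply to a column are filled with NaN: a NaN in the table means that statistic cannot be computed for that column's data type.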

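Note that numeric columns and string columns are summarized differently: numeric columns get count, mean, std, min, the quartiles and max, while string and categorical columns get count, unique, top and freq.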
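The NaN pattern can be checked programmatically. The following is a small sketch, assuming a made-up two-column DataFrame (the column names numeric and text are hypothetical), that inspects which rows of the combined summary are NaN for each column.

```python
import pandas as pd

# Hypothetical data: one numeric column and one string column.
df = pd.DataFrame({
    'numeric': [1, 2, 3],
    'text': ['x', 'y', 'y'],
})

# include='all' puts both kinds of statistics into one table.
summary = df.describe(include='all')

# Statistics that do not apply to a column's type come back as NaN.
numeric_missing = summary['numeric'].isna()
text_missing = summary['text'].isna()

print(summary)
print(numeric_missing[numeric_missing].index.tolist())
print(text_missing[text_missing].index.tolist())
```

Only count is filled in for both columns, since counting non-missing values makes sense for either type.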


The examples below are taken from the official documentation: pandas.DataFrame.describe, pandas 1.3.5 documentation (pydata.org)
Examples
--------
>>> df = pd.DataFrame({'categorical': pd.Categorical(['d','e','f']),
...                    'numeric': [1, 2, 3],
...                    'object': ['a', 'b', 'c']
...                   })
>>> df.describe()
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5 
max        3.0

Describing all columns of a ``DataFrame`` regardless of data type.

>>> df.describe(include='all')  # doctest: +SKIP
       categorical  numeric object
count            3      3.0      3
unique           3      NaN      3
top              f      NaN      a
freq             1      NaN      1
mean           NaN      2.0    NaN
std            NaN      1.0    NaN
min            NaN      1.0    NaN
25%            NaN      1.5    NaN
50%            NaN      2.0    NaN
75%            NaN      2.5    NaN
max            NaN      3.0    NaN

Describing a column from a ``DataFrame`` by accessing it as
an attribute.

>>> df.numeric.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
Name: numeric, dtype: float64

Including only numeric columns in a ``DataFrame`` description.

>>> df.describe(include=[np.number])
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0

Including only string columns in a ``DataFrame`` description.

>>> df.describe(include=[object])  # doctest: +SKIP
       object
count       3
unique      3
top         a
freq        1

Including only categorical columns from a ``DataFrame`` description.

>>> df.describe(include=['category'])
       categorical
count            3
unique           3
top              f
freq             1

Excluding numeric columns from a ``DataFrame`` description.

>>> df.describe(exclude=[np.number])  # doctest: +SKIP
       categorical object
count            3      3
unique           3      3
top              f      a
freq             1      1

Excluding object columns from a ``DataFrame`` description.

>>> df.describe(exclude=[object])  # doctest: +SKIP
       categorical  numeric
count            3      3.0
unique           3      NaN
top              f      NaN
freq             1      NaN
mean           NaN      2.0
std            NaN      1.0
min            NaN      1.0
25%            NaN      1.5
50%            NaN      2.0
75%            NaN      2.5
max            NaN      3.0
