Unique values of a Series object in Pandas
The unique() function is used to get the unique values of a Series object.
Unique values are returned in order of appearance. The implementation is hash-table based and therefore does NOT sort.
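As a small illustration of the ordering behaviour described above, the sketch below compares unique() with an explicitly sorted result. The sample values and variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with repeated values in a non-sorted order.
s = pd.Series([3, 1, 3, 2, 1], name="P")

# unique() keeps the order in which each value is first seen.
in_order = s.unique()

# If a sorted result is wanted, sort explicitly afterwards.
sorted_vals = np.sort(in_order)

print(in_order)     # values in first-seen order
print(sorted_vals)  # same values, ascending
```

Note that np.unique, unlike Series.unique, returns its result sorted, so the two are not interchangeable when order matters.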
Syntax:
Series.unique(self)
Returns: ndarray or ExtensionArray
The unique values returned as a NumPy array. See Notes.
Notes: For a Series backed by a NumPy array, the unique values are returned as a NumPy array. For an extension-array backed Series, a new ExtensionArray of that type containing only the unique values is returned. This includes:
- Categorical
- Period
- Datetime with Timezone
- Interval
- Sparse
- IntegerNA
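To make the distinction in the Notes concrete, here is a minimal sketch that checks the type returned for a plain integer Series, a Categorical Series, and a nullable-integer Series. It assumes a pandas version that supports the "Int64" nullable dtype; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Plain integer data is backed by a NumPy array.
plain = pd.Series([2, 4, 3, 3]).unique()

# Categorical data is backed by an extension array.
cat = pd.Series(pd.Categorical(list("qppqr"))).unique()

# Nullable integers (the "IntegerNA" case) are also extension-array backed.
nullable = pd.Series([1, 1, 2], dtype="Int64").unique()

print(type(plain).__name__)
print(type(cat).__name__)
print(type(nullable).__name__)
```

Code that needs a NumPy array in every case can wrap the result in np.asarray, at the cost of losing the extension type.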
Examples
import numpy as np
import pandas as pd
pd.Series([2, 4, 3, 3], name='P').unique()
pd.Series([pd.Timestamp('2019-01-01') for _ in range(3)]).unique()
pd.Series([pd.Timestamp('2019-01-01', tz='US/Eastern') for _ in range(3)]).unique()
An unordered Categorical will return categories in the order of appearance.
pd.Series(pd.Categorical(list('qppqr'))).unique()
An ordered Categorical preserves the category ordering.
pd.Series(pd.Categorical(list('qppqr'), categories=list('pqr'), ordered=True)).unique()
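The following sketch inspects the result of the ordered example above. It shows that the values themselves still come back in order of appearance, while the category ordering declared on the Categorical is carried over to the result. The variable names are illustrative.

```python
import pandas as pd

# Ordered Categorical with the explicit category order p < q < r.
ordered = pd.Series(
    pd.Categorical(list("qppqr"), categories=list("pqr"), ordered=True)
)

result = ordered.unique()

# Values appear in first-seen order.
print(list(result))

# The category ordering is preserved on the returned Categorical.
print(list(result.categories))
print(result.ordered)
```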