Indexing a Series instance with s[key]
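When a pd.Series has an integer index, s1[-1] does not return the last row: -1 is looked up as a label, and the lookup raises KeyError unless -1 happens to be one of the labels.
The correct approaches are s1.iloc[-1] or s1.values[-1]. s1[len(s1) - 1] also works, but only when the index is the default 0..n-1 range.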
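A minimal sketch of the alternatives above. The Series named shifted and its labels 10 and 20 are made up for illustration; they only show why s1[len(s1) - 1] is safe with the default index but not in general.

```python
import pandas as pd

s1 = pd.Series([111, 222], index=range(2))

# Position-based access: works regardless of the index labels.
last_by_iloc = s1.iloc[-1]
last_by_values = s1.values[-1]

# Label-based access: works here only because the labels are 0..n-1.
last_by_label = s1[len(s1) - 1]

# With other integer labels the same expression looks up a missing label.
shifted = pd.Series([111, 222], index=[10, 20])
try:
    shifted[len(shifted) - 1]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
```

The position-based forms are the ones that keep working when the index changes.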
Python's magic method __getitem__ gives a class subscript support: instance[key] returns the element associated with key.
The pandas Series class makes very heavy use of __getitem__, and many rules are hidden inside it.
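To make the mechanism concrete, here is a toy class (Inventory is a hypothetical name, unrelated to pandas) whose subscript access is handled by __getitem__:

```python
class Inventory:
    """Toy container: subscripting is delegated to __getitem__."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        # Called by Python whenever `inventory[key]` is evaluated.
        return self._items[key]


inventory = Inventory({"apple": 3, "pear": 5})
apples = inventory["apple"]  # same as inventory.__getitem__("apple")
```

Whatever rules __getitem__ implements become the rules of instance[key], which is why the Series version can behave so differently depending on the key and the index.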
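Let's dig into the source. When the index of a Series instance s1 is integer-typed, what happens if we index it with [-1]?
Following the call chain:
__getitem__() calls ._get_value(-1), which in turn calls .index.get_loc(-1).
The problem appears here: .index._range.index(-1)
The key -1 is simply not in s1's index, because s1's index is range(2), i.e. the labels 0 and 1.
range.index() raises ValueError, get_loc re-raises it as KeyError, and the program fails with: KeyError: -1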
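The last step of that chain can be reproduced without pandas. The function toy_get_loc below is a hypothetical, simplified stand-in for the integer branch of get_loc, not the library code itself:

```python
def toy_get_loc(index_range, key):
    """Simplified stand-in for the integer branch of RangeIndex.get_loc."""
    try:
        # range.index() returns the position of a value or raises ValueError.
        return index_range.index(int(key))
    except ValueError as err:
        raise KeyError(key) from err


labels = range(2)  # the labels of s1: 0 and 1
loc_of_1 = toy_get_loc(labels, 1)

try:
    toy_get_loc(labels, -1)
    error = None
except KeyError as exc:
    error = exc
```

Here -1 is treated as a value to search for in range(2), not as an offset from the end, so the lookup fails.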
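When the index of a pd.Series is string-typed (as in s2), s2[-1] does return the last row: in the __getitem__ source listed below, an integer key on such an index takes the _should_fallback_to_positional() branch and returns self._values[key]. Newer pandas releases deprecate this positional fallback, so it should not be relied on.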
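The difference between s1 and s2 can be modelled with a toy dispatcher. This is only a sketch: toy_getitem is a hypothetical helper, and it assumes the positional fallback applies whenever the labels are not all integers, which is what the behavior of s1 and s2 suggests rather than something read from the pandas internals.

```python
def toy_getitem(values, labels, key):
    """Toy model of the two scalar branches; not the real pandas code."""
    index_is_integer = all(isinstance(label, int) for label in labels)
    if isinstance(key, int) and not index_is_integer:
        # Positional fallback: the key is used as a plain list position.
        return values[key]
    # Label lookup: the key must be one of the labels.
    if key not in labels:
        raise KeyError(key)
    return values[labels.index(key)]


last_of_string_index = toy_getitem([111, 222], ["a", "b"], -1)
by_label = toy_getitem([111, 222], ["a", "b"], "b")

try:
    toy_getitem([111, 222], [0, 1], -1)
    integer_index_failed = False
except KeyError:
    integer_index_failed = True
```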
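Conclusion: series[key] is powerful, but its behavior depends on the type of the index, so it is easy to fall into this trap. Using .iloc[] for positions (or .loc[] for labels) is more explicit.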
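As an illustration of the explicit accessors, assuming two small Series invented for this example (s with integer labels 10, 20, 30 and t with string labels):

```python
import pandas as pd

s = pd.Series([111, 222, 333], index=[10, 20, 30])

by_label = s.loc[10]      # label 10 -> first row
by_position = s.iloc[0]   # position 0 -> first row
last_row = s.iloc[-1]     # negative positions count from the end

t = pd.Series([111, 222], index=list("ab"))
t_by_label = t.loc["b"]   # label "b" -> second row
t_last = t.iloc[-1]       # last row, independent of the index type
```

With .loc and .iloc the reader never has to guess whether an integer means a label or a position.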
Signature: s1.__getitem__(key)
Source:
def __getitem__(self, key):
    key = com.apply_if_callable(key, self)
    if key is Ellipsis:
        return self
    key_is_scalar = is_scalar(key)
    if isinstance(key, (list, tuple)):
        key = unpack_1tuple(key)
    if is_integer(key) and self.index._should_fallback_to_positional():
        return self._values[key]
    elif key_is_scalar:
        return self._get_value(key)
    if is_hashable(key):
        # Otherwise index.get_value will raise InvalidIndexError
        try:
            # For labels that don't resolve as scalars like tuples and frozensets
            result = self._get_value(key)
            return result
        except KeyError:
            if isinstance(key, tuple) and isinstance(self.index, MultiIndex):
                # We still have the corner case where a tuple is a key
                # in the first level of our MultiIndex
                return self._get_values_tuple(key)
    if is_iterator(key):
        key = list(key)
    if com.is_bool_indexer(key):
        key = check_bool_indexer(self.index, key)
        key = np.asarray(key, dtype=bool)
        return self._get_values(key)
    return self._get_with(key)
File: d:\anaconda3\lib\site-packages\pandas\core\series.py
Type: method
Signature: s1._get_value(label, takeable:bool=False)
Source:
def _get_value(self, label, takeable: bool = False):
    """
    Quickly retrieve single value at passed index label.

    Parameters
    ----------
    label : object
    takeable : interpret the index as indexers, default False

    Returns
    -------
    scalar value
    """
    if takeable:
        return self._values[label]
    # Similar to Index.get_value, but we do not fall back to positional
    loc = self.index.get_loc(label)
    return self.index._get_values_for_loc(self, loc, label)
File: d:\anaconda3\lib\site-packages\pandas\core\series.py
Type: method
s1.index.get_loc??
Signature: s1.index.get_loc(key, method=None, tolerance=None)
Source:
@doc(Int64Index.get_loc)
def get_loc(self, key, method=None, tolerance=None):
    if method is None and tolerance is None:
        if is_integer(key) or (is_float(key) and key.is_integer()):
            new_key = int(key)
            try:
                return self._range.index(new_key)
            except ValueError as err:
                raise KeyError(key) from err
        raise KeyError(key)
    return super().get_loc(key, method=method, tolerance=tolerance)
File: d:\anaconda3\lib\site-packages\pandas\core\indexes\range.py
Type: method
import pandas as pd
s1 = pd.Series([111, 222], index=range(2))
s2 = pd.Series([111, 222], index=list('ab'))
s1
Out[266]:
0    111
1    222
dtype: int64
s2
Out[267]:
a    111
b    222
dtype: int64
s2[-1]
Out[268]: 222
s1[-1]
Traceback (most recent call last):
  File "<ipython-input-269-0123e3764900>", line 1, in <module>
    s1[-1]
  File "D:\Anaconda3\lib\site-packages\pandas\core\series.py", line 882, in __getitem__
    return self._get_value(key)
  File "D:\Anaconda3\lib\site-packages\pandas\core\series.py", line 989, in _get_value
    loc = self.index.get_loc(label)
  File "D:\Anaconda3\lib\site-packages\pandas\core\indexes\range.py", line 357, in get_loc
    raise KeyError(key) from err
KeyError: -1
Indexing a pd.DataFrame instance with df[key]
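A df is a 2D data structure, and df[key] accepts two kinds of keys: either a set of column names, or a set of row labels given as a sliceable object.
Its lookup rules are even more hidden and complex. In short, it provides slicing along either the row axis or the column axis.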
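A short sketch of the two kinds of keys, using a small DataFrame invented for illustration (columns a and b, row labels x, y, z):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=list("xyz"))

col = df["a"]                 # single column name -> Series
cols = df[["a", "b"]]         # list of column names -> DataFrame
rows = df["x":"y"]            # label slice -> rows x..y, end label included
mask_rows = df[df["a"] > 1]   # boolean Series -> rows where the mask is True
```

A scalar or a list is interpreted along the column axis, while a slice or a boolean mask is interpreted along the row axis. For anything less obvious, df.loc[] and df.iloc[] state the intended axis explicitly.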