The API to retrieve Series elements presents some inconsistencies #12890

Hello, I personally feel that selecting elements from a Series is a bit of a mess :-)

The general idea is that .iloc and .loc behave consistently, selecting by position and by index label respectively, but are a bit slower than .ix and plain [], whose behaviour is not always consistent.
But I found all of these methods somewhat inconsistent, including in what they return when a label is not found or a requested position is out of range in the Series being looked up.

I compiled the following tables, which summarise the behaviour of these 4 lookup methods depending on (a) whether the Series has an integer or a string index (I do not consider date indexes for the moment), (b) whether the requested data is a single element, a slice or a list (yes, the behaviour changes!) and (c) whether the requested key is found in the data. In the tables, LAB means the key was treated as a label and POS means it was treated as a position.

The following tables were produced with pandas 0.17.1, NumPy 1.10.4, Python 3.4.3.

### Case 1: Series with Integer index

```
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100,105), index=np.arange(10,15))
s
10    100
11    101
12    102
13    103
14    104
```

```
** Single element **             ** Slice **                                       ** List **
s[0]       -> LAB -> KeyError    s[0:2]        -> POS -> {10:100, 11:101}          s[[1,3]]        -> LAB -> {1:NaN, 3:NaN}
s[13]      -> LAB -> 103         s[10:12]      -> POS -> empty Series              s[[12,14]]      -> LAB -> {12:102, 14:104}
---                              ---                                               ---
s.ix[0]    -> LAB -> KeyError    s.ix[0:2]     -> LAB -> empty Series              s.ix[[1,3]]     -> LAB -> {1:NaN, 3:NaN}
s.ix[13]   -> LAB -> 103         s.ix[10:12]   -> LAB -> {10:100, 11:101, 12:102}  s.ix[[12,14]]   -> LAB -> {12:102, 14:104}
---                              ---                                               ---
s.iloc[0]  -> POS -> 100         s.iloc[0:2]   -> POS -> {10:100, 11:101}          s.iloc[[1,3]]   -> POS -> {11:101, 13:103}
s.iloc[13] -> POS -> IndexError  s.iloc[10:12] -> POS -> empty Series              s.iloc[[12,14]] -> POS -> IndexError
---                              ---                                               ---
s.loc[0]   -> LAB -> KeyError    s.loc[0:2]    -> LAB -> empty Series              s.loc[[1,3]]    -> LAB -> KeyError
s.loc[13]  -> LAB -> 103         s.loc[10:12]  -> LAB -> {10:100, 11:101, 12:102}  s.loc[[12,14]]  -> LAB -> {12:102, 14:104}
```
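
For anyone who wants to re-check part of the table above, here is a minimal sketch limited to the `.iloc` and `.loc` rows. I left `.ix` and plain `[]` out on purpose, since their behaviour is exactly what is under discussion and may differ between pandas versions. The `outcome` helper is only an illustrative name, not a pandas function.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100, 105), index=np.arange(10, 15))


def outcome(lookup):
    """Run a lookup; return its result, or the name of the exception raised."""
    try:
        return lookup()
    except (KeyError, IndexError, TypeError, ValueError) as exc:
        return type(exc).__name__


# Position-based: 0 is the first row, 13 is past the end of a 5-element Series.
print(outcome(lambda: s.iloc[0]))
print(outcome(lambda: s.iloc[13]))

# Label-based: 13 is an existing label, 0 is not.
print(outcome(lambda: s.loc[13]))
print(outcome(lambda: s.loc[0]))

# Slices: .iloc stops before position 2, .loc includes label 12.
print(list(s.iloc[0:2].index))
print(list(s.loc[10:12].index))
```

Wrapping each lookup in a lambda keeps a failing row from aborting the rows after it, so the whole column can be compared in one run.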

### Case 2: Series with string index

```
s = pd.Series(np.arange(100,105), index=['a','b','c','d','e'])
s
a    100
b    101
c    102
d    103
e    104
```

```
** Single element **                             ** Slice **                                           ** List **
s[0]        -> POS -> 100                        s[0:2]          -> POS -> {'a':100,'b':101}           s[[0,2]]          -> POS -> {'a':100,'c':102} 
s[10]       -> LAB, POS -> KeyError, IndexError  s[10:12]        -> POS -> Empty Series                s[[10,12]]        -> POS -> IndexError 
s['a']      -> LAB -> 100                        s['a':'c']      -> LAB -> {'a':100,'b':101, 'c':102}  s[['a','c']]      -> LAB -> {'a':100,'c':102}
s['g']      -> POS,LAB -> TypeError, KeyError    s['f':'h']      -> LAB -> Empty Series                s[['f','h']]      -> LAB -> {'f':NaN, 'h':NaN}
---                                              ---                                                   ---
s.ix[0]     -> POS -> 100                        s.ix[0:2]       -> POS -> {'a':100,'b':101}           s.ix[[0,2]]       -> POS -> {'a':100,'c':102} 
s.ix[10]    -> POS -> IndexError                 s.ix[10:12]     -> POS -> Empty Series                s.ix[[10,12]]     -> POS -> IndexError 
s.ix['a']   -> LAB -> 100                        s.ix['a':'c']   -> LAB -> {'a':100,'b':101, 'c':102}  s.ix[['a','c']]   -> LAB -> {'a':100,'c':102}
s.ix['g']   -> POS, LAB -> TypeError, KeyError   s.ix['f':'h']   -> LAB -> Empty Series                s.ix[['f','h']]   -> LAB -> {'f':NaN, 'h':NaN}
---                                              ---                                                   ---
s.iloc[0]   -> POS -> 100                        s.iloc[0:2]     -> POS -> {'a':100,'b':101}           s.iloc[[0,2]]     -> POS -> {'a':100,'c':102} 
s.iloc[10]  -> POS -> IndexError                 s.iloc[10:12]   -> POS -> Empty Series                s.iloc[[10,12]]   -> POS -> IndexError 
s.iloc['a'] -> LAB -> TypeError                  s.iloc['a':'c'] -> POS -> ValueError                  s.iloc[['a','c']] -> POS -> TypeError    
s.iloc['g'] -> LAB -> TypeError                  s.iloc['f':'h'] -> POS -> ValueError                  s.iloc[['f','h']] -> POS -> TypeError
---                                              ---                                                   ---
s.loc[0]    -> LAB -> KeyError                   s.loc[0:2]     -> LAB -> TypeError                   s.loc[[0,2]]     -> LAB -> KeyError 
s.loc[10]   -> LAB -> KeyError                   s.loc[10:12]   -> LAB -> TypeError                   s.loc[[10,12]]   -> LAB -> KeyError 
s.loc['a']  -> LAB -> 100                        s.loc['a':'c'] -> LAB -> {'a':100,'b':101, 'c':102}  s.loc[['a','c']] -> LAB -> {'a':100,'c':102}
s.loc['g']  -> LAB -> KeyError                   s.loc['f':'h'] -> LAB -> Empty Series                s.loc[['f','h']] -> LAB -> KeyError
```
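
Two of the rows above that surprised me most can be reproduced with `.iloc` and `.loc` alone. This is only a small sketch; it assumes a monotonic, unique string index like the one in this example, and I have not checked how a non-sorted index changes the slice case.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100, 105), index=list("abcde"))

# Slicing by position stops before the end position ...
by_position = s.iloc[0:2]
# ... while slicing by label includes the end label.
by_label = s.loc["a":"c"]
print(list(by_position.index), list(by_label.index))

# A label slice that matches nothing yields an empty Series,
# whereas a list of missing labels raises KeyError.
empty = s.loc["f":"h"]
print(len(empty))

try:
    s.loc[["f", "h"]]
    list_lookup = "no error"
except KeyError:
    list_lookup = "KeyError"
print(list_lookup)
```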

As you can see, there are several inconsistencies, some of them even within .iloc and .loc.

1) A failed lookup (label not found or position out of range) is handled in three different ways: an exception is raised, an empty Series is returned, or a Series with the requested keys mapped to NaN is returned. For example, s.loc['f':'h'] returns an empty Series, whereas s.loc[['f','h']] raises a KeyError. There should be a single way to handle missing elements, and possibly an optional parameter to say what to do when missing elements are encountered.

2) With slices, the end element is excluded when the lookup is by position, but included when the lookup is by label!

3) .ix is redundant. There should be .iloc[] and .loc[] for a guaranteed lookup by position and by label respectively, plus one faster way with more complicated (but still well documented) logic for when performance is a priority. s[] is just quicker to type than s.ix[], so for me the latter is redundant.
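
To make point 1 a bit more concrete, here is a rough sketch of what "one lookup with an optional parameter for missing elements" could look like. `select_labels` and its `missing` argument are hypothetical names that I made up for illustration; nothing like this exists in pandas as far as I know, and the sketch assumes a unique index.

```python
import numpy as np
import pandas as pd


def select_labels(series, labels, missing="raise"):
    """Label lookup with an explicit policy for labels that are absent.

    missing="raise": KeyError if any label is absent
    missing="nan":   keep absent labels, filled with NaN
    missing="drop":  silently skip absent labels
    """
    labels = list(labels)
    absent = [label for label in labels if label not in series.index]
    if missing == "raise":
        if absent:
            raise KeyError("labels not found: {}".format(absent))
        return series.loc[labels]
    if missing == "nan":
        # reindex keeps the requested order and fills absent labels with NaN
        return series.reindex(labels)
    if missing == "drop":
        return series.loc[[label for label in labels if label in series.index]]
    raise ValueError("unknown missing policy: {!r}".format(missing))


s = pd.Series(np.arange(100, 105), index=list("abcde"))
print(select_labels(s, ["a", "f"], missing="nan"))
print(select_labels(s, ["a", "f"], missing="drop"))
```

The point is only that the caller, not the indexer type, decides what happens to missing keys; the same idea could be applied to slices and positions.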

