KeyError when using pivot_table with an aggfunc and an empty column #9186

Closed
@brandonkane


This is a bit of an edge case: in pandas 0.15.2, pivoting on an empty column (one that contains only None) should return an empty DataFrame. That is what happens when the default aggregation function is used, but if you specify an aggfunc argument, it raises a KeyError instead.

Setup dataframe with empty column:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [2, 2, 3, 3, 2], 'id': [5, 6, 7, 8, 9], 'C': ['p', 'q', 'q', 'p', 'q'], 'D': [None, None, None, None, None]})
>>> df
   A  C     D  id
0  2  p  None   5
1  2  q  None   6
2  3  q  None   7
3  3  p  None   8
4  2  q  None   9

Expected behaviour when pivoting on the empty column:

>>> df.pivot_table(index='A', columns='D', values='id')
Empty DataFrame
Columns: []
Index: []

But if you specify an aggfunc it blows up:

>>> df.pivot_table(index='A', columns='D', values='id', aggfunc=np.size)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/util/decorators.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/util/decorators.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/tools/pivot.py", line 151, in pivot_table
    table = table[values[0]]
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/core/frame.py", line 1780, in __getitem__
    return self._getitem_column(key)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/core/frame.py", line 1787, in _getitem_column
    return self._get_item_cache(key)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/core/generic.py", line 1068, in _get_item_cache
    values = self._data.get(item)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/core/internals.py", line 2849, in get
    loc = self.items.get_loc(item)
  File "/data/virtualenv/default/lib/python2.7/site-packages/pandas/core/index.py", line 1402, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3812)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3692)
  File "pandas/hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12299)
  File "pandas/hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12250)
KeyError: 'id'
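
The traceback points at `table = table[values[0]]` in pivot.py, which suggests the aggregated table has no 'id' column to select when the pivot column is entirely null. As a possible workaround on affected versions, the all-null case can be checked before calling pivot_table. The following is a minimal sketch, not a fix to pandas itself; `pivot_or_empty` is a hypothetical helper name and not part of the pandas API.

```python
import numpy as np
import pandas as pd


def pivot_or_empty(df, index, columns, values, aggfunc=np.size):
    """Hypothetical helper: return an empty DataFrame when the pivot
    column holds only nulls, otherwise defer to DataFrame.pivot_table."""
    # An all-null pivot column leaves nothing to spread into columns,
    # so short-circuit instead of letting pivot_table raise KeyError.
    if df[columns].isnull().all():
        return pd.DataFrame()
    return df.pivot_table(index=index, columns=columns,
                          values=values, aggfunc=aggfunc)


# Same data as in the report above.
df = pd.DataFrame({'A': [2, 2, 3, 3, 2],
                   'id': [5, 6, 7, 8, 9],
                   'C': ['p', 'q', 'q', 'p', 'q'],
                   'D': [None, None, None, None, None]})

# Pivoting on the empty column takes the short-circuit path.
empty_result = pivot_or_empty(df, index='A', columns='D', values='id')

# Pivoting on a populated column goes through pivot_table as usual.
counts = pivot_or_empty(df, index='A', columns='C', values='id')
```

The guard only covers a single pivot column passed by name; a list of columns would need a different check.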
