Indexing with namedtuple is broken #1026

Closed
@echlebek

A MultiIndexed DataFrame can be indexed when it has several index levels, one or more of which hold a compound type. However, a DataFrame with a regular Index whose labels are a compound type cannot be indexed by those labels, and neither can a MultiIndexed DataFrame with a single level of compound type.

tl;dr - I can create a DataFrame whose index labels are namedtuples, but I can't look up rows by those labels.

In the first example, I use a namedtuple to index a DataFrame that has a regular Index, which fails.

In the second example, I use a tuple of namedtuples to index a DataFrame that has a MultiIndex, which succeeds.

In the third example, I use a namedtuple, and then a length-1 tuple holding a namedtuple, to index a DataFrame built from a single-level MultiIndex. The first lookup raises a KeyError and the second returns a frame of NaNs.

from collections import namedtuple
import pandas

# First example
""" 
>>> IndexType = namedtuple("IndexType", ["a", "b"])
>>> idx1 = IndexType("foo", "bar")
>>> idx2 = IndexType("baz", "bof")
>>> index = pandas.Index([idx1, idx2], name="composite_index")
>>> index
Index([IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof')], dtype=object)
>>> df = pandas.DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
>>> df
                             A  B
composite_index..................
IndexType(a='foo', b='bar')  1  2
IndexType(a='baz', b='bof')  3  4
>>> df.ix[IndexType("foo", "bar")]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._getitem_tuple(key)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._getitem_lowerdim(tup)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    section = self._getitem_axis(key, axis=i)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._get_label(idx, axis=0)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self.obj.xs(label, axis=axis, copy=True)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    loc = self.index.get_loc(key)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._engine.get_loc(key)
  File "engines.pyx", line 101, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2498)
  File "engines.pyx", line 108, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2460)
KeyError: 'foo'
""" 

# Second example

""" 
>>> mult_index = pandas.MultiIndex.from_tuples([(idx1, idx2)], names=["comp_1", "comp_2"])
>>> mult_index
MultiIndex([(IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof'))], dtype=object)
>>> df = pandas.DataFrame([(1, 2, 3, 4)], index=mult_index, columns=["A", "B", "C", "D"])
>>> df
                                                         A  B  C  D
comp_1                      comp_2.................................
IndexType(a='foo', b='bar') IndexType(a='baz', b='bof')  1  2  3  4
>>> df.ix[(IndexType("foo", "bar"), IndexType("baz", "bof"))]
A    1   
B    2   
C    3   
D    4   
Name: (IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof'))
""" 

# Third example

""" 
>>> index = pandas.MultiIndex.from_tuples([(IndexType("foo", "bar"),), (IndexType("baz", "bof"),)], names=["ind#
>>> index
Index([IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof')], dtype=object)
>>> df = pandas.DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
>>> df
                             A  B
index............................
IndexType(a='foo', b='bar')  1  2
IndexType(a='baz', b='bof')  3  4
>>> df.ix[IndexType("foo", "bar")]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._getitem_tuple(key)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._getitem_lowerdim(tup)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    section = self._getitem_axis(key, axis=i)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._get_label(idx, axis=0)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self.obj.xs(label, axis=axis, copy=True)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    loc = self.index.get_loc(key)
  File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
    return self._engine.get_loc(key)
  File "engines.pyx", line 101, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2498)
  File "engines.pyx", line 108, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2460)
KeyError: 'foo'
>>> df.ix[(IndexType("foo", "bar"),)]
      A   B
foo NaN NaN
bar NaN NaN
"""
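
A possible explanation, offered as a guess rather than a confirmed diagnosis: both tracebacks pass through `_getitem_tuple`, and the `KeyError` names `'foo'`, the first field of the namedtuple, rather than the namedtuple itself. That would be consistent with the key being unpacked as an ordinary tuple, since a namedtuple instance is a tuple subclass. The sketch below uses only the standard library. `split_key` and `split_key_namedtuple_aware` are hypothetical helpers written for illustration and are not pandas functions.

```python
from collections import namedtuple

IndexType = namedtuple("IndexType", ["a", "b"])
key = IndexType("foo", "bar")

# A namedtuple instance passes any isinstance(..., tuple) check.
is_tuple_like = isinstance(key, tuple)


def split_key(key):
    """Hypothetical dispatcher: treat every tuple as one label per axis."""
    if isinstance(key, tuple):
        return list(key)
    return [key]


def split_key_namedtuple_aware(key):
    """Hypothetical variant: keep namedtuples whole by checking for _fields."""
    if isinstance(key, tuple) and not hasattr(key, "_fields"):
        return list(key)
    return [key]


# The naive version splits the namedtuple into its fields,
# so 'foo' ends up being looked up as a row label on its own.
naive = split_key(key)

# The aware version passes the namedtuple through as a single label.
aware = split_key_namedtuple_aware(key)
```

Checking for `_fields` is only one way to tell a namedtuple from a plain tuple. Whether a check like this belongs in the indexing code is a design question for the maintainers.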
