BUG: multiindex slicers edge cases #8132

Closed
@jreback

From a Stack Overflow question:

import pandas as pd

df = pd.DataFrame({'A': ['A0'] * 5 + ['A1']*5 + ['A2']*5,
            'B': ['B0','B0','B1','B1','B2'] * 3,
            'DATE': ["2013-06-11",
                    "2013-07-02",
                    "2013-07-09",
                    "2013-07-30",
                    "2013-08-06",
                    "2013-06-11",
                    "2013-07-02",
                    "2013-07-09",
                    "2013-07-30",
                    "2013-08-06",
                    "2013-09-03",
                    "2013-10-01",
                    "2013-07-09",
                    "2013-08-06",
                    "2013-09-03"],
             'VALUES': [22, 35, 14, 9, 4, 40, 18, 4, 2, 5, 1, 2, 3, 4, 2]})

df['DATE'] = pd.to_datetime(df['DATE'])

df1 = df.set_index(['A', 'B', 'DATE'])
df1 = df1.sort_index()

df2 = df.set_index('DATE')

# A1 - Works - Get all values under "A0" and "A1"
df1.loc[(slice('A1')),:]

# A2 - Works - Get all values from the start to "A2"
df1.loc[(slice('A2')),:]

# A3 - Works - Get all values under "B1" or "B2"
df1.loc[(slice(None),slice('B1','B2')),:]

# A4 - Works - Get all values between 2013-07-02 and 2013-07-09
df1.loc[(slice(None),slice(None),slice('20130702','20130709')),:]

##############################################
# These do not work and I'm wondering why... #
##############################################

# B1 - Does not work - Get all values in B0 that are also under A0, A1 and A2
df1.loc[(slice('A2'),slice('B0')),:]

# B2 - Does not work - Get all values in B0, B1 and B2 (similar to what A2 is doing for the As)
df1.loc[(slice(None),slice('B2')),:]

# B3 - Does not work - Get all values from B1 to B2 and up to 2013-08-06
df1.loc[(slice(None),slice('B1','B2'),slice('2013-08-06')),:]

# B4 - Does not work - Same as A4 but the start of the date slice is not a key.
#                      Would have thought the behavior would be similar to something like df2['20130701':]
#                      In other words, date indexing allowed starting on non-key points
df1.loc[(slice(None),slice(None),slice('20130701','20130709')),:]
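For reference, the single-level behavior that B4 expects can be sketched as below. This is a minimal illustration and not part of the original report: the Series `s` is a hypothetical stand-in built from the first five dates and values above, and the example assumes a sorted DatetimeIndex.

```python
import pandas as pd

# Hypothetical stand-in data: the first five DATE/VALUES pairs from the report.
dates = pd.to_datetime(["2013-06-11", "2013-07-02", "2013-07-09",
                        "2013-07-30", "2013-08-06"])
s = pd.Series([22, 35, 14, 9, 4], index=dates)

# '2013-07-01' is not a key in the index, yet a sorted DatetimeIndex
# accepts it as a slice bound; both endpoints are treated as inclusive.
subset = s.loc['2013-07-01':'2013-07-09']
print(subset)
```

The slice starts at the first date on or after 2013-07-01, which is the behavior the report would expect from the DATE level of the MultiIndex as well.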

Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex
