Selecting multi rows by string in a DatetimeIndexed DataFrame #16710

Closed
@Yevgnen

Code Sample, a copy-pastable example if possible

    import numpy as np
    import pandas as pd

    rows = 10

    persons = [
        {
            'name': ''.join([
                np.random.choice([chr(x) for x in range(97, 97 + 26)])
                for i in range(5)
            ]),
            'sex': np.random.choice(['male', 'female']),
            'age': np.random.randint(10, 100),
            'job': np.random.choice(['staff', 'cook', 'student']),
            'birthday': np.random.choice(pd.date_range('1990-01-01', '2010-01-01')),
            'hobby': np.random.choice(['cs', 'war3', 'dota'])
        }
        for i in range(rows)
    ]

    df = pd.DataFrame(persons)
    df_indexed = df.set_index('birthday')
    df_indexed.index = pd.date_range('2010-01-01', '2010-01-10')
    # Works for a single row given as a string
    df_indexed.loc['2010-01-01', 'age': 'sex']

    # Works for a string slice
    df_indexed.loc['2010-01-01': '2010-01-05', 'age': 'sex']

    # Works for an explicit slice object
    df_indexed.loc[slice('2010-01-01', '2010-01-05'), 'age': 'sex']

    # ERROR: fails for a list of date strings
    df_indexed.loc[['2010-01-01', '2010-01-05'], ['age', 'sex']]

    # Works if the strings are converted to datetime first
    df_indexed.loc[[pd.to_datetime('2010-01-01'), pd.to_datetime('2010-01-05')], ['age', 'sex']]

Problem description

Strings are not converted to datetime when selecting multiple rows from a DataFrame whose index is a DatetimeIndex. Since the conversion works when selecting a single row or a slice, I would expect it to work for a list of strings as well.
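Until list-of-string selection behaves like the single-row and slice cases, one workaround is to convert the whole list up front with `pd.to_datetime`, which returns a `DatetimeIndex` that `.loc` accepts directly. The sketch below uses a small deterministic frame (the `age` and `sex` values are made up for illustration, not taken from the sample above) so the result is easy to check.

```python
import pandas as pd

# Illustrative frame with a DatetimeIndex; the column values are placeholders.
df = pd.DataFrame(
    {"age": range(20, 30), "sex": ["male", "female"] * 5},
    index=pd.date_range("2010-01-01", "2010-01-10"),
)

# Convert the list of strings once, then use the result as row labels.
labels = pd.to_datetime(["2010-01-01", "2010-01-05"])
selected = df.loc[labels, ["age", "sex"]]
print(selected)
```

This is equivalent to the per-element `pd.to_datetime` call in the code sample, just applied to the list in one step.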

Expected Output

Output of pd.show_versions()

/usr/local/lib/python3.6/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Labels

Datetime, Indexing, Needs Tests, good first issue