BUG: PeriodIndex construction fails (or produces erroneous result) with memoryview as input. #13120

Closed
@ssanderson

Code Sample, a copy-pastable example if possible

First noted in discussion here: https://github.com/quantopian/zipline/pull/1190/files/fe5a2a888a498d838c3cc43de2c33ac08b20d2a7#r62506342. This most often matters when passing a value typed in Cython as something like int64_t[:].

Simple repro cases:

DatetimeIndex construction fails with an empty ValueError (it appears to try parsing the input as strings), and plain Index construction returns an object-dtype Index of empty unicode strings instead of the integer values.

In [6]: import pandas as pd; import numpy as np

In [7]: pd.__version__, np.__version__
Out[7]: ('0.18.1+20.gaf7bdd3', '1.11.0')

In [8]: m = memoryview(np.arange(5))

In [9]: pd.DatetimeIndex(m)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-09b855a663a4> in <module>()
----> 1 pd.DatetimeIndex(m)

/home/ssanderson/clones/pandas/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

/home/ssanderson/clones/pandas/pandas/tseries/index.pyc in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
    283                 data = tslib.parse_str_array_to_datetime(data, freq=freq,
    284                                                          dayfirst=dayfirst,
--> 285                                                          yearfirst=yearfirst)
    286             else:
    287                 data = tools.to_datetime(data, errors='raise')

/home/ssanderson/clones/pandas/pandas/tslib.pyx in pandas.tslib.parse_str_array_to_datetime (pandas/tslib.c:39610)()
   2175                         except ValueError:
   2176                             if is_coerce:
-> 2177                                 iresult[i] = NPY_NAT
   2178                                 continue
   2179                             raise

ValueError:

In [10]: pd.Index(m)
Out[10]: Index([u'', u'', u'', u'', u''], dtype='object')

Expected Output

I'd expect this to either error immediately or provide the same result as coercing the provided memoryview into a numpy array.
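
For reference, the second option (coercing the memoryview into a numpy array first) can be sketched as below. This is a caller-side workaround under my own assumptions, not a description of how pandas should fix it internally; `as_ndarray` is a hypothetical helper name, and the explicit `dtype="int64"` is only there to mirror the `int64_t[:]` case.

```python
import numpy as np
import pandas as pd


def as_ndarray(values):
    """Hypothetical helper: turn a memoryview into an ndarray, pass others through."""
    if isinstance(values, memoryview):
        # np.asarray reads the memoryview through the buffer protocol,
        # so the element type and shape of the underlying data are kept.
        return np.asarray(values)
    return values


# Same input as the repro, with the dtype pinned to int64.
m = memoryview(np.arange(5, dtype="int64"))

arr = as_ndarray(m)
idx = pd.Index(arr)          # integer index built from the array
dti = pd.DatetimeIndex(arr)  # integers interpreted as nanoseconds since the epoch
```

With the array in hand, both constructors see an ordinary int64 ndarray rather than an opaque buffer object.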

output of pd.show_versions()

In [6]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: af7bdd3883c8d61e9d9388d3aa699930eee7fff8
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 4.2.0-16-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1+20.gaf7bdd3
nose: None
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.24
numpy: 1.11.0
scipy: None
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

cc @jbredeche

Labels: Constructors, Needs Tests, good first issue