Description
Code Sample, a copy-pastable example if possible
Sadly, pseudocode because the underlying container is proprietary, but this is a generic problem. If I get a chance this weekend I will write up a mock open-source-compliant example for illustration and testing.
import pandas as pd
from pandas.api.extensions import ExtensionArray

class MyArray(ExtensionArray):
    def __init__(self, values, **kwargs):
        # All the things
        self.values = MyNotNumpyContainer(values)  # proprietary container, not shown
    # All the other methods

df = pd.DataFrame({'mycolumn': MyArray(values)})
# Raises AttributeError: 'MyNotNumpyContainer' object has no attribute 'dtype'
Problem description
There is a check at the top of _isna_ndarraylike():
def _isna_ndarraylike(obj):
    values = getattr(obj, 'values', obj)
    dtype = values.dtype
    if is_extension_array_dtype(obj):
        if isinstance(obj, (ABCIndexClass, ABCSeries)):
            values = obj._values
        else:
            values = obj
        result = values.isna()
This fails for ExtensionArray objects which define a values attribute, but whose values attribute does not have a dtype attribute.
I noticed this call in pd.concat above, but no doubt it occurs elsewhere.
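The failure mode can be reproduced without pandas or the proprietary container. The sketch below is only a mock: OpaqueContainer, MockExtensionArray and isna_values_first are hypothetical names invented for illustration, and isna_values_first merely copies the order of operations in the snippet quoted above (look up values, read its dtype, and only then consider the array type).

```python
class OpaqueContainer:
    """Hypothetical stand-in for a non-NumPy container: it has no dtype."""

    def __init__(self, items):
        self.items = list(items)


class MockExtensionArray:
    """Hypothetical, heavily simplified stand-in for an ExtensionArray."""

    dtype = "mock"  # the array itself carries a dtype

    def __init__(self, values):
        # Same attribute name as in the report: values is not array-like.
        self.values = OpaqueContainer(values)

    def isna(self):
        return [item is None for item in self.values.items]


def isna_values_first(obj):
    """Mimic the quoted snippet: touch values.dtype before anything else."""
    values = getattr(obj, "values", obj)
    dtype = values.dtype  # fails here when values has no dtype
    return obj.isna()


arr = MockExtensionArray([1, None, 3])
try:
    isna_values_first(arr)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The array's own isna() would have worked; the error comes purely from reading dtype off the inner container first.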
Expected Output
Since dtype is a required attribute of an ExtensionArray, _isna_ndarraylike() should at least get the dtype from the ExtensionArray itself rather than from its values attribute. Really, a check for whether we are dealing with an ExtensionArray should occur upstream somewhere, since there is no guarantee that an ExtensionArray is backed by a NumPy array-like object.
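A minimal sketch of the ordering suggested here, again as a pandas-free mock rather than a proposed patch. All names (OpaqueContainer, MockExtensionArray, isna_extension_first) are hypothetical; the only point is that the extension-array check runs before anything reads values.dtype, and that the dtype is taken from the array object.

```python
class OpaqueContainer:
    """Hypothetical stand-in for a non-NumPy container: it has no dtype."""

    def __init__(self, items):
        self.items = list(items)


class MockExtensionArray:
    """Hypothetical, heavily simplified stand-in for an ExtensionArray."""

    dtype = "mock"

    def __init__(self, values):
        self.values = OpaqueContainer(values)

    def isna(self):
        return [item is None for item in self.values.items]


def isna_extension_first(obj):
    """Dispatch on the array type before looking at obj.values."""
    if isinstance(obj, MockExtensionArray):
        dtype = obj.dtype  # taken from the array, not from its container
        return obj.isna()
    # Fallback for plain sequences in this mock; real code would handle
    # ndarray-likes here and could safely read values.dtype.
    values = getattr(obj, "values", obj)
    return [item is None for item in values]


mask = isna_extension_first(MockExtensionArray([1, None, 3]))
plain_mask = isna_extension_first([None, 2])
print(mask)        # [False, True, False]
print(plain_mask)  # [True, False]
```

With this ordering the inner container is never asked for a dtype, so it no longer matters what backs the array.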
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-1048-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.23.0.dev0+762.gbb095a6
pytest: 3.2.2
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.28.2
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: 1.5.4
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None