Error when adding 'm' to a Series of strings #28658

Closed
@capelastegui

Code Sample

I have found a strange corner case when adding strings to a Series. This is an operation that works well most of the time:

import pandas as pd

# Adding a string to a Series, works as expected
pd.Series(['hello','world'])+'!'

output:

0    hello!
1    world!
dtype: object

But in the specific case of adding the string 'm', it raises an exception:

# Adding the string 'm' to a Series
pd.Series(['hello','world'])+'m'

output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-125-795dd84b51d1> in <module>
----> 1 pd.Series(['hello','world'])+'m'

~/PycharmProjects/nsa.notebooks-voice/virtual3/lib/python3.6/site-packages/pandas-0.25.1-py3.6-macosx-10.12-x86_64.egg/pandas/core/ops/__init__.py in wrapper(left, right)
   1014             if is_scalar(right):
   1015                 # broadcast and wrap in a TimedeltaIndex
-> 1016                 assert np.isnat(right)
   1017                 right = np.broadcast_to(right, left.shape)
   1018                 right = pd.TimedeltaIndex(right)

TypeError: ufunc 'isnat' is only defined for datetime and timedelta.

Problem description

It took me a while to figure this one out, but looking at the source code of _arith_method_SERIES() in pandas/core/ops/__init__.py, it's apparently due to is_timedelta64_dtype() being run on the scalar as part of the type checks. The string 'm' is incorrectly interpreted as a timedelta64 dtype.

pd.core.ops.is_timedelta64_dtype('m')
>True

This can be traced to a pandas_dtype() function:

pd.core.dtypes.common.pandas_dtype('m')
>dtype('<m8')
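
The same result can be reproduced with NumPy alone, which suggests the behaviour comes from NumPy's single-character dtype codes rather than from pandas-specific parsing. The sketch below assumes that pandas_dtype() ends up deferring to np.dtype() for plain strings; the output above is consistent with that, but it is an assumption and not something confirmed from the pandas source here.

```python
import numpy as np

# 'm' and 'M' are NumPy's one-letter type codes for timedelta64 and datetime64.
td = np.dtype('m')
dt = np.dtype('M')

# The .kind attribute is what identifies the dtype family.
print(td, td.kind)
print(dt, dt.kind)

# The short code and the full name resolve to the same generic dtype.
print(td == np.dtype('timedelta64'))
```

If that assumption holds, any dtype check that accepts an arbitrary scalar will treat the one-letter string 'm' as a dtype specification and not as data.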

Presumably there are other random strings with similar interactions with type checking functions that throw similar errors in this operation.
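
One way to look for other affected strings is to ask NumPy which single letters it parses as a datetime-like dtype. This is only a rough sketch: the helper name datetimelike_codes is hypothetical, it checks np.dtype() and not the pandas type-checking functions themselves, and which letters are accepted may vary between NumPy versions.

```python
import string
import warnings

import numpy as np


def datetimelike_codes(candidates):
    """Return {string: dtype} for candidates NumPy parses as datetime64/timedelta64."""
    hits = {}
    for text in candidates:
        try:
            with warnings.catch_warnings():
                # Some legacy type codes emit deprecation warnings; ignore them here.
                warnings.simplefilter("ignore")
                parsed = np.dtype(text)
        except (TypeError, ValueError):
            # Not a valid dtype specification, so not a candidate for this bug.
            continue
        if parsed.kind in "mM":
            hits[text] = parsed
    return hits


hits = datetimelike_codes(string.ascii_letters)
print(sorted(hits))
```

The same helper can be given longer candidates, for example 'm8' or 'timedelta64', to see which other strings NumPy resolves to a datetime-like dtype.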

I get this error on pandas 0.25.1. I haven't been able to test this on the latest master code, which introduces a lot of changes to pandas/core/ops/__init__.py, so I don't know if this is still a problem there.

Expected Output

Adding strings to a string Series shouldn't throw this exception.
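
Until the type check is fixed, a possible workaround is to concatenate element-wise without using the Series arithmetic operator. This is a sketch and has not been verified against 0.25.1; it relies only on Series.map, so the scalar should never reach the wrapper shown in the traceback.

```python
import pandas as pd

s = pd.Series(['hello', 'world'])
suffix = 'm'

# Plain Python string concatenation per element; no Series arithmetic op is dispatched.
result = s.map(lambda value: value + suffix)
print(result.tolist())
```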

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.6.6.final.0
python-bits : 64
OS : Darwin
OS-release : 18.6.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 0.25.1
numpy : 1.17.2
pytz : 2019.2
dateutil : 2.8.0
pip : 10.0.1
setuptools : 39.0.1
Cython : None
pytest : 5.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.3 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : 1.3.8
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
