column type changes when assigning with loc and values #26779

Closed
@blinkseb

Description

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy

In [34]: df = pd.DataFrame( 
    ...:     data={ 
    ...:         'channel': [1, 2, 3], 
    ...:         'A': ['String 1', numpy.NaN, 'String 2'], 
    ...:         'B': [pd.Timestamp('2019-06-11 11:00:00'), numpy.NaN, pd.Timestamp('2019-06-11 12:00:00')] 
    ...:     } 
    ...: )                                                                                                                                                                                    

In [35]: df                                                                                                                                                                                   
Out[35]: 
   channel         A                   B
0        1  String 1 2019-06-11 11:00:00
1        2       NaN                 NaT
2        3  String 2 2019-06-11 12:00:00

In [36]: df2 = pd.DataFrame( 
    ...:     data={'A': ['String 3'], 'B': [pd.Timestamp('2019-06-11 12:00:00')]} 
    ...: )                                                                                                                                                                                    

In [37]: df2                                                                                                                                                                                  
Out[37]: 
          A                   B
0  String 3 2019-06-11 12:00:00

In [38]: df.loc[df['A'].isna() , ['A', 'B']] = df2.values                                                                                                                                     

In [39]: df                                                                                                                                                                                   
Out[39]: 
   channel         A                    B
0        1  String 1  1560250800000000000
1        2  String 3  2019-06-11 12:00:00
2        3  String 2  1560254400000000000

Problem description

Hello,

Assigning new values to rows using .loc and .values changes the type of the column (note how column B, which was datetime64, is now a mix of integers and datetimes).

Note: the exact same code produces the expected output with pandas 0.22.0 and 0.23.4, but broke in pandas 0.24.0. Also, if I remove the A column, the output is correct (no unwanted datetime-to-integer conversion).

Thanks for your help!
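
One observation that may help narrow this down (a sketch, not a confirmed diagnosis of the regression): because `df2` mixes a string column and a datetime column, `df2.values` has no common NumPy dtype and comes back as an `object` array holding `Timestamp` objects, whereas a datetime-only frame yields a native `datetime64` array. That would be consistent with the result being correct once column A is removed. The names `mixed` and `only_b` are illustrative only.

```python
import pandas as pd

df2 = pd.DataFrame(
    {"A": ["String 3"], "B": [pd.Timestamp("2019-06-11 12:00:00")]}
)

# String + datetime columns share no NumPy dtype, so .values is an object array.
mixed = df2.values
print(mixed.dtype)
print(type(mixed[0, 1]))  # the datetime cell is stored as a pandas Timestamp object

# A datetime-only selection keeps a native datetime64 dtype (dtype kind 'M').
only_b = df2[["B"]].values
print(only_b.dtype.kind)
```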

Expected Output

Out[39]: 
   channel         A                    B
0        1  String 1  2019-06-11 11:00:00
1        2  String 3  2019-06-11 12:00:00
2        3  String 2  2019-06-11 12:00:00
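
A possible workaround, not verified against every affected pandas version, is to assign one column at a time so that each right-hand side carries a single dtype instead of a mixed `object` array. This is a minimal sketch under that assumption; `mask` is an illustrative name, and `pd.NaT` stands in for the missing timestamp.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "channel": [1, 2, 3],
        "A": ["String 1", np.nan, "String 2"],
        "B": [
            pd.Timestamp("2019-06-11 11:00:00"),
            pd.NaT,
            pd.Timestamp("2019-06-11 12:00:00"),
        ],
    }
)
df2 = pd.DataFrame(
    {"A": ["String 3"], "B": [pd.Timestamp("2019-06-11 12:00:00")]}
)

# Rows to fill: those where column A is missing.
mask = df["A"].isna()

# Assign each column separately so the datetime values stay datetime64.
for col in ["A", "B"]:
    df.loc[mask, col] = df2[col].to_numpy()

print(df.dtypes)
print(df)
```

Checking `df.dtypes` after the assignment is a quick way to confirm that column B is still a datetime column.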

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 5.1.7-300.fc30.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.24.2
pytest: None
pip: 19.1.1
setuptools: 41.0.1
Cython: None
numpy: 1.16.4
scipy: None
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels

Indexing (related to indexing on series/frames, not to indexes themselves)
Needs Tests (unit test(s) needed to prevent regressions)
good first issue
