pd.concat with copy=True doesn't copy columns index #29879

Closed
@mwtoews

Code Sample

import pandas as pd
print(pd.__version__)  # '0.25.3'

# simple frame with a name attribute for columns
df = pd.DataFrame({'a': [1.1], 'b': [2.2]})
df.columns.name = 'x'
print(df)
# x    a    b
# 0  1.1  2.2

# tile rows into a new frame; copy=True should make it independent of the first frame
df2 = pd.concat([df] * 2, copy=True)
print(df2)
# x    a    b
# 0  1.1  2.2
# 0  1.1  2.2

# clear the column name for the first frame
df.columns.name = None

# and observe the name for the second
assert df2.columns.name is None
print(df2)
#      a    b
# 0  1.1  2.2
# 0  1.1  2.2

# aha, it's the same object
assert df.columns is df2.columns

Problem description

It is expected that copy=True should make a copy of the columns index. However, the demonstration shows that the output frame holds the same Index object as the source frame. Because the object is shared, a modification such as setting the name property on the source frame also changes the output frame created by pd.concat.

Expected Output

It is expected that df2.columns is a copy of the original df.columns. This would prevent a change to one frame's columns (e.g. setting the name property) from also affecting the other frame.

A workaround is to manually copy this property after pd.concat:

df2 = pd.concat([df] * 2, copy=True)
df2.columns = df2.columns.copy()
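
The workaround can be wrapped in a small helper, sketched below. The name concat_with_own_columns is hypothetical and not part of pandas, and the copy= argument is left at its default rather than passed explicitly as in the report. The asserts hold whether or not pd.concat itself shares the columns index, because the helper always assigns a fresh copy.

```python
import pandas as pd


def concat_with_own_columns(frames):
    """Concatenate frames row-wise and give the result its own columns Index.

    Hypothetical helper (not part of pandas): it applies the workaround
    above so the result does not share its columns Index with an input.
    """
    # copy= is left at its default here; the report passes copy=True explicitly
    result = pd.concat(frames)
    # Index.copy() returns a new Index object that keeps the name
    result.columns = result.columns.copy()
    return result


df = pd.DataFrame({'a': [1.1], 'b': [2.2]})
df.columns.name = 'x'

df2 = concat_with_own_columns([df] * 2)

# clearing the name on the source frame no longer affects the result
df.columns.name = None
assert df2.columns is not df.columns
assert df2.columns.name == 'x'
```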

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 0.25.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 42.0.1.post20191125
Cython : 0.29.14
pytest : 5.3.0
hypothesis : None
sphinx : 2.2.1
blosc : None
feather : None
xlsxwriter : 1.2.6
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.9.0
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.1
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : 1.3.11
tables : 3.6.1
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.6

Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)