
False positive SettingWithCopyWarning when just taking subset of columns #16550

Closed
@jorisvandenbossche

Based on an example from @glemaitre: simply taking a subset of the columns and then working further with the result raises a warning. This is a case where I would have expected pandas to detect that no warning is needed.

After some experimenting, it seems that the warning is only triggered when the frame is first printed:

Simple case raising the false positive warning:

In [47]: df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))

In [48]: df
Out[48]: 
          A         B         C         D         E
0  0.101315 -0.940874  0.848323 -1.114318  0.093271
1  0.085363  0.201148  0.852091 -0.000424 -0.490293
2 -0.227004 -0.882167 -0.153934  0.679528  2.049475
3  0.977241 -0.661771  1.367731 -0.675444  0.544696
4 -1.347269  1.286316 -0.742564  1.247596 -0.100017

In [49]: df = df[['A', 'B', 'C']]

In [50]: df['new'] = [1, 2, 3, 4, 5]
/home/joris/miniconda3/envs/dev/bin/ipython:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  #!/home/joris/miniconda3/envs/dev/bin/python

but when not displaying the frame, it does not give a warning:

In [44]: df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))

In [45]: df = df[['A', 'B', 'C']]

In [46]: df['new'] = [1, 2, 3, 4, 5]
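
To compare the two sessions outside IPython, a rough script like the one below may help. It is only a sketch under an assumption not stated in the report: that displaying the frame matters because IPython's `Out[]` cache keeps the original frame alive, which the script mimics by holding a second reference. The helper name `setitem_warning_names` is hypothetical, and the result depends on the pandas version (versions using copy-on-write do not emit this warning at all), so no expected output is shown.

```python
import warnings

import numpy as np
import pandas as pd


def setitem_warning_names(keep_original_alive):
    """Return the names of warnings emitted when adding a column to a column subset."""
    df = pd.DataFrame(np.random.randn(5, 5), columns=list("ABCDE"))

    # Assumption: a second reference stands in for IPython's Out[] cache,
    # which would keep the original frame alive after it is displayed.
    original = df if keep_original_alive else None

    # Same steps as in the report: take a column subset, then add a column.
    df = df[["A", "B", "C"]]

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        df["new"] = [1, 2, 3, 4, 5]

    del original  # release the extra reference, if any
    return [w.category.__name__ for w in caught]


if __name__ == "__main__":
    for keep in (True, False):
        print("original kept alive:", keep, "->", setitem_warning_names(keep))
```

Matching on the warning's class name rather than importing the warning class keeps the script usable on pandas versions where that class may no longer exist.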

This is on master:

<details>

```
In [51]: pd.show_versions()
/home/joris/miniconda3/envs/dev/lib/python3.5/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
  from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-78-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0.dev+96.gef487d9
pytest: 3.0.3
pip: 9.0.1
setuptools: 34.2.0
Cython: 0.24.1
numpy: 1.11.3
scipy: 0.18.1
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.2
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.0.0
tables: 3.3.0
numexpr: 2.6.2
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.3
lxml: None
bs4: 4.5.3
html5lib: 0.9999999
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.0.7
pandas_gbq: None
pandas_datareader: None
```

</details>

Labels

Bug; Copy / view semantics; Error Reporting (Incorrect or improved errors from pandas); Indexing (Related to indexing on series/frames, not to indexes themselves)
