
BUG: A read-only DataFrame cannot be .diff()'ed #35559

Closed
@dycw

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import numpy as np
import pandas as pd

data = np.ones(2, dtype=int)
data.flags.writeable = False
df = pd.DataFrame(data)
df.diff()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-df60a870b020> in <module>
      2 data.flags.writeable = False
      3 df = pd.DataFrame(data)
----> 4 df.diff()

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/core/frame.py in diff(self, periods, axis)
   7247             return self.T.diff(periods, axis=0).T
   7248 
-> 7249         new_data = self._mgr.diff(n=periods, axis=bm_axis)
   7250         return self._constructor(new_data)
   7251 

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/core/internals/managers.py in diff(self, n, axis)
    546 
    547     def diff(self, n: int, axis: int) -> "BlockManager":
--> 548         return self.apply("diff", n=n, axis=axis)
    549 
    550     def interpolate(self, **kwargs) -> "BlockManager":

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/core/internals/managers.py in apply(self, f, align_keys, **kwargs)
    394                 applied = b.apply(f, **kwargs)
    395             else:
--> 396                 applied = getattr(b, f)(**kwargs)
    397             result_blocks = _extend_blocks(applied, result_blocks)
    398 

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/core/internals/blocks.py in diff(self, n, axis)
   1265     def diff(self, n: int, axis: int = 1) -> List["Block"]:
   1266         """ return block for the diff of the values """
-> 1267         new_values = algos.diff(self.values, n, axis=axis, stacklevel=7)
   1268         return [self.make_block(values=new_values)]
   1269 

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/core/algorithms.py in diff(arr, n, axis, stacklevel)
   1914         # TODO: can diff_2d dtype specialization troubles be fixed by defining
   1915         #  out_arr inside diff_2d?
-> 1916         algos.diff_2d(arr, out_arr, n, axis)
   1917     else:
   1918         # To keep mypy happy, _res_indexer is a list while res_indexer is

pandas/_libs/algos.pyx in pandas._libs.algos.diff_2d()

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/_libs/algos.cpython-38-x86_64-linux-gnu.so in View.MemoryView.memoryview_cwrapper()

~/miniconda3/envs/dts/lib/python3.8/site-packages/pandas/_libs/algos.cpython-38-x86_64-linux-gnu.so in View.MemoryView.memoryview.__cinit__()

ValueError: buffer source array is read-only

Problem description

df.diff() does not look like a data-mutating operation, so I would expect it to work on a DataFrame backed by a read-only array. As far as I can tell, df.iloc[] was assessed the same way in 2015 (#10043).
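A possible workaround, sketched under the assumption that one extra copy of the data is acceptable: build the frame from a writable copy of the array. This sidesteps the read-only buffer rather than fixing diff() itself.

```python
import numpy as np
import pandas as pd

# Same read-only input as in the code sample above.
data = np.ones(2, dtype=int)
data.flags.writeable = False

# ndarray.copy() returns a new array that owns its data and is writeable,
# so the frame built from it is not backed by the read-only buffer.
writable = data.copy()
assert writable.flags.writeable

result = pd.DataFrame(writable).diff()
print(result)
```

The original array stays read-only. Only the copy is handed to pandas.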

Expected Output

        0
0     nan
1 0.00000
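For comparison, the expected values can be derived from the read-only array with NumPy alone. This is only an illustrative sketch of what a first-order diff should return here, not how pandas computes it internally.

```python
import numpy as np

# Same read-only input as in the code sample above.
data = np.ones(2, dtype=int)
data.flags.writeable = False

# np.diff only reads from its input, so it accepts the read-only array.
steps = np.diff(data)

# A first-order diff has no value for the first row; pad it with NaN
# (which forces a float result) to mirror the shape of the expected output.
expected = np.concatenate(([np.nan], steps))
print(expected)
```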

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : d9fff27
python : 3.8.3.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-42-generic
Version : #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_HK.UTF-8
LOCALE : en_HK.UTF-8

pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 49.2.0.post20200714
Cython : None
pytest : 6.0.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.5 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : 4.9.1
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : 1.3.18
tables : None
tabulate : 0.8.3
xarray : None
xlrd : None
xlwt : None
numba : None


Metadata

Labels

Algos: Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff
Regression: Functionality that used to work in a prior pandas version
