BUG: DataFrame[Sparse] quantile fails because SparseArray has no reshape  #24600

Closed
@jbrockmendel

I tried to simplify Block.quantile so that it only has to handle the 2D case, by having Series.quantile dispatch to the DataFrame implementation. That produced failures in test_quantile_sparse in pandas/tests/series/test_quantile.py. The Series calls below work, but the equivalent DataFrame calls raise:

import pandas as pd

ser = pd.Series([0., None, 1., 2.], dtype='Sparse[float]')
df = pd.DataFrame(ser)

>>> ser.quantile(0.5)
1.0
>>> ser.quantile([0.5])
0.5    1.0
dtype: float64
>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1530, in quantile
    result = _nanpercentile(values, qs * 100, axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1511, in quantile
    axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
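
Until the block-level code handles sparse values, one possible workaround is to densify the sparse columns before calling quantile. The following is only a sketch under that assumption, not the fix for this issue: `dense_quantile` is a hypothetical helper name, and the conversion gives up the memory savings of the sparse representation.

```python
import numpy as np
import pandas as pd


def dense_quantile(df, q):
    """Hypothetical helper: densify sparse columns, then call DataFrame.quantile."""
    dense = df.copy()
    for col in dense.columns:
        if isinstance(dense[col].dtype, pd.SparseDtype):
            # np.asarray materializes the sparse values as a regular ndarray
            dense[col] = np.asarray(dense[col])
    return dense.quantile(q)


# Same data as the report, with the missing value written as NaN
ser = pd.Series([0.0, np.nan, 1.0, 2.0], dtype="Sparse[float]")
df = pd.DataFrame(ser)

result = dense_quantile(df, 0.5)
print(result)
```

The result should agree with ser.quantile(0.5) above, since NaN is skipped and the median of 0, 1 and 2 is taken.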

datetime64[ns, tz] breaks in a slightly different way (presumably all ExtensionBlocks will fail):

dti = pd.date_range('2016-01-01', periods=3, tz='US/Pacific')

ser = pd.Series(dti)
df = pd.DataFrame(ser)

>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
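
The message in this second traceback is what np.concatenate raises when it is handed an empty sequence, which suggests that the list built from `[b.values for b in blocks]` was empty by the time it reached `_concat_compat`. A minimal reproduction of just that NumPy behaviour (it does not touch pandas internals, and why the list ends up empty is not shown here):

```python
import numpy as np

# np.concatenate requires at least one input array; an empty list raises
try:
    np.concatenate([])
except ValueError as err:
    message = str(err)

print(message)
```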

xref #24583
