BUG(?): ArrowExtensionArray.median is an "approximate median" #52679

Closed
@lukemanley

ArrowExtensionArray.median is implemented using pyarrow.compute.approximate_median, which can produce surprising results for users expecting a "true" (exact) median:

In [1]: import pandas as pd

In [2]: pd.Series([1, 2], dtype="float64").median()
Out[2]: 1.5

In [3]: pd.Series([1, 2], dtype="float64[pyarrow]").median()
Out[3]: 1.0

It does not look like pyarrow provides a compute function for a "true" median. Is this intentional? Should it be documented somehow?
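
For reference, the "true" median expected in the example above can be sketched in plain Python. This is only an illustration of the definition (sort, then take the middle value or the mean of the two middle values); `exact_median` is a hypothetical helper, not part of pandas or pyarrow, and skipping `None` is an assumption made here to mimic null handling.

```python
def exact_median(values):
    """Return the exact median of `values` as a float.

    None entries are skipped (an assumption, standing in for nulls).
    Returns None when no non-null values remain.
    """
    data = sorted(v for v in values if v is not None)
    if not data:
        return None
    mid = len(data) // 2
    if len(data) % 2 == 1:
        # Odd count: the single middle element is the median.
        return float(data[mid])
    # Even count: average the two middle elements.
    return (data[mid - 1] + data[mid]) / 2.0


print(exact_median([1.0, 2.0]))  # 1.5, matching the float64 result above
```

On the pyarrow side, `pyarrow.compute.quantile` with `q=0.5` and linear interpolation may be a candidate for an exact computation, but that should be checked against the installed pyarrow version.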

cc @mroeschke
