ENH: Add Float128 support for groupby. #59483

Open
@cemde

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I want to call groupby(...)[...].mean() on a column of dtype np.float128, but the call fails with a TypeError because groupby does not support this dtype.

Feature Description

import numpy as np
import pandas as pd

# Set the random seed for reproducibility
np.random.seed(42)

# Generate binary data for columns A and B
data_A = np.random.randint(0, 2, size=1000)
data_B = np.random.randint(0, 2, size=1000)

# Generate lognormal random values for column C with np.float128 precision
data_C = np.random.lognormal(mean=0, sigma=1, size=1000).astype(np.float128)

# Create the DataFrame
df = pd.DataFrame({
    "A": data_A,
    "B": data_B,
    "C": data_C
})

# Group by columns A and B and get the mean of column C
grouped_means = df.groupby(["A", "B"])["C"].mean()

grouped_means

Running this code results in the following error:

  File "/homes/cornelius/numpy128.py", line 22, in <module>
    grouped_means = df.groupby(["A", "B"])["C"].mean()
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 2375, in mean
    result = self._cython_agg_general(
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1926, in _cython_agg_general
    new_mgr = data.grouped_reduce(array_func)
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/internals/base.py", line 336, in grouped_reduce
    res = func(arr)
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 1902, in array_func
    result = self.grouper._cython_operation(
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 815, in _cython_operation
    return cy_op.cython_operation(
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 534, in cython_operation
    return self._cython_op_ndim_compat(
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 323, in _cython_op_ndim_compat
    res = self._call_cython_op(
  File "/homes/cornelius/anaconda3/envs/general/lib/python3.10/site-packages/pandas/core/groupby/ops.py", line 403, in _call_cython_op
    func(
  File "groupby.pyx", line 989, in pandas._libs.groupby.__pyx_fused_cpdef
TypeError: No matching signature found

Alternative Solutions

Computing the group means in NumPy and converting the result back to pandas can work, but it is less convenient than staying within pandas.
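
For reference, the NumPy round trip could look roughly like the sketch below. It is only one possible workaround, not a pandas API: the helper name grouped_mean_longdouble is hypothetical, and np.longdouble is used as the portable spelling of np.float128 (on x86-64 Linux they are the same type). It assumes no missing values in the key columns. The sums are accumulated with np.add.at because np.bincount would cast the weights to float64.

```python
import numpy as np
import pandas as pd


def grouped_mean_longdouble(df, keys, column):
    """Hypothetical helper (not part of pandas): per-group mean in extended precision."""
    gb = df.groupby(keys)
    # Integer label of each row's group, numbered in the same (sorted) order
    # as the index of gb.size(). Assumes the key columns contain no NaN.
    codes = gb.ngroup().to_numpy()
    counts = gb.size()
    # Accumulate the per-group sums in NumPy so values are never cast to float64.
    sums = np.zeros(len(counts), dtype=np.longdouble)
    np.add.at(sums, codes, df[column].to_numpy(dtype=np.longdouble))
    # Convert back to pandas, reusing the group keys as the index.
    return pd.Series(sums / counts.to_numpy(), index=counts.index, name=column)


# Small data set shaped like the one in the report above.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "A": rng.integers(0, 2, size=1000),
    "B": rng.integers(0, 2, size=1000),
    "C": rng.lognormal(mean=0, sigma=1, size=1000).astype(np.longdouble),
})

grouped_means = grouped_mean_longdouble(df, ["A", "B"], "C")
```

This only sidesteps the missing Cython signature for mean; other aggregations would need their own NumPy code, which is why native support would be preferable.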

Additional Context

Environment:
OS: CentOS Linux 8
Python: 3.10.13
pandas: 2.1.1
NumPy: 1.24.1
