Random Samplers fail for Nested Numpy Arrays #605

Closed
@dgrahn

Description

I'm getting the following error when attempting to use the Random Over Sampler.

  File ".../lib/python3.7/site-packages/imblearn/base.py", line 79, in fit_resample
    X, y, binarize_y = self._check_X_y(X, y)
  File ".../lib/python3.7/site-packages/imblearn/over_sampling/_random_over_sampler.py", line 94, in _check_X_y
    X, y = check_X_y(X, y, accept_sparse=['csr', 'csc'], dtype=None)
  File ".../lib/python3.7/site-packages/sklearn/utils/validation.py", line 719, in check_X_y
    estimator=estimator)
  File ".../lib/python3.7/site-packages/sklearn/utils/validation.py", line 542, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File ".../lib/python3.7/site-packages/sklearn/utils/validation.py", line 59, in _assert_all_finite
    if _object_dtype_isnan(X).any():
AttributeError: 'bool' object has no attribute 'any'

This happened right after I updated sklearn.

Steps/Code to Reproduce

Both the over-sampler and the under-sampler fail.

from imblearn import over_sampling, under_sampling
import numpy  as np
import pandas as pd

df = pd.DataFrame({
    'val': [
        np.zeros((1, 2)),
        np.zeros((3, 4)),
        np.zeros((5, 6)),
    ]
})

# df.values is an object-dtype array of shape (3, 1); each cell holds a 2-D array
x = df.values
y = np.array([0, 0, 1])

sampler = over_sampling.RandomOverSampler(random_state=42)
# sampler = under_sampling.RandomUnderSampler(random_state=42)
sampler.fit_resample(x, y)
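
A quick check of what the sampler actually receives may help narrow this down. The sketch below builds the same kind of array by hand with NumPy only (the construction is an assumption meant to mirror what `df.values` returns above, not code from the traceback). It shows an object-dtype array whose cells are differently shaped 2-D arrays. The traceback ends in `_object_dtype_isnan(X).any()`, so the NaN check for object arrays is presumably getting back a plain `bool` rather than a boolean array for input like this.

```python
import numpy as np

# Hand-built stand-in for df.values: one object column, one 2-D array per cell.
x = np.empty((3, 1), dtype=object)
for i, shape in enumerate([(1, 2), (3, 4), (5, 6)]):
    x[i, 0] = np.zeros(shape)

print(x.dtype, x.shape)                   # object (3, 1)
print([cell.shape for cell in x[:, 0]])   # [(1, 2), (3, 4), (5, 6)]
```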

Versions

Linux-3.10.0-957.21.3.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
NumPy 1.16.5
SciPy 1.3.1
Scikit-Learn 0.21.2
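
Possible workaround

Until nested arrays pass validation, one option is to resample row indices instead of the nested data and then index into `x` and `y`. The sketch below is a hand-rolled NumPy substitute, not imblearn's implementation: `oversample_indices` is a hypothetical helper that duplicates minority-class rows at random until every class matches the largest one. It uses `np.random.RandomState` so that it should also run on the NumPy 1.16 listed above.

```python
import numpy as np

def oversample_indices(y, seed=42):
    """Hypothetical helper: row indices that balance classes by random duplication."""
    rng = np.random.RandomState(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    picked = [np.arange(len(y))]  # keep every original row
    for cls, count in zip(classes, counts):
        if count < target:
            pool = np.flatnonzero(y == cls)
            # draw the missing rows with replacement from this class
            picked.append(rng.choice(pool, size=target - count, replace=True))
    return np.concatenate(picked)

# Same nested layout as in the reproduction, built without pandas.
x = np.empty((3, 1), dtype=object)
for i, shape in enumerate([(1, 2), (3, 4), (5, 6)]):
    x[i, 0] = np.zeros(shape)
y = np.array([0, 0, 1])

idx = oversample_indices(y)
x_res, y_res = x[idx], y[idx]
print(np.bincount(y_res))
```

Because only integer indices are sampled, the nested arrays never go through `check_X_y`, so the object-dtype NaN check is not triggered.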

Labels

Type: Bug