
length_of_indexer method does not support boolean array indexers. #25774

Closed
@fx-kirin

Description


Code Sample

import numpy as np
import pandas as pd
import random

# No error is raised
df = pd.DataFrame(dict(A=[1, 1, 1, 1, 1], B=[1., 1., 1., 1., 1.], C=[1, 1, 1, 1, 1]))
df.index = [0, 0, 1, 3, 5]
df.index.name = 'test'
df.loc[0, 'A'] = df.loc[0, 'A']

# Raises ValueError
length = 20000
df = pd.DataFrame(dict(A=[1] * length, B=[1.] * length, C=[1] * length))
# Random integer labels: the index contains duplicates and is not sorted
df.index = [random.randint(0, 2000) for _ in range(length)]
df.loc[0, 'A'] = df.loc[0, 'A']

Problem description

lplane_indexer = length_of_indexer(plane_indexer[0],

def can_do_equal_len():
    """ return True if we have an equal len settable """
    if (not len(labels) == 1 or not np.iterable(value) or
            is_scalar(plane_indexer[0])):
        return False
    item = labels[0]
    index = self.obj[item].index
    values_len = len(value)
    # equal len list/ndarray
    if len(index) == values_len:
        return True
    elif lplane_indexer == values_len:
        return True
    return False

When DataFrame.loc uses a boolean array selector in the background, the variable lplane_indexer equals the size of the DataFrame instead of the number of True values in the array. This did not happen in an earlier version.
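To illustrate the mismatch, here is a minimal sketch. It assumes that the label lookup behind .loc behaves like the public Index.get_loc, which returns a slice for a sorted index with duplicates and a boolean mask for an unsorted one. selected_count is a hypothetical helper written for this example, not the pandas implementation of length_of_indexer.

```python
import numpy as np
import pandas as pd

# Sorted index with duplicates, as in the first snippet of the report.
sorted_index = pd.Index([0, 0, 1, 3, 5])
# Unsorted index with duplicates, a small stand-in for the random index.
unsorted_index = pd.Index([0, 3, 0, 5, 1])

sorted_loc = sorted_index.get_loc(0)      # slice for a monotonic index
unsorted_loc = unsorted_index.get_loc(0)  # boolean mask otherwise


def selected_count(indexer, target_len):
    """Hypothetical helper: number of rows an indexer actually selects."""
    if isinstance(indexer, slice):
        return len(range(*indexer.indices(target_len)))
    indexer = np.asarray(indexer)
    if indexer.dtype == bool:
        # Count the True entries, not the length of the mask.
        return int(indexer.sum())
    return len(indexer)


print(type(sorted_loc).__name__, selected_count(sorted_loc, len(sorted_index)))
print(len(unsorted_loc), selected_count(unsorted_loc, len(unsorted_index)))
```

For the boolean mask, len() gives the size of the whole index while the count of True entries gives the number of matching rows. If this assumption about the lookup holds, it is the second number that the equal-length check would need.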

Expected Output

No error is raised.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.3
pytest: 3.7.3
pip: 18.1
setuptools: 36.5.0.post20170921
Cython: 0.28.4
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.5
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


Labels: Indexing (related to indexing on series/frames, not to indexes themselves)
