ERR: pivot_table when number of levels larger than int32 range #20601

Closed
@mklwong

Code Sample, a copy-pastable example if possible

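import pandas as pd
# 2 * 1337600 rows; 'ind1' has 1337600 unique values, 'ind2' has 3040
dat = pd.DataFrame({'ind1': list(range(1337600)) * 2, 'ind2': list(range(3040)) * 2 * 440, 'count': [1] * 2 * 1337600})
# Raises ValueError in the environment listed under pd.show_versions()
dat.pivot_table(index='ind1', columns='ind2', values='count', aggfunc='count')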

Problem description

The code above raises the following error:

  File "..\pandas\core\reshape\reshape.py", line 144, in _make_selectors
    mask = np.zeros(np.prod(self.full_shape), dtype=bool)

ValueError: negative dimensions are not allowed

np.prod(self.full_shape) appears to return a negative value because the number of index/column combinations (1337600 x 3040 = 4,066,304,000) is larger than the largest int32 value (2,147,483,647), so the product overflows.
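The wraparound can be reproduced without pandas. This is a minimal sketch, assuming the product is accumulated in a 32-bit integer (which is what I believe happens by default with this NumPy version on Windows); the variable names are only for illustration and are not taken from reshape.py:

```python
import numpy as np

# Shape of the unstacked result from the example: unique 'ind1' x unique 'ind2'
full_shape = (1337600, 3040)

# Force a 32-bit accumulator to mimic a platform whose default NumPy
# integer is int32 (assumption about the reporter's environment).
n_int32 = np.prod(np.array(full_shape, dtype=np.int32), dtype=np.int32)

# Same product with a 64-bit accumulator.
n_int64 = np.prod(full_shape, dtype=np.int64)

int32_max = np.iinfo(np.int32).max

print(n_int32)    # negative: the true product wrapped around modulo 2**32
print(n_int64)    # 4066304000
print(int32_max)  # 2147483647
```

Passing the negative int32 result to np.zeros is what would produce "negative dimensions are not allowed".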

If line 144 were changed to the following, the issue could be fixed:

mask = np.zeros(np.prod(self.full_shape, dtype=np.int64), dtype=bool)
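The proposed change can be tried in isolation. This is only a sketch: num_cells is a hypothetical helper, not a pandas function, and the large mask is deliberately not allocated here because a bool mask takes one byte per cell.

```python
import numpy as np

def num_cells(full_shape):
    """Hypothetical helper: total number of cells, accumulated in int64."""
    return int(np.prod(full_shape, dtype=np.int64))

# Small shape: the mask can actually be allocated.
small_mask = np.zeros(num_cells((4, 3)), dtype=bool)

# Shape from the report: compute the size only, do not allocate.
big = num_cells((1337600, 3040))
big_mask_bytes = big * np.dtype(bool).itemsize

print(small_mask.shape)
print(big, big_mask_bytes)
```

With the 64-bit accumulator the size is positive, but the mask alone would need roughly 4 GB, so the pivot may still fail on memory even with this change.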

Expected Output

A 1337600 x 3040 DataFrame.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 37 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.22.0
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.12.1
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels

Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff)
Error Reporting (incorrect or improved errors from pandas)
Numeric Operations (arithmetic, comparison, and logical operations)
good first issue