Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
As also reported here: dask/dask#7610, PR #38671 introduced the following behaviour:
A reproducer with just pandas:
df1 = pd.DataFrame([[1, 2]], columns=pd.MultiIndex.from_tuples([('B', 1), ('C', 1)]))
df2 = pd.DataFrame(index=[0], columns=pd.RangeIndex(0))
pd.concat([df1, df2])
What triggers the error here is the `columns=pd.RangeIndex(0)` for the empty `df2`. By default pandas creates a zero-length object-dtype Index, which works fine, but if it's an empty RangeIndex, we now get this error.

This is a regression in itself in pandas. But I am also wondering a bit where the empty RangeIndex is coming from (it seems that the groupby operation in dask results in some empty partitions).
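To make the distinction concrete, here is a minimal sketch that builds both kinds of empty column index explicitly and tries the concat with each. The helper `try_concat` is a hypothetical name introduced only for illustration. The outcome of the concat depends on the installed pandas version, so the sketch reports it rather than assuming it.

```python
import pandas as pd

# Two empty column indexes that look alike but carry different dtypes.
object_cols = pd.Index([], dtype=object)  # zero-length object-dtype Index
range_cols = pd.RangeIndex(0)             # zero-length RangeIndex (integer dtype)

print(object_cols.dtype)
print(range_cols.dtype)

df1 = pd.DataFrame(
    [[1, 2]], columns=pd.MultiIndex.from_tuples([("B", 1), ("C", 1)])
)


def try_concat(empty_columns):
    """Hypothetical helper: return 'ok' or the name of the raised exception.

    The result is version dependent, so nothing is assumed about it here.
    """
    df2 = pd.DataFrame(index=[0], columns=empty_columns)
    try:
        pd.concat([df1, df2])
    except Exception as exc:
        return type(exc).__name__
    return "ok"


results = {"object": try_concat(object_cols), "range": try_concat(range_cols)}
print(results)
```

On an affected version one would expect the `"range"` entry to name an exception while the `"object"` entry reports `ok`; on a version with the fix both should report `ok`.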
Originally posted by @jorisvandenbossche in dask/dask#7610 (comment)
IIUC the fix for this could possibly be as simple as replacing lines 2926 to 2928 of pandas/pandas/core/indexes/base.py (at 0acbff8) with:
if isinstance(self, ABCMultiIndex) and not is_object_dtype(
    unpack_nested_dtype(other)
) and len(other) > 0:
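The effect of the extra `len(other) > 0` clause can be sketched outside of pandas internals. The function `is_incompatible` below is a hypothetical, simplified stand-in for the proposed condition: it uses the public `pd.MultiIndex` and `pandas.api.types.is_object_dtype`, and it omits the `unpack_nested_dtype` step, so it is not the actual library code.

```python
import pandas as pd
from pandas.api.types import is_object_dtype


def is_incompatible(left, right):
    """Hypothetical stand-in for the proposed check (assumption, not pandas code).

    A MultiIndex is treated as incompatible with a non-object index only
    when that other index actually contains elements.
    """
    return (
        isinstance(left, pd.MultiIndex)
        and not is_object_dtype(right.dtype)
        and len(right) > 0
    )


mi = pd.MultiIndex.from_tuples([("B", 1), ("C", 1)])

print(is_incompatible(mi, pd.RangeIndex(0)))             # False: empty, so allowed
print(is_incompatible(mi, pd.RangeIndex(3)))             # True: non-empty integer index
print(is_incompatible(mi, pd.Index([], dtype=object)))   # False: object dtype
```

With the length check in place, an empty RangeIndex is treated the same way as an empty object-dtype Index, which is the case the reproducer hits.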