BUG: df.loc[[x], :] fails if df has zero rows #41170

Closed
@RagBlufThim

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

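import pandas as pd

# Two-level MultiIndex: first level "a"/"b", second level 0..3
df = pd.DataFrame({"A": [12, 23, 34, 45]}, index=[list("aabb"), [0, 1, 2, 3]])
print(df)
print("- - -")
# Filter keeps only "a" rows, so selecting "b" yields an empty frame, as expected
print(df.loc[df.A < 30, :].loc[["b"], :])
print("- - -")
# Filter keeps zero rows; selecting "b" now raises ValueError
print(df.loc[df.A < 10, :].loc[["b"], :])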

Complete Output of Code Sample

      A
a 0  12
  1  23
b 2  34
  3  45
- - -
Empty DataFrame
Columns: [A]
Index: []
- - -
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 889, in __getitem__
    return self._getitem_tuple(key)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 1060, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 791, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 865, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 1113, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 1053, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexing.py", line 1254, in _get_listlike_indexer
    indexer, keyarr = ax._convert_listlike_indexer(key)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexes\multi.py", line 2559, in _convert_listlike_indexer
    _, indexer = self.reindex(keyarr, level=level)
  File "C:\test\venv124\lib\site-packages\pandas\core\indexes\multi.py", line 2470, in reindex
    target, indexer, _ = self._join_level(
  File "C:\test\venv124\lib\site-packages\pandas\core\indexes\base.py", line 3924, in _join_level
    ngroups = 1 + new_lev_codes.max()
  File "C:\test\venv124\lib\site-packages\numpy\core\_methods.py", line 39, in _amax
    return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity

Problem description

In both cases, df.loc[df.A...] returns a DataFrame that contains no rows with the index value "b".
Accordingly, in the first case .loc[["b"], :] returns an empty DataFrame, but in the second case it raises a ValueError. The only difference is that in the first case df.loc[df.A...] returns a DataFrame with some rows (though none with index value "b"), while in the second case it returns a DataFrame with zero rows.
I think that shouldn't make a difference: selecting a label that is absent should behave the same whether or not the DataFrame has other rows.

In the original code, .loc[df.A...] and .loc[["b"], :] are not chained in a single expression. The first one creates a selection of rows, that selection is processed further, and a later expression applies the second .loc to it.

The traceback looks very similar to the one in #40235. Maybe both bugs have a common root cause.
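
The last frames of the traceback point at the immediate trigger: `new_lev_codes.max()` is evaluated on an array that has no elements. A minimal sketch reproducing only that NumPy behaviour outside pandas follows. The name `empty_codes` is a stand-in chosen for illustration, not a pandas internal, and this does not establish whether #40235 fails at the same point.

```python
import numpy as np

# Stand-in for the level codes of a MultiIndex with zero rows (hypothetical name).
empty_codes = np.array([], dtype=np.intp)

message = None
try:
    # Same shape of expression as the failing line in _join_level.
    ngroups = 1 + empty_codes.max()
except ValueError as exc:
    # max() has no identity element, so NumPy refuses to reduce an empty array.
    message = str(exc)

print(message)

# Passing an explicit initial value makes the reduction well defined on empty input.
safe_max = empty_codes.max(initial=-1)
print(1 + safe_max)
```

The `initial=-1` variant only shows that the reduction itself can be made safe for empty input; it is not a claim about how pandas should or does fix the problem.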

Expected Output

df.loc[df.A < 10, :].loc[["b"], :] should return an empty DataFrame, just as df.loc[df.A < 30, :].loc[["b"], :] does.
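
Until the behaviour is consistent, one possible workaround is to select by a boolean mask on the first index level instead of passing a list of labels to `.loc`. The sketch below assumes that only level 0 of the MultiIndex needs to be matched. `select_level0` is a hypothetical helper written for this example, not a pandas function, and unlike list-based `.loc` it never raises for missing labels.

```python
import pandas as pd


def select_level0(frame, labels):
    """Hypothetical helper: keep rows whose first index level is in `labels`."""
    # isin() on an empty level returns an empty boolean array, so no reduction
    # over a zero-size array is involved.
    mask = frame.index.get_level_values(0).isin(labels)
    return frame.loc[mask, :]


df = pd.DataFrame({"A": [12, 23, 34, 45]}, index=[list("aabb"), [0, 1, 2, 3]])

some_rows = df.loc[df.A < 30, :]  # rows exist, but none under "b"
zero_rows = df.loc[df.A < 10, :]  # no rows at all

result_some = select_level0(some_rows, ["b"])
result_zero = select_level0(zero_rows, ["b"])
result_full = select_level0(df, ["b"])

print(result_some.empty, result_zero.empty)
print(result_full)
```

Both filtered cases yield an empty DataFrame with column A here, which matches the expected output described above.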

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 2cb9652
python : 3.9.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
...

pandas : 1.2.4
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
...

Labels: Bug, Error Reporting, Indexing, MultiIndex