
BUG: Cannot index frozenset elements from a pd.Series or pd.DataFrame #35747

Closed
@jolespin

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas as pd

print(pd.__version__)
# 1.1.0

# Create DataFrame
data = {0: {frozenset({'Otu000010', 'Otu000505'}): 'white',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'white',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'},
 14: {frozenset({'Otu000010', 'Otu000505'}): 'white',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'white',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'},
 28: {frozenset({'Otu000010', 'Otu000505'}): 'red',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'blue',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'}}
df = pd.DataFrame(data)

# Get first item in index
id_query = df.index[0]

# Grab a column
vector = df[14]

# Index the vector using query
vector[id_query]


# ---------------------------------------------------------------------------
# KeyError                                  Traceback (most recent call last)
# <ipython-input-145-493fc06aeb40> in <module>
#      24 
#      25 # Index the vector using query
# ---> 26 vector[id_query]

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/series.py in __getitem__(self, key)
#     906             return self._get_values(key)
#     907 
# --> 908         return self._get_with(key)
#     909 
#     910     def _get_with(self, key):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/series.py in _get_with(self, key)
#     946 
#     947         # handle the dup indexing case GH#4246
# --> 948         return self.loc[key]
#     949 
#     950     def _get_values_tuple(self, key):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in __getitem__(self, key)
#     877 
#     878             maybe_callable = com.apply_if_callable(key, self.obj)
# --> 879             return self._getitem_axis(maybe_callable, axis=axis)
#     880 
#     881     def _is_scalar_access(self, key: Tuple):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
#    1097                     raise ValueError("Cannot index with multidimensional key")
#    1098 
# -> 1099                 return self._getitem_iterable(key, axis=axis)
#    1100 
#    1101             # nested tuple slicing

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
#    1035 
#    1036         # A collection of keys
# -> 1037         keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
#    1038         return self.obj._reindex_with_indexers(
#    1039             {axis: [keyarr, indexer]}, copy=True, allow_dups=True

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
#    1252             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
#    1253 
# -> 1254         self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
#    1255         return keyarr, indexer
#    1256 

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
#    1296             if missing == len(indexer):
#    1297                 axis_name = self.obj._get_axis_name(axis)
# -> 1298                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
#    1299 
#    1300             # We (temporarily) allow for some missing keys with .loc, except in

# KeyError: "None of [Index(['Otu000010', 'Otu000505'], dtype='object')] are in the [index]"

Problem description

In previous versions, I was able to use frozenset objects as the elements of the index. They are very useful for network analysis, where I use them as edges in my pd.Series and pd.DataFrame objects.
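The KeyError message lists the two strings inside the frozenset rather than the frozenset itself, which suggests the key was unpacked as a collection of labels instead of being looked up as a single label. The sketch below only shows how pandas' public type-checking helpers classify a frozenset. It is an illustration of the ambiguity, not a confirmed diagnosis of the code path in the traceback.

```python
import pandas as pd
from pandas.api.types import is_hashable, is_list_like, is_scalar

# One of the edge keys from the example above.
edge = frozenset({"Otu000010", "Otu000505"})

# A frozenset can be hashed, so it is a valid single label in an index.
edge_is_hashable = is_hashable(edge)

# It is also iterable, so pandas' helper classifies it as list-like.
edge_is_list_like = is_list_like(edge)

# It is not a scalar in the sense of pandas' is_scalar helper.
edge_is_scalar = is_scalar(edge)

print(edge_is_hashable, edge_is_list_like, edge_is_scalar)
```

A key that is both hashable and list-like can be read either as one label or as several, which is consistent with the behavior reported here.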

Expected Output

I should be able to index using these objects.
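Until direct indexing works again, one possible workaround is to resolve the label to an integer position with `Index.get_loc` and then read the value with `.iloc`. This is a minimal sketch, assuming the index is unique and that `get_loc` treats a hashable key as a single label. The helper name `lookup_by_label` is hypothetical and not part of pandas, and the data is a shortened version of the example above.

```python
import pandas as pd

# Shortened version of the issue's data: frozenset edges as index labels.
edges = [
    frozenset({"Otu000010", "Otu000505"}),
    frozenset({"Otu000067", "Otu000073"}),
    frozenset({"Otu000151", "molar"}),
]
vector = pd.Series(["red", "white", "blue"], index=pd.Index(edges, dtype=object))


def lookup_by_label(series, key):
    """Hypothetical helper: fetch one value by a hashable label.

    Assumes the index is unique, so get_loc returns a single integer
    position rather than a slice or boolean mask.
    """
    position = series.index.get_loc(key)
    return series.iloc[position]


result = lookup_by_label(vector, edges[2])
print(result)
```

Going through the position avoids `Series.__getitem__` and `.loc` altogether, so the key is never interpreted as a collection of labels.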

Output of pd.show_versions()

pandas v1.1.0


Labels

Indexing: Related to indexing on series/frames, not to indexes themselves
Regression: Functionality that used to work in a prior pandas version
