GroupBy Filter does not work for 2+ grouping columns in 0.12.0 #4527

Closed
@dragoljub

Trying to filter out groups with fewer than 2 rows using the new groupby().filter() in 0.12.0:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})

grp = df.groupby(['A', 'B'])

grp.apply(lambda x: len(x)>1) # Returns correct bool series

A    B    
bar  one      False
     three    False
     two      False
foo  one       True
     three    False
     two       True
dtype: bool

grp.filter(lambda x: len(x)>1)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-140-4dccfb561597> in <module>()
----> 1 grp.filter(lambda x: len(x)>1)

C:\Python27\lib\site-packages\pandas\core\groupby.pyc in filter(self, func, dropna, *args, **kwargs)
   2092                 res = path(group)
   2093 
-> 2094             if res:
   2095                 indexers.append(self.obj.index.get_indexer(group.index))
   2096 

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
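
A possible workaround, sketched under assumptions: instead of calling filter(), compute each row's group size with transform() and select rows with a boolean mask. The names `counts` and `filtered` are illustrative only. This assumes a pandas version where transform('size') is supported on a grouped column, and it has not been checked against 0.12.0 itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})

# For every row, the number of rows in its (A, B) group.
# Assumption: transform('size') is available in the installed pandas.
counts = df.groupby(['A', 'B'])['C'].transform('size')

# Keep only rows whose group has more than one row.
filtered = df[counts > 1]
print(filtered)
```

With the sample frame above, this should keep only the rows belonging to the (foo, one) and (foo, two) groups, which are the two groups that apply() reports as True.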
