quantile throws error if not convertible to float #2625

Closed
@hayd

If we try to take the quantile of a DataFrame containing string entries that cannot be converted to float, a ValueError is raised. Should quantile instead behave like mean and ignore these entries? (Taken from this StackOverflow question.)

In [1]: df = DataFrame({'col1':['A','A','B','B'], 'col2':[1,2,3,4]})

In [2]: df
Out[2]:
  col1  col2
0    A     1
1    A     2
2    B     3
3    B     4


In [3]: g = df.groupby('col1')

In [4]: g.mean()
Out[4]: 
      col2
col1      
A      1.5
B      3.5

In [5]: g.quantile()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/andy/<ipython-input-70-8b0757805794> in <module>()
----> 1 g.quantile()

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in wrapper(*args, **kwargs)
    258                 return self.apply(curried_with_axis)
    259             except Exception:
--> 260                 return self.apply(curried)
    261 
    262         return wrapper

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in apply(self, func, *args, **kwargs)
    319         func = _intercept_function(func)
    320         f = lambda g: func(g, *args, **kwargs)
--> 321         return self._python_apply_general(f)
    322 
    323     def _python_apply_general(self, f):

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in _python_apply_general(self, f)
    322 
    323     def _python_apply_general(self, f):
--> 324         keys, values, mutated = self.grouper.apply(f, self.obj, self.axis)
    325 
    326         return self._wrap_applied_output(keys, values,

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in apply(self, f, data, axis, keep_internal)
    594             # group might be modified
    595             group_axes = _get_axes(group)
--> 596             res = f(group)
    597             if not _is_indexed_like(res, group_axes):
    598                 mutated = True

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in <lambda>(g)
    318         """
    319         func = _intercept_function(func)
--> 320         f = lambda g: func(g, *args, **kwargs)
    321         return self._python_apply_general(f)
    322 

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in curried(x)
    253 
    254             def curried(x):
--> 255                 return f(x, *args, **kwargs)
    256 
    257             try:

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in quantile(self, q, axis)
   4946                 return _quantile(arr, per)
   4947 
-> 4948         return self.apply(f, axis=axis)
   4949 
   4950     def clip(self, upper=None, lower=None):

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in apply(self, func, axis, broadcast, raw, args, **kwds)
   4079                     return self._apply_raw(f, axis)
   4080                 else:
-> 4081                     return self._apply_standard(f, axis)
   4082             else:
   4083                 return self._apply_broadcast(f, axis)

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in _apply_standard(self, func, axis, ignore_failures)
   4154                     # no k defined yet
   4155                     pass
-> 4156                 raise e
   4157 
   4158         if len(results) > 0 and _is_sequence(results[0]):

ValueError: ('could not convert string to float: A', u'occurred at index col1')
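
To make the proposed behaviour concrete, here is a minimal pure-Python sketch of a per-group quantile that skips any column whose values cannot be converted to float, rather than raising. It is not how pandas implements groupby or quantile; the helper names `quantile` and `grouped_quantile`, the list-of-dicts input, and the linear interpolation rule are all assumptions made for illustration.

```python
from collections import defaultdict


def quantile(values, q=0.5):
    """Linear-interpolation quantile of a non-empty list of floats (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def grouped_quantile(rows, key, q=0.5):
    """Hypothetical helper: per-group quantile that drops non-numeric columns.

    Models the behaviour proposed in this issue, not the pandas internals.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for name, members in groups.items():
        stats = {}
        for col in members[0]:
            try:
                values = [float(m[col]) for m in members]
            except (TypeError, ValueError):
                # Not convertible to float (here: the string column col1),
                # so skip the column instead of propagating the error.
                continue
            stats[col] = quantile(values, q)
        result[name] = stats
    return result


# Same data as the DataFrame in the report above.
rows = [
    {"col1": "A", "col2": 1},
    {"col1": "A", "col2": 2},
    {"col1": "B", "col2": 3},
    {"col1": "B", "col2": 4},
]
result = grouped_quantile(rows, "col1")
print(result)
```

With this data the string column col1 is dropped and only col2 is summarised, which matches the shape of the `g.mean()` output shown earlier.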
