BUG: invalid coercion raises in groupby uniques #14758

Closed
@jreback

Description

import pandas as pd

# Adding the int64 maximum pushes every value past the int64 range.
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], dtype=object) + 9223372036854775807
df.groupby(0).sum()

...
C:\Anaconda\envs\bmc3\lib\site-packages\pandas\core\groupby.py in labels(self)
   2252     def labels(self):
   2253         if self._labels is None:
-> 2254             self._make_labels()
   2255         return self._labels
   2256

C:\Anaconda\envs\bmc3\lib\site-packages\pandas\core\groupby.py in _make_labels(self)
   2264         if self._labels is None or self._group_index is None:
   2265             labels, uniques = algos.factorize(self.grouper, sort=self.sort)
-> 2266             uniques = Index(uniques, name=self.name)
   2267             self._labels = labels
   2268             self._group_index = uniques

C:\Anaconda\envs\bmc3\lib\site-packages\pandas\indexes\base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    234                 if inferred == 'integer':
    235                     from .numeric import Int64Index
--> 236                     return Int64Index(subarr.astype('i8'), copy=copy,
    237                                       name=name)
    238                 elif inferred in ['floating', 'mixed-integer-float']:

OverflowError: int too big to convert

We are inferring 'integer' when this should be 'object' (or at least not 'integer'), because the values do not fit in int64:

In [36]: pd.lib.infer_dtype(np.array([9223372036854775808, 9223372036854775810, 9223372036854775812], dtype=object))
Out[36]: 'integer'
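
To illustrate the kind of guard that would avoid the overflow, here is a minimal sketch that checks the int64 range before casting. The helper name `fits_in_int64` is hypothetical and is not part of pandas; this is not necessarily how the issue was fixed, only one way to see why the `astype('i8')` call in the traceback cannot succeed for these values.

```python
import numpy as np

# int64 bounds, taken from NumPy rather than hard-coded.
INT64_MIN = np.iinfo(np.int64).min
INT64_MAX = np.iinfo(np.int64).max


def fits_in_int64(values):
    """Hypothetical helper: True if every integer in `values` fits in int64."""
    return all(INT64_MIN <= int(v) <= INT64_MAX for v in values)


# The same values as in the infer_dtype call above.
uniques = np.array(
    [9223372036854775808, 9223372036854775810, 9223372036854775812],
    dtype=object,
)

if fits_in_int64(uniques):
    # Safe to cast: every value is representable as int64.
    result = uniques.astype("i8")
else:
    # Out of range: keep the object array instead of casting.
    result = uniques
```

With these inputs the range check fails, so `result` stays an object array and no cast to `i8` is attempted.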
