BUG: Various issues with maybe_convert_objects #2845

Closed
@stephenwlin

I found some issues with lib.maybe_convert_objects. It is an internal library function, but the same problems occur when calling, for example, DataFrame({'a': [...]}).

(Also, mostly unrelated to the issues below: I noticed that the inference loop keeps going for no reason after it has seen an object, even though it can safely break at that point because the output is a foregone conclusion. I added the breaks to make it slightly more efficient.)
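The early exit can be sketched in plain Python. This is only an illustration of the idea, not the pandas implementation: `infer_kinds` and its kind labels are hypothetical names, and the real loop tracks more cases.

```python
def infer_kinds(values):
    """Hypothetical stand-in for the inference loop (not the pandas code).

    Returns the set of kinds seen and how many elements were inspected.
    """
    seen = set()
    inspected = 0
    for v in values:
        inspected += 1
        if v is None:
            seen.add("null")
        elif isinstance(v, bool):  # test bool first: bool is a subclass of int
            seen.add("bool")
        elif isinstance(v, int):
            seen.add("int")
        elif isinstance(v, float):
            seen.add("float")
        elif isinstance(v, complex):
            seen.add("complex")
        else:
            # Anything else forces an object result, so stop scanning here.
            seen.add("object")
            break
    return seen, inspected


seen, inspected = infer_kinds([1, "a", 2.0, 3])
print(sorted(seen), inspected)  # ['int', 'object'] 2
```

The scan stops at the second element because nothing after it can change the outcome.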

Fails on a Python long integer outside the np.int64 range:

In [4]: arr = np.asarray([2**63], dtype=np.object_)

In [5]: p.lib.maybe_convert_objects(arr)
(exception)

It should probably leave the integer unchanged, as a Python long integer object.
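One way to picture the suggested behavior is a range check before committing to int64. This is a sketch under assumptions: `int_result_kind` is a hypothetical helper, and it only models the decision, not the conversion itself.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def int_result_kind(values):
    """Hypothetical check: pick int64 only when every integer fits."""
    if all(INT64_MIN <= v <= INT64_MAX for v in values):
        return "int64"
    # Out-of-range value: leave the Python integers untouched as objects.
    return "object"


print(int_result_kind([1, 2, 3]))    # int64
print(int_result_kind([2**63 - 1]))  # int64
print(int_result_kind([2**63]))      # object
```

2**63 is exactly one past the largest np.int64 value, which is why the example above trips the conversion.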

Datetimes and complexes lost completely when mixed with booleans:

It is not even doing a cast: it is reading uninitialized memory as booleans.

In [6]: arr = np.asarray([dt.datetime(2005, 1, 1), True])

In [7]: p.lib.maybe_convert_objects(arr, convert_datetime=1)
Out[7]: array([ True,  True], dtype=bool)

In [8]: arr = np.asarray([1.0+2.0j, True], dtype=np.object_)

In [9]: p.lib.maybe_convert_objects(arr)
Out[9]: array([ True,  True], dtype=bool)
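A possible rule for the mixed case, sketched in plain Python: booleans stay boolean only when every element is a boolean, and any mix falls back to object so no value is lost. `result_kind` is a hypothetical name, and falling back to object is an assumption about the desired outcome, not something confirmed here.

```python
import datetime as dt


def result_kind(values):
    """Hypothetical rule: bool output only when all elements are bools."""
    kinds = set()
    for v in values:
        if isinstance(v, bool):
            kinds.add("bool")
        elif isinstance(v, dt.datetime):
            kinds.add("datetime")
        elif isinstance(v, complex):
            kinds.add("complex")
        else:
            kinds.add("object")
    if kinds == {"bool"}:
        return "bool"
    if "bool" in kinds:
        # Mixed with anything else: keep the original objects instead of
        # reinterpreting them as booleans.
        return "object"
    if len(kinds) == 1:
        return kinds.pop()
    return "object"


print(result_kind([dt.datetime(2005, 1, 1), True]))  # object
print(result_kind([1.0 + 2.0j, True]))               # object
print(result_kind([True, False]))                    # bool
```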

None converts to np.nan with floats, but not with complexes:
In [10]: arr = np.asarray([1.0, None], dtype=np.object_)

In [11]: p.lib.maybe_convert_objects(arr)
Out[11]: array([  1.,  nan])

In [12]: arr = np.asarray([1.0+2.0j, None], dtype=np.object_)

In [13]: p.lib.maybe_convert_objects(arr)
Out[13]: array([(1+2j), None], dtype=object)
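For comparison, this is roughly what treating None like the float case would look like for complexes. It is a stdlib-only sketch: `to_complex_with_nan` is a hypothetical helper, and using nan+0j as the missing value is an assumption.

```python
import cmath


def to_complex_with_nan(values):
    """Hypothetical helper: map None to a complex NaN, as floats map to nan."""
    nan = complex(float("nan"), 0.0)  # assumed representation of "missing"
    return [nan if v is None else complex(v) for v in values]


converted = to_complex_with_nan([1.0 + 2.0j, None])
print(converted[0])               # (1+2j)
print(cmath.isnan(converted[1]))  # True
```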

The safe option preserves integer types with np.nan, but not with None:

The comment for safe says "don't cast int to float, etc.", but that does not hold when the array contains None.

In [14]: arr = np.asarray([1, 2.0, np.nan], dtype=np.object_)

In [15]: p.lib.maybe_convert_objects(arr, safe=1)
Out[15]: array([1, 2.0, nan], dtype=object)

In [16]: arr = np.asarray([1, 2.0, None], dtype=np.object_)

In [17]: p.lib.maybe_convert_objects(arr, safe=1)
Out[17]: array([  1.,   2.,  nan])

Not entirely sure if this last one is intended behavior or not; let me know if there is any feedback.
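If the two null markers were meant to behave alike under safe, one way to sketch that is a shared null predicate feeding a single decision. The names `is_null` and `safe_result_kind` are hypothetical, and the rule shown is an assumption about the intent, not the existing behavior.

```python
import math


def is_null(value):
    """Treat None and float NaN as the same kind of missing value."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def safe_result_kind(values):
    """Hypothetical 'safe' rule: never upcast ints to float, whatever the null."""
    has_int = any(isinstance(v, int) and not isinstance(v, bool) for v in values)
    has_float = any(isinstance(v, float) and not is_null(v) for v in values)
    has_null = any(is_null(v) for v in values)
    if has_int and (has_float or has_null):
        return "object"  # keep the ints as they are
    if has_int:
        return "int64"
    return "float64" if (has_float or has_null) else "object"


print(safe_result_kind([1, 2.0, float("nan")]))  # object
print(safe_result_kind([1, 2.0, None]))          # object
print(safe_result_kind([1.0, None]))             # float64
```

Under this rule both arrays from the example above stay as object, so the result no longer depends on which null marker is used.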
