Found some issues with lib.maybe_convert_objects (which is an internal library function, but the same problems as below occur when calling DataFrame({'a': [...]}), for example).
(Also, mostly unrelated to the below: I noticed the inference loop keeps running after it has seen an object, even though it can safely break at that point because the output is a foregone conclusion. I added the breaks to make it slightly more efficient.)
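To illustrate the early break, here is a rough pure-Python sketch of a type-scanning loop. It is not the actual Cython code in lib.maybe_convert_objects; the function name `scan_kinds` and the kind labels are made up for illustration.

```python
from datetime import datetime


def scan_kinds(values):
    """Collect coarse kind labels for the elements of `values`.

    Hypothetical helper: returns the set of kinds seen and the number of
    elements inspected, so the effect of the early break is visible.
    """
    seen = set()
    inspected = 0
    for v in values:
        inspected += 1
        if v is None:
            seen.add("null")
        elif isinstance(v, bool):  # test bool before int: bool subclasses int
            seen.add("bool")
        elif isinstance(v, int):
            seen.add("int")
        elif isinstance(v, float):
            seen.add("float")
        elif isinstance(v, complex):
            seen.add("complex")
        elif isinstance(v, datetime):
            seen.add("datetime")
        else:
            seen.add("object")
            break  # the result must be object whatever follows, so stop here
    return seen, inspected
```

With `scan_kinds([1, "x", 2.0, None])` the loop stops at the string, so only two elements are inspected.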
Fails on Python long integer outside np.int64 range:
In [4]: arr = np.asarray([2**63], dtype=np.object_)
In [5]: p.lib.maybe_convert_objects(arr)
(exception)
Probably should leave the integer unchanged as a long integer object
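A minimal sketch of the check I have in mind, in plain Python rather than the real Cython code. The names `fits_int64` and `convert_ints` are hypothetical, and the string labels stand in for the actual output dtypes.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def fits_int64(value):
    """Return True if a Python integer is representable as np.int64."""
    return INT64_MIN <= value <= INT64_MAX


def convert_ints(values):
    """Pick 'int64' only when every integer fits; otherwise keep objects.

    The values themselves are returned unchanged in both cases; only the
    label says which storage would be safe.
    """
    if all(fits_int64(v) for v in values):
        return "int64", list(values)
    return "object", list(values)
```

Under this sketch `convert_ints([2**63])` gives `("object", [2**63])` instead of raising.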
Datetimes and complexes lost completely when mixed with booleans:
It's not even doing a cast: it's reading uninitialized memory as boolean
In [6]: arr = np.asarray([dt.datetime(2005, 1, 1), True])
In [7]: p.lib.maybe_convert_objects(arr, convert_datetime=1)
Out[7]: array([ True, True], dtype=bool)
In [8]: arr = np.asarray([1.0+2.0j, True], dtype=np.object_)
In [9]: p.lib.maybe_convert_objects(arr)
Out[9]: array([ True, True], dtype=bool)
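What I would expect instead is that a boolean result is only chosen when every element really is a boolean, and anything mixed stays as object. A small plain-Python sketch of that rule (the helper name `can_convert_to_bool` is hypothetical, not something in pandas):

```python
from datetime import datetime


def can_convert_to_bool(values):
    """True only when `values` is non-empty and every element is a bool."""
    return len(values) > 0 and all(isinstance(v, bool) for v in values)


# The two inputs from the session above would both stay as object arrays.
mixed_datetime = [datetime(2005, 1, 1), True]
mixed_complex = [1.0 + 2.0j, True]
```

Both `can_convert_to_bool(mixed_datetime)` and `can_convert_to_bool(mixed_complex)` are False, so neither input would be turned into a bool array.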
None converts to np.nan with floats, but not complexes:
In [10]: arr = np.asarray([1.0, None], dtype=np.object_)
In [11]: p.lib.maybe_convert_objects(arr)
Out[11]: array([ 1., nan])
In [12]: arr = np.asarray([1.0+2.0j, None], dtype=np.object_)
In [13]: p.lib.maybe_convert_objects(arr)
Out[13]: array([(1+2j), None], dtype=object)
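One way to make the complex case consistent with the float case would be to map None to a complex NaN. This is only a sketch of a possible behavior in plain Python, assuming None should be treated like np.nan here; `to_complex_with_nan` is a made-up name.

```python
import cmath


def to_complex_with_nan(values):
    """Convert numbers to complex, mapping None to nan+0j.

    Assumption: None is treated the same way the float path treats it,
    i.e. as a missing value represented by NaN.
    """
    nan = float("nan")
    return [complex(nan) if v is None else complex(v) for v in values]


result = to_complex_with_nan([1.0 + 2.0j, None])
```

Here `result[0]` is `(1+2j)` and `cmath.isnan(result[1])` is True.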
safe option preserves integer types with np.nan but not None:
The comment for safe says "don't cast int to float, etc." but that is not true with None.
In [14]: arr = np.asarray([1, 2.0, np.nan], dtype=np.object_)
In [15]: p.lib.maybe_convert_objects(arr, safe=1)
Out[15]: array([1, 2.0, nan], dtype=object)
In [16]: arr = np.asarray([1, 2.0, None], dtype=np.object_)
In [17]: p.lib.maybe_convert_objects(arr, safe=1)
Out[17]: array([ 1., 2., nan])
Not entirely sure if this last one is intended behavior or not; let me know if there is any feedback.