Closed
Description
A bit of a cryptic title, but this follows from my investigation of #21390 (comment).
When you have categorical values with "box-able" categories, e.g.:
In [1]: cat = pd.Categorical(pd.date_range("2012-01-01", periods=3, freq='H'))
In [3]: cat.tolist()
Out[3]:
[Timestamp('2012-01-01 00:00:00', freq='H'),
 Timestamp('2012-01-01 01:00:00', freq='H'),
 Timestamp('2012-01-01 02:00:00', freq='H')]
the boxing to Timestamps works for the Categorical itself, but once it is combined into a 2D data structure, the boxing fails and the raw integer (nanosecond) values are returned instead:
In [7]: midx = pd.MultiIndex.from_product([['a', 'b', 'c'], cat])
In [8]: midx.values
Out[8]:
array([('a', 1325376000000000000), ('a', 1325379600000000000),
       ('a', 1325383200000000000), ('b', 1325376000000000000),
       ('b', 1325379600000000000), ('b', 1325383200000000000),
       ('c', 1325376000000000000), ('c', 1325379600000000000),
       ('c', 1325383200000000000)], dtype=object)
In [9]: df = pd.DataFrame({'a':['a', 'b', 'c'], 'b': cat, 'c': np.array(cat)})
In [10]: df.dtypes
Out[10]:
a            object
b          category
c    datetime64[ns]
dtype: object
In [11]: df.values
Out[11]:
array([['a', 1325376000000000000, Timestamp('2012-01-01 00:00:00')],
       ['b', 1325379600000000000, Timestamp('2012-01-01 01:00:00')],
       ['c', 1325383200000000000, Timestamp('2012-01-01 02:00:00')]], dtype=object)
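A possible workaround, as a minimal sketch rather than a proposed fix: cast each categorical column to the dtype of its categories before asking for `.values`, so that the column goes through the regular datetime path (as column `c` does above). The helper name `decategorize` is hypothetical and not part of pandas, and the timestamps are built with `pd.to_datetime` instead of `date_range` only to keep the snippet independent of the frequency alias.

```python
import numpy as np
import pandas as pd

# Same three hourly timestamps as in the report, built without a freq alias.
stamps = pd.to_datetime(
    ["2012-01-01 00:00", "2012-01-01 01:00", "2012-01-01 02:00"]
)
cat = pd.Categorical(stamps)
df = pd.DataFrame({"a": ["a", "b", "c"], "b": cat, "c": np.array(cat)})


def decategorize(frame):
    """Hypothetical helper: cast every categorical column to the dtype
    of its categories (here datetime64) and return a new DataFrame."""
    out = frame.copy()
    for name in out.columns:
        col = out[name]
        if isinstance(col.dtype, pd.CategoricalDtype):
            out[name] = col.astype(col.dtype.categories.dtype)
    return out


# Column "b" is now a plain datetime column, so it is boxed like column "c".
values = decategorize(df).values
print([type(v).__name__ for v in values[:, 1]])
```

This loses the categorical dtype on the copy, so it only makes sense right before converting to a NumPy array.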