Research
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
Question about pandas
I found that setting a pandas DataFrame column with a 2D numpy array whose dtype is object causes a weird error. I wonder why this happens.
The code I ran is as follows:
import numpy as np
import pandas as pd
print(f"numpy version: {np.__version__}")
print(f"pandas version: {pd.__version__}")
data = pd.DataFrame({
    "c1": [1, 2, 3, 4, 5],
})
t1 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]])
data["c1"] = t1 # This works well
t2 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]], dtype=object)
data["c1"] = t2 # This throws an error
Result (some unrelated path removed):
numpy version: 2.2.3
pandas version: 2.2.3
Traceback (most recent call last):
File "...\test.py", line 15, in <module>
data["c1"] = t2 # This throws an error
~~~~^^^^^^
File "...\Anaconda\envs\test\Lib\site-packages\pandas\core\frame.py", line 4311, in __setitem__
self._set_item(key, value)
~~~~~~~~~~~~~~^^^^^^^^^^^^
File "...\Anaconda\envs\test\Lib\site-packages\pandas\core\frame.py", line 4524, in _set_item
value, refs = self._sanitize_column(value)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^
File "...\Anaconda\envs\test\Lib\site-packages\pandas\core\frame.py", line 5267, in _sanitize_column
arr = sanitize_array(value, self.index, copy=True, allow_2d=True)
File "...\Anaconda\envs\test\Lib\site-packages\pandas\core\construction.py", line 606, in sanitize_array
subarr = maybe_infer_to_datetimelike(data)
File "...\Anaconda\envs\test\Lib\site-packages\pandas\core\dtypes\cast.py", line 1181, in maybe_infer_to_datetimelike
raise ValueError(value.ndim) # pragma: no cover
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: 2
I'm not sure whether this is the expected behaviour. I find it strange that simply adding dtype=object causes the error.
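For reference, here is a minimal sketch of a possible workaround, not a confirmed fix: both arrays have the same (5, 1) shape and differ only in dtype, so collapsing the array to 1D before assignment avoids passing a 2D object array at all. The helper name `to_1d_column` is hypothetical and is not part of pandas or numpy.

```python
import numpy as np
import pandas as pd


def to_1d_column(values):
    """Hypothetical helper: collapse an (n, 1) array to shape (n,)."""
    arr = np.asarray(values)
    if arr.ndim == 2 and arr.shape[1] == 1:
        # Take the single column so the result is one-dimensional.
        return arr[:, 0]
    return arr


data = pd.DataFrame({"c1": [1, 2, 3, 4, 5]})

t1 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]])
t2 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]], dtype=object)

# Both arrays have the same shape; only the dtype differs.
print(t1.shape, t1.dtype)
print(t2.shape, t2.dtype)

# Assign a 1D view of the object array instead of the 2D array itself.
data["c1"] = to_1d_column(t2)
print(data["c1"].tolist())
```

This only sidesteps the error; it does not answer whether assigning the 2D object array directly is supposed to work.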