Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
# !pip install -U pandas
import pandas as pd
import numpy as np
# Create arbitrary 3-channel 8-bit image
data = np.random.randint(0, 256, size=(50, 50, 3), dtype='uint8')
# Replicate this image and treat each image as a row of pixel features
rows = [data.reshape(-1)]*2
assert all(r.dtype == np.uint8 for r in rows)
print(pd.DataFrame(rows).dtypes)
# int8??? (unexpected: every row is uint8)
print(pd.DataFrame(np.vstack(rows)).dtypes)
# uint8 (expected)
Issue Description
Apologies if this issue already exists; type-related issues are hard for me to navigate on GitHub because there are too many to search through easily.
uint8 subarray dtypes are silently converted to int8 when constructing a DataFrame from a list of arrays.
I tried testing on master, but the build fails on Windows for me (error output below):
building 'pandas._libs.algos' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
----------------------------------------
ERROR: Failed building wheel for pandas
Failed to build pandas
ERROR: Could not build wheels for pandas which use PEP 517 and cannot be installed directly
Expected Behavior
The dtype of every element in every row is uint8, so I would expect the resulting columns to be uint8 as well.
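Until this is resolved, a possible workaround is to stack the rows into a single 2-D array before handing them to the constructor, which is what the second print in the example above does. The sketch below wraps that in a helper. `frame_from_rows` is a hypothetical name used only for illustration, not a pandas function, and it assumes every row is a 1-D array of the same length and dtype.

```python
import numpy as np
import pandas as pd


def frame_from_rows(rows):
    """Build a DataFrame from equal-length 1-D arrays, keeping their dtype.

    Hypothetical helper (not part of pandas): the rows are stacked into one
    2-D ndarray first, so the constructor receives a single array with a
    single dtype instead of a Python list of arrays.
    """
    stacked = np.vstack(rows)
    return pd.DataFrame(stacked)


# Same setup as the reproducible example: two flattened 50x50x3 uint8 images.
data = np.random.randint(0, 256, size=(50, 50, 3), dtype='uint8')
rows = [data.reshape(-1)] * 2

df = frame_from_rows(rows)
print(df.dtypes.unique())
```

This only sidesteps the list-of-arrays code path. It does not explain why that path picks int8.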
Installed Versions
INSTALLED VERSIONS
commit : 73c6825
python : 3.9.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.3.3
numpy : 1.19.5
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.3
setuptools : 52.0.0.post20210125
Cython : None
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.4.3
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.23.1
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : 1.4.15
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
numba : None