Description
Code Sample
#!/usr/bin/env python3
import pandas as pd
from datetime import datetime
a = pd.DataFrame([[datetime.now(), 1234, 3.1415]], columns=['col0', 'col1', 'col2']).iloc[[]]
print(a.shape)
# (0, 3) <- 0 rows, 3 columns
print(a.dtypes)
# col0 datetime64[ns]
# col1 int64
# col2 float64
# dtype: object
b = a.set_index(['col0'])
print(b.reset_index().dtypes)
# col0 datetime64[ns] <- all preserved
# col1 int64
# col2 float64
# dtype: object
b['col3'] = []
print(b.reset_index().dtypes)
# index int64 <- column name is lost and dtype is changed to int64
# col1 int64
# col2 float64
# col3 float64
# dtype: object
c = a.set_index(['col0', 'col1'])
print(c.reset_index().dtypes)
# col0 float64 <- column names are preserved but dtypes are changed to float64
# col1 float64 <-
# col2 float64
# dtype: object
c['col3'] = []
print(c.reset_index().dtypes)
# col0 float64 <- column names are still preserved
# col1 float64 <-
# col2 float64
# col3 float64
# dtype: object
Problem description
When certain operations are performed on zero-row DataFrames, information about their indexes is lost. Furthermore, what is lost and what triggers the loss are inconsistent between a MultiIndex and a regular Index.
For a DataFrame with a regular (non-MultiIndex) index, both the index dtype and x.index.name are lost when a new column is added by assigning an empty list.
For a DataFrame with a MultiIndex, the index dtypes are lost as soon as x.set_index([x, y, ...]) is called. However, x.index.names are preserved when a new column is added.
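To make the comparison easier to repeat across pandas versions, here is a small sketch of a helper that reports which columns change dtype (or disappear) after a set_index / reset_index round trip. The name roundtrip_changes is made up for this illustration and is not part of pandas. No expected output is shown for the zero-row case because it depends on the installed version.

```python
import pandas as pd
from datetime import datetime


def roundtrip_changes(df, keys):
    """Hypothetical helper: report columns whose dtype differs (or whose
    name is missing) after df.set_index(keys).reset_index()."""
    result = df.set_index(keys).reset_index()
    changes = {}
    for col, before in df.dtypes.items():
        if col not in result.columns:
            # The column name did not survive the round trip.
            changes[col] = (str(before), None)
        elif result[col].dtype != before:
            changes[col] = (str(before), str(result[col].dtype))
    return changes


full = pd.DataFrame([[datetime.now(), 1234, 3.1415]],
                    columns=['col0', 'col1', 'col2'])
empty = full.iloc[[]]

print(roundtrip_changes(full, ['col0', 'col1']))   # non-zero-row frame
print(roundtrip_changes(empty, ['col0', 'col1']))  # zero-row frame
```

An empty dict means the round trip preserved every column name and dtype, which is the behaviour I would expect for both frames.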
Expected Output
In my opinion, preserving all dtypes and names in each of these cases would have few adverse effects, and it would be consistent with how the same operations behave on non-zero-row DataFrames.
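As a possible workaround for the single-index case, the new column can be assigned as an empty Series that carries the frame's own index and an explicit dtype, instead of a bare empty list. This is only a sketch: it assumes that giving pandas an index and dtype leaves it nothing to infer, and I have not confirmed that it behaves the same on every pandas version.

```python
import pandas as pd
from datetime import datetime

a = pd.DataFrame([[datetime.now(), 1234, 3.1415]],
                 columns=['col0', 'col1', 'col2']).iloc[[]]
b = a.set_index(['col0'])

# Assumption: an empty Series that reuses b's index and states its dtype
# avoids the index being rebuilt from the (empty) assigned value.
b['col3'] = pd.Series([], index=b.index, dtype='float64')

print(b.index.name)
print(b.reset_index().dtypes)
```

If the workaround holds, the index name stays col0 and col0 keeps its datetime dtype after reset_index().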
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.3.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None