Concatting dense and sparse dataframes breaks many common operations #16874

Closed
@kbattocchi

xref #15737

Code Sample, a copy-pastable example if possible

import pandas as pd

# Each statement below raises when a dense and a sparse frame are concatenated.
pd.concat([pd.DataFrame([0.0]), pd.SparseDataFrame([0.0])], axis=1).isnull()
pd.concat([pd.DataFrame([0.0]), pd.SparseDataFrame([0.0])], axis=1).density
pd.concat([pd.DataFrame([0.0], columns=['A']), pd.SparseDataFrame([0.0])], axis=1)['A']
pd.concat([pd.DataFrame([0.0], columns=['A']), pd.SparseDataFrame([0.0])], axis=1).iloc[0,0]

Problem description

Each of the lines above raises an error when the concatenated dataframes are of mixed sparsity, yet each would succeed if both dataframes were dense or both were sparse. As a result, sparse dataframes can't be swapped in for dense ones without knowing how they'll be used downstream.
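To see at a glance which of the four operations fail, they can be wrapped in a small harness. This is only a sketch: `failing_operations`, `mixed_frame` and `OPERATIONS` are hypothetical names introduced here, not pandas API. It assumes a pandas version that still ships `pd.SparseDataFrame` (it exists in the 0.20.2 reported below and was removed in pandas 1.0), so on newer versions every entry would fail simply because the class is missing.

```python
def failing_operations(operations):
    """Run each zero-argument callable; return {name: exception class name} for those that raise."""
    failures = {}
    for name, op in operations.items():
        try:
            op()
        except Exception as exc:  # keep going so every operation is checked
            failures[name] = type(exc).__name__
    return failures


def mixed_frame(**dense_kwargs):
    """Concatenate one dense and one sparse single-cell frame side by side."""
    # Imported lazily so the harness above has no pandas dependency.
    import pandas as pd

    # Assumption: pd.SparseDataFrame is available (pandas < 1.0).
    dense = pd.DataFrame([0.0], **dense_kwargs)
    sparse = pd.SparseDataFrame([0.0])
    return pd.concat([dense, sparse], axis=1)


# The four operations from the code sample in this report.
OPERATIONS = {
    "isnull": lambda: mixed_frame().isnull(),
    "density": lambda: mixed_frame().density,
    "getitem": lambda: mixed_frame(columns=["A"])["A"],
    "iloc": lambda: mixed_frame(columns=["A"]).iloc[0, 0],
}

if __name__ == "__main__":
    # Prints a dict of the operations that raised and the exception type for each.
    print(failing_operations(OPERATIONS))
```

An empty dict would mean that none of the operations raised, which is the behaviour this report asks for.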

Expected Output

None of the lines should raise an error.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.2
pytest: 2.9.2
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.13.0
scipy: 0.18.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0

Labels

Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Sparse (Sparse Data Type)
