Description
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [np.nan, np.nan, np.nan, np.nan, np.nan, 1., np.nan],
                   'b': [1., np.nan, np.nan, 1., np.nan, np.nan, np.nan]}).to_sparse()
print(type(df))
print(type(df.stack()))
<class 'pandas.sparse.frame.SparseDataFrame'>
<class 'pandas.core.series.Series'>
Problem description
I'm trying to convert a SparseDataFrame (obtained from pd.get_dummies()) into a scipy sparse matrix using the experimental .to_coo() method. Because this method works on a Series with a MultiIndex rather than on a DataFrame, I call .stack() on the SparseDataFrame first.
The problem is that .stack() doesn't appear to treat the SparseDataFrame as sparse: it stacks it as dense, consuming too much memory, and returns a (dense) Series.
Returning a dense Series could be all right, since np.nan values are dropped by default (the dropna parameter), but the memory consumption is a problem.
I'm aware the sparse functionality as a whole is not yet mature. I also saw the function pd.sparse.frame.stack_sparse_frame, which I guess is a step toward fixing this problem (it doesn't work for me). Since I couldn't find an existing issue for this problem, I thought it was worth opening one.
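For reference, here is a minimal sketch of one possible workaround that avoids .stack() entirely by collecting the non-NaN cells one column at a time and handing them to scipy directly. The helper name frame_to_coo is hypothetical (it is not part of pandas), NaN is assumed to be the fill value, and the frame is assumed to have at least one column. The example uses a plain DataFrame so that it does not depend on the sparse classes; whether densifying a single column of a SparseDataFrame stays cheap enough is an assumption I have not verified.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def frame_to_coo(df):
    """Build a scipy COO matrix from the non-NaN cells of `df`.

    Hypothetical helper, not a pandas API. Assumes NaN marks "missing"
    and that `df` has at least one column.
    """
    rows, cols, data = [], [], []
    for j, name in enumerate(df.columns):
        # Densify one column at a time, so only a single column
        # is ever held in dense form.
        col = np.asarray(df[name], dtype=float)
        idx = np.flatnonzero(~np.isnan(col))  # row positions of stored values
        rows.append(idx)
        cols.append(np.full(idx.shape, j, dtype=np.intp))
        data.append(col[idx])
    return sparse.coo_matrix(
        (np.concatenate(data), (np.concatenate(rows), np.concatenate(cols))),
        shape=df.shape,
    )


df = pd.DataFrame({'a': [np.nan, np.nan, np.nan, np.nan, np.nan, 1., np.nan],
                   'b': [1., np.nan, np.nan, 1., np.nan, np.nan, np.nan]})
mat = frame_to_coo(df)
print(mat.shape, mat.nnz)
```

This only sidesteps the memory problem for the to_coo() use case; it does not address .stack() itself returning a dense Series.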
Expected Output
<class 'pandas.sparse.frame.SparseDataFrame'>
<class 'pandas.sparse.series.SparseSeries'>
Output of pd.show_versions()
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.7.5-100.fc23.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
pandas: 0.19.2+0.g825876c.dirty
nose: 1.3.7
pip: 9.0.1
setuptools: 23.0.0
Cython: 0.24
numpy: 1.11.1
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.0
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None