Closed
dup of #667
I am using pandas version 0.10.1.
I ran into trouble when trying to build a DataFrame whose columns can each be of type int, float, or str, and can each be either sparse or dense.
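import random
import string

from numpy.random import randn
from pandas import DataFrame, Series, SparseDataFrame, concat

# two dense float frames and one dense string frame
x = DataFrame(randn(10000, 2), columns=['a', 'b'])
y = DataFrame(randn(10000, 2), columns=['c', 'd'])
z = DataFrame([random.choice(string.letters) for i in range(10000)], columns=['e'])

# zero out rows 0..9998 (label slices with .ix are inclusive), then make x sparse
x.ix[:9998] = 0
x = x.to_sparse(fill_value=0)
print x.density
print y.__class__

df = concat([x, y])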
This works fine, but the following doesn't:
df2 = concat([x, y, z])
Also, the following example doesn't yield a SparseDataFrame:
df = DataFrame(randn(10000, 4))
df.ix[:9998] = 0
df1 = df.to_sparse(fill_value=0)
print df1.density
df[0] = df[0].to_sparse(fill_value=0)
df[1] = df[1].to_sparse(fill_value=0)
print df[1].__class__
print df.__class__
print SparseDataFrame(df[0]).density
This might be intended, though, since modifying df[0] should not modify df.
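For reference, this is the density value I would expect the examples above to print. It is only a minimal plain-Python sketch, not pandas code: the `density` helper is a hypothetical stand-in, and it assumes that density means the fraction of entries that differ from the fill value.

```python
import random


def density(values, fill_value=0):
    """Fraction of entries that differ from fill_value (assumed definition)."""
    values = list(values)
    if not values:
        return 0.0
    stored = sum(1 for v in values if v != fill_value)
    return float(stored) / len(values)


random.seed(0)

# one column of 10000 random floats, like a single column of randn(10000, n)
column = [random.gauss(0, 1) for _ in range(10000)]

# mimic `.ix[:9998] = 0`: the label slice is inclusive, so rows 0..9998 are zeroed
column[:9999] = [0] * 9999

expected = density(column)
print(expected)
```

With rows 0 through 9998 zeroed out of 10000, only one non-zero entry per column remains, so the expected density is 1/10000.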
Finally, the following doesn't yield a SparseDataFrame either:
x = Series(randn(10000), name='a')
x = x.to_sparse(fill_value=0)
print x.__class__
df = SparseDataFrame(x)
print df.__class__
I would be very happy to use the power of pandas to deal with sparse structures.
So my last item is a question: is the development of sparse objects a priority for the pandas project team?
Thank you for providing such a nice piece of software.