SparseDataFrame should be able to handle also non float non sparse Columns #2873

Closed
@benjello

dup of #667
I use pandas version 0.10.1.

I ran into trouble when trying to build a DataFrame whose columns can each be of type int, float, or str,
and can each be sparse or not.

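import random
import string

from numpy.random import randn
from pandas import DataFrame, concat

# Two dense frames of random floats
x = DataFrame(randn(10000, 2), columns=['a', 'b'])
y = DataFrame(randn(10000, 2), columns=['c', 'd'])

# One frame holding a single column of random letters (str)
z = DataFrame([random.choice(string.letters) for i in range(10000)], columns=['e'])

# Zero out rows 0 through 9998 of x, then convert it to a sparse frame
x.ix[:9998] = 0
x = x.to_sparse(fill_value=0)

print x.density
print y.__class__
df = concat([x, y])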

This works fine, but the following does not:

df2 = concat([x, y, z])
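
For reference, the value printed by `x.density` above can be reasoned about without pandas. The sketch below is only an illustration: it assumes that density means the fraction of cells that differ from the fill value, and the `density` helper is a hypothetical stand-in, not the pandas implementation. It also assumes that the label-based slice `x.ix[:9998]` includes row 9998, so only the last row keeps its random values, leaving 2 of the 20000 cells non-zero.

```python
import random


def density(columns, fill_value=0):
    """Hypothetical helper: fraction of cells that differ from fill_value."""
    total = sum(len(col) for col in columns)
    non_fill = sum(1 for col in columns for v in col if v != fill_value)
    return non_fill / float(total)


# Two columns of 10000 random floats, mirroring frame x in the report.
random.seed(0)
n_rows = 10000
a = [random.gauss(0, 1) for _ in range(n_rows)]
b = [random.gauss(0, 1) for _ in range(n_rows)]

# Mimic x.ix[:9998] = 0: rows 0 through 9998 (inclusive) are zeroed,
# so only the final row keeps its random values.
for col in (a, b):
    for i in range(9999):
        col[i] = 0

x_density = density([a, b])
print(x_density)
```
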

Also, the following example does not yield a SparseDataFrame:

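from numpy.random import randn
from pandas import DataFrame, SparseDataFrame

df = DataFrame(randn(10000, 4))
df.ix[:9998] = 0

# Convert the whole frame at once
df1 = df.to_sparse(fill_value=0)
print df1.density

# Assign sparse columns back one at a time; df does not become a SparseDataFrame
df[0] = df[0].to_sparse(fill_value=0)
df[1] = df[1].to_sparse(fill_value=0)
print df[1].__class__
print df.__class__
print SparseDataFrame(df[0]).density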

However, this might be a feature, since modifying df[0] should not modify df.

Finally, the following does not yield a SparseDataFrame either:

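from numpy.random import randn
from pandas import Series, SparseDataFrame

x = Series(randn(10000), name='a')
x = x.to_sparse(fill_value=0)
print x.__class__
# Build a frame from the single sparse series
df = SparseDataFrame(x)
print df.__class__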

I would be very happy to use the power of pandas to deal with sparse structures.
So my last item is a question: is the development of sparse objects a priority for the pandas project team?

Thank you for providing such a nice piece of software.
