Add DataFrame.sparse accessor #25681

Closed
@TomAugspurger

I'd like to add a .sparse accessor to DataFrame, to assist with deprecating SparseDataFrame.

It'll contain

  • from_spmatrix (part of the SparseDataFrame constructor)
  • to_dense (SparseDataFrame.to_dense)
  • to_coo (SparseDataFrame.to_coo)
  • density
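For illustration, here is a minimal sketch of how these four pieces fit together. It assumes a pandas release in which the accessor is available (0.25 or later) and that SciPy is installed; the matrix contents and the column names `a` and `b` are made up for the example.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# A small 3x2 matrix with two non-zero entries (illustrative data).
mat = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 2.0]]))

# Build a DataFrame whose columns are all sparse.
df = pd.DataFrame.sparse.from_spmatrix(mat, columns=["a", "b"])

dense = df.sparse.to_dense()   # regular DataFrame with dense columns
coo = df.sparse.to_coo()       # SciPy COO matrix
density = df.sparse.density    # fraction of stored (non-fill) values
```

Here two of the six cells are stored, so `density` comes out to one third.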

A few design questions:

  1. When should _validate raise?
    a. When there are no sparse columns
    b. When there are any non-sparse columns
    c. Never.

It's slightly easier to implement if we assume everything is sparse.
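To make the three options concrete, here is a rough sketch of the check as a standalone function. The function name `validate_sparse` and its `policy` argument are hypothetical, invented for this example, and the choice of `AttributeError` is an assumption rather than a settled decision.

```python
import pandas as pd


def validate_sparse(frame, policy="all"):
    """Hypothetical check mirroring options (a), (b) and (c).

    policy="any":   raise when there are no sparse columns        (a)
    policy="all":   raise when any column is not sparse           (b)
    policy="never": never raise                                   (c)
    """
    is_sparse = [isinstance(dtype, pd.SparseDtype) for dtype in frame.dtypes]
    if policy == "any" and not any(is_sparse):
        raise AttributeError("no sparse columns found")
    if policy == "all" and not all(is_sparse):
        raise AttributeError("all columns must be sparse")


# A frame with one sparse and one dense column (illustrative data).
mixed = pd.DataFrame({
    "a": pd.arrays.SparseArray([0, 1, 0]),
    "b": [1, 2, 3],
})

validate_sparse(mixed, policy="any")    # passes: one column is sparse
validate_sparse(mixed, policy="never")  # passes: never raises
```

Calling `validate_sparse(mixed, policy="all")` raises, since `b` is dense. Under option (b) every method on the accessor can assume sparse columns, which is the simpler implementation path mentioned above.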

  2. What should DataFrame.sparse.density return? If we mirror SparseDataFrame.density, it returns a float. Would it be more useful to return a Series with the density of each column? (Users can call .mean() if they want the average density.)
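As a sketch of the per-column alternative, the snippet below builds the Series by hand from each column's own density. It relies on the existing `Series.sparse.density` property; the data and the names `per_column` and `overall` are illustrative only.

```python
import pandas as pd

# Two sparse columns with an explicit fill value of 0 (illustrative data).
df = pd.DataFrame({
    "a": pd.arrays.SparseArray([0.0, 1.0, 0.0, 0.0], fill_value=0),
    "b": pd.arrays.SparseArray([1.0, 1.0, 0.0, 0.0], fill_value=0),
})

# Option 2: one density per column, indexed by column name.
per_column = pd.Series({name: col.sparse.density for name, col in df.items()})

# Option 1 can be recovered from option 2 by averaging.
overall = per_column.mean()
```

Because every column in a DataFrame has the same length, the mean of the per-column densities equals the overall fraction of stored values, so the Series form loses no information relative to the float.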

I believe that with these methods, essentially all the functionality of SparseDataFrame will be replicable with a DataFrame of sparse values. The main exception is an expanding __setitem__ creating a sparse column by default, but it's OK not to provide that functionality.

    Labels

    Sparse (Sparse Data Type)
