HDFStore.select slowed by decode even when using columns= #5441

Closed
@wabu

Description

While profiling a slow select (200% more wall time than a direct PyTables call, plus high memory usage), I found that most of the time is spent inside bytes.decode, called by _unconvert_strings_array, even when selecting only int64 columns. It seems to spend time and memory decoding strings that are never returned.
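
To illustrate the suspected behavior, here is a minimal toy sketch. It is not pandas' actual code: the function names and the row layout are hypothetical. It only contrasts decoding every string value before column selection with decoding just the requested columns, and counts the decode calls in each case.

```python
# Toy model only: this is NOT pandas' implementation, just an illustration
# of decoding every string column versus only the requested ones.


def select_decode_all(rows, columns):
    """Decode every bytes value, then keep only the requested columns."""
    decode_calls = 0
    result = []
    for row in rows:
        decoded = {}
        for name, value in row.items():
            if isinstance(value, bytes):
                value = value.decode("utf-8")
                decode_calls += 1
            decoded[name] = value
        result.append({name: decoded[name] for name in columns})
    return result, decode_calls


def select_decode_requested(rows, columns):
    """Keep only the requested columns, decoding just those that are bytes."""
    decode_calls = 0
    result = []
    for row in rows:
        selected = {}
        for name in columns:
            value = row[name]
            if isinstance(value, bytes):
                value = value.decode("utf-8")
                decode_calls += 1
            selected[name] = value
        result.append(selected)
    return result, decode_calls


# Hypothetical table: one int column and one encoded string column.
rows = [{"count": i, "label": ("item-%d" % i).encode("utf-8")} for i in range(1000)]

all_result, all_calls = select_decode_all(rows, ["count"])
req_result, req_calls = select_decode_requested(rows, ["count"])
```

Both functions return the same data for the int column, but in this model the first one decodes the string column once per row even though its values are discarded. That is the pattern the profile seems to point to.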

I'm using Python 3.3 and the latest pandas (commit 2d2e8b5).

I'm happy to provide more details if needed.

Labels: IO HDF5 (read_hdf, HDFStore), Performance (memory or execution speed performance), Unicode (Unicode strings)
