BUG: DataFrame.to_hdf doesn't pass along min_itemsize for index #10381

Closed
@TomAugspurger

Unless I'm seeing something wrong, min_itemsize isn't being applied to the index column:

In [21]: df = DataFrame(dict(A = 'foo', B = 'bar'),index=range(5)).set_index("A")

In [22]: df.to_hdf('store.h5', 'test', format='table', min_itemsize={'index': 10})

In [23]: store = pd.HDFStore('store.h5')

In [24]: store.get_storer('test').table
Out[24]:
/test/table (Table(5,)) ''
  description := {
  "index": StringCol(itemsize=3, shape=(), dflt=b'', pos=0),
  "values_block_0": StringCol(itemsize=3, shape=(1,), dflt=b'', pos=1)}   # <---- I think this should be 10
  byteorder := 'irrelevant'
  chunkshape := (10922,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False}

And FYI, passing the index name as the min_itemsize key raises (not sure if it should work):

In [25]: df.index.name = 'theindex'

In [26]: df.to_hdf('store.h5', 'test2', format='table', min_itemsize={'theindex': 10})
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

Just a report right now... no time.
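
For anyone who wants to check this on their own install, here is a minimal standalone sketch of the same check as the session above. It assumes pandas with PyTables installed. The helper name index_itemsize is hypothetical, and reading the stored size through the PyTables coldtypes mapping is my choice rather than anything pandas prescribes. On an affected version it should report 3 instead of the requested 10.

```python
import os
import tempfile

import pandas as pd


def index_itemsize(min_size):
    """Write the frame from the report and return the stored itemsize of the index column.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame({"A": "foo", "B": "bar"}, index=range(5)).set_index("A")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.h5")
        # Same call as in the report, with the key passed by keyword.
        df.to_hdf(path, key="test", format="table", min_itemsize={"index": min_size})
        with pd.HDFStore(path, mode="r") as store:
            table = store.get_storer("test").table
            # coldtypes maps each PyTables column name to its NumPy dtype.
            return int(table.coldtypes["index"].itemsize)


if __name__ == "__main__":
    print(index_itemsize(10))
```

If the printed value is smaller than the requested size, min_itemsize was not passed along for the index.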
