Description
HDFStore.append_to_multiple passes its entire min_itemsize argument to every sub-append. Because not every column is present in every sub-table, the call fails when it tries to set a min_itemsize for a column while appending to a table that doesn't contain that column.
Simple example:
>>> store.append_to_multiple({
... 'index': ["IX"],
... 'nums': ["Num", "BigNum", "RandNum"],
... "strs": ["Str", "LongStr"]
... }, d.iloc[[0]], 'index', min_itemsize={"Str": 10, "LongStr": 100})
Traceback (most recent call last):
  File "<pyshell#52>", line 5, in <module>
    }, d.iloc[[0]], 'index', min_itemsize={"Str": 10, "LongStr": 100})
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 1002, in append_to_multiple
    self.append(k, val, data_columns=dc, **kwargs)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 920, in append
    **kwargs)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 1265, in _write_to_group
    s.write(obj=value, append=append, complib=complib, **kwargs)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 3773, in write
    **kwargs)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 3460, in create_axes
    self.validate_min_itemsize(min_itemsize)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\io\pytables.py", line 3101, in validate_min_itemsize
    "data_column" % k)
ValueError: min_itemsize has the key [LongStr] which is not an axis or data_column
This apparently means that you can't use min_itemsize with append_to_multiple unless you manually create and append to each separate table beforehand, which is exactly the kind of work append_to_multiple is supposed to shield you from.
I think append_to_multiple should special-case min_itemsize and split it into a separate dict for each sub-append. I don't know whether other kwargs also need to be "allocated" separately to sub-appends, but if there are, it might be good to split them too.