
Another HDFStore error #2784

Closed

@jostheim

Description

After a long run of extracting features for some random forest work, I ran into this error when serializing the features:

Traceback (most recent call last):
  File "XXXXX.py", line 1043, in <module>
    write_dataframe("features", all_df, store)
  File "XXXXX.py", line 55, in write_dataframe
    store[name] = df
  File "/Library/Python/2.7/site-packages/pandas/io/pytables.py", line 218, in __setitem__
    self.put(key, value)
  File "/Library/Python/2.7/site-packages/pandas/io/pytables.py", line 458, in put
    self._write_to_group(key, value, table=table, append=append, **kwargs)
  File "/Library/Python/2.7/site-packages/pandas/io/pytables.py", line 788, in _write_to_group
    s.write(obj=value, append=append, complib=complib, **kwargs)
  File "/Library/Python/2.7/site-packages/pandas/io/pytables.py", line 1837, in write
    self.write_array('block%d_values' % i, blk.values)
  File "/Library/Python/2.7/site-packages/pandas/io/pytables.py", line 1639, in write_array
    self.handle.createArray(self.group, key, value)
  File "/Library/Python/2.7/site-packages/tables-2.4.0-py2.7-macosx-10.8-intel.egg/tables/file.py", line 780, in createArray
    object=object, title=title, byteorder=byteorder)
  File "/Library/Python/2.7/site-packages/tables-2.4.0-py2.7-macosx-10.8-intel.egg/tables/array.py", line 167, in __init__
    byteorder, _log)
  File "/Library/Python/2.7/site-packages/tables-2.4.0-py2.7-macosx-10.8-intel.egg/tables/leaf.py", line 263, in __init__
    super(Leaf, self).__init__(parentNode, name, _log)
  File "/Library/Python/2.7/site-packages/tables-2.4.0-py2.7-macosx-10.8-intel.egg/tables/node.py", line 250, in __init__
    self._v_objectID = self._g_create()
  File "/Library/Python/2.7/site-packages/tables-2.4.0-py2.7-macosx-10.8-intel.egg/tables/array.py", line 200, in _g_create
    nparr, self._v_new_title, self.atom)
  File "hdf5Extension.pyx", line 884, in tables.hdf5Extension.Array._createArray (tables/hdf5Extension.c:8498)
tables.exceptions.HDF5ExtError: Problems creating the Array.

The error message is pretty uninformative. I know the DataFrame I was writing was big: more than 17,000 columns by more than 20,000 rows, with lots of np.nan values scattered through the columns.
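
For reference, this is roughly the pattern that fails (a scaled-down sketch; my write_dataframe helper just assigns into the store, as the traceback shows, and the exact data generation here is only illustrative):

```python
import numpy as np
import pandas as pd

# Sketch of the failing write: the real frame was >20000 rows x >17000
# float columns with lots of np.nan. Shrink these sizes to experiment.
n_rows, n_cols = 20000, 17000
data = np.random.randn(n_rows, n_cols)
data[np.random.rand(n_rows, n_cols) < 0.3] = np.nan  # sprinkle in NaNs
all_df = pd.DataFrame(data, columns=['feat_%d' % i for i in range(n_cols)])

store = pd.HDFStore('features.h5')
store['features'] = all_df  # same path as store[name] = df in write_dataframe
store.close()
```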

Since I seem to be one of the few people serializing data sets this large, and I have a 64 GB RAM machine sitting next to me, are there test cases I could run, or write, that would help? I'm thinking of setting up large mixed-dtype DataFrames, etc.
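
Something along these lines is the kind of test case I have in mind (just a sketch; make_mixed_frame is a made-up helper, and the sizes and dtype mix are arbitrary):

```python
import numpy as np
import pandas as pd

def make_mixed_frame(n_rows, n_float_cols, n_str_cols):
    # Hypothetical helper: build a large mixed-dtype frame with missing values.
    floats = pd.DataFrame(np.random.randn(n_rows, n_float_cols),
                          columns=['f%d' % i for i in range(n_float_cols)])
    floats[floats > 2.0] = np.nan  # inject some NaNs
    strings = pd.DataFrame({'s%d' % i: ['val_%d' % (j % 100) for j in range(n_rows)]
                            for i in range(n_str_cols)})
    return pd.concat([floats, strings], axis=1)

df = make_mixed_frame(n_rows=20000, n_float_cols=17000, n_str_cols=5)

store = pd.HDFStore('stress.h5')
store.put('mixed_fixed', df)     # the fixed-format path that failed above
store.append('mixed_table', df)  # table format, for comparison
store.close()
```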

Labels: Bug, IO Data
