Commit a52df7d

Merge PR #2607
2 parents e7f2a4e + 4a45512 commit a52df7d

6 files changed: +1039 -732 lines changed

RELEASE.rst

Lines changed: 10 additions & 3 deletions
@@ -34,9 +34,9 @@ pandas 0.10.1
 **Improvements to existing features**
 
 - ``HDFStore``
+
 - enables storing of multi-index dataframes (closes GH1277_)
-- support data column indexing and selection, via ``data_columns`` keyword
-in append
+- support data column indexing and selection, via ``data_columns`` keyword in append
 - support write chunking to reduce memory footprint, via ``chunksize``
 keyword to append
 - support automagic indexing via ``index`` keywork to append
@@ -50,18 +50,22 @@ pandas 0.10.1
 to do multiple-table append/selection
 - added support for datetime64 in columns
 - added method ``unique`` to select the unique values in an indexable or data column
+- added method ``copy`` to copy an existing store (and possibly upgrade)
+- show the shape of the data on disk for non-table stores when printing the store
 - Add ``logx`` option to DataFrame/Series.plot (GH2327_, #2565)
 - Support reading gzipped data from file-like object
 - ``pivot_table`` aggfunc can be anything used in GroupBy.aggregate (GH2643_)
 
 **Bug fixes**
 
 - ``HDFStore``
+
 - correctly handle ``nan`` elements in string columns; serialize via the
 ``nan_rep`` keyword to append
 - raise correctly on non-implemented column types (unicode/date)
 - handle correctly ``Term`` passed types (e.g. ``index<1000``, when index
 is ``Int64``), (closes GH512_)
+- handle Timestamp correctly in data_columns (closes GH2637_)
 - Fix DataFrame.info bug with UTF8-encoded columns. (GH2576_)
 - Fix DatetimeIndex handling of FixedOffset tz (GH2604_)
 - More robust detection of being in IPython session for wide DataFrame
@@ -78,19 +82,22 @@ pandas 0.10.1
 **API Changes**
 
 - ``HDFStore``
+
+- refactored HFDStore to deal with non-table stores as objects, will allow future enhancements
 - removed keyword ``compression`` from ``put`` (replaced by keyword
 ``complib`` to be consistent across library)
 
 .. _GH512: https://github.com/pydata/pandas/issues/512
 .. _GH1277: https://github.com/pydata/pandas/issues/1277
 .. _GH2327: https://github.com/pydata/pandas/issues/2327
-.. _GH2576: https://github.com/pydata/pandas/issues/2576
 .. _GH2585: https://github.com/pydata/pandas/issues/2585
 .. _GH2599: https://github.com/pydata/pandas/issues/2599
 .. _GH2604: https://github.com/pydata/pandas/issues/2604
+.. _GH2576: https://github.com/pydata/pandas/issues/2576
 .. _GH2616: https://github.com/pydata/pandas/issues/2616
 .. _GH2625: https://github.com/pydata/pandas/issues/2625
 .. _GH2643: https://github.com/pydata/pandas/issues/2643
+.. _GH2637: https://github.com/pydata/pandas/issues/2637
 
 pandas 0.10.0
 =============

doc/source/_static/legacy_0.10.h5

233 KB
Binary file not shown.

doc/source/io.rst

Lines changed: 25 additions & 12 deletions
@@ -1106,7 +1106,7 @@ Storing Mixed Types in a Table
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Storing mixed-dtype data is supported. Strings are store as a fixed-width using the maximum size of the appended column. Subsequent appends will truncate strings at this length.
-Passing ``min_itemsize = { `values` : size }`` as a parameter to append will set a larger minimum for the string columns. Storing ``floats, strings, ints, bools, datetime64`` are currently supported. For string columns, passing ``nan_rep = 'my_nan_rep'`` to append will change the default nan representation on disk (which converts to/from `np.nan`), this defaults to `nan`.
+Passing ``min_itemsize = { `values` : size }`` as a parameter to append will set a larger minimum for the string columns. Storing ``floats, strings, ints, bools, datetime64`` are currently supported. For string columns, passing ``nan_rep = 'nan'`` to append will change the default nan representation on disk (which converts to/from `np.nan`), this defaults to `nan`.
 
 .. ipython:: python
 
@@ -1115,9 +1115,6 @@ Passing ``min_itemsize = { `values` : size }`` as a parameter to append will set
 df_mixed['int'] = 1
 df_mixed['bool'] = True
 df_mixed['datetime64'] = Timestamp('20010102')
-
-# make sure that we have datetime64[ns] types
-df_mixed = df_mixed.convert_objects()
 df_mixed.ix[3:5,['A','B','string','datetime64']] = np.nan
 
 store.append('df_mixed', df_mixed, min_itemsize = { 'values' : 50 })
@@ -1128,8 +1125,6 @@ Passing ``min_itemsize = { `values` : size }`` as a parameter to append will set
 # we have provided a minimum string column size
 store.root.df_mixed.table
 
-It is ok to store ``np.nan`` in a ``float or string``. Make sure to do a ``convert_objects()`` on the frame before storing a ``np.nan`` in a datetime64 column. Storing a column with a ``np.nan`` in a ``int, bool`` will currently throw an ``Exception`` as these columns will have converted to ``object`` type.
-
 Storing Multi-Index DataFrames
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
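Note on the fixed-width string storage described in the hunks above: the following is a toy model in plain Python, not pandas' or PyTables' actual implementation. It only illustrates why a later append can truncate strings and what ``min_itemsize`` and ``nan_rep`` change. The helper names ``column_width`` and ``store_fixed_width`` are hypothetical, and ``None`` stands in for ``np.nan`` to keep the sketch dependency-free.

```python
def column_width(values, min_itemsize=0):
    """Width fixed at first append: the longest string seen, or
    min_itemsize if that is larger (hypothetical helper)."""
    return max(max(len(v) for v in values), min_itemsize)


def store_fixed_width(values, itemsize, nan_rep="nan"):
    """Encode values into byte cells of at most `itemsize` bytes.

    Missing values (None here, standing in for np.nan) are written
    as `nan_rep`; anything longer than `itemsize` is cut off.
    """
    cells = []
    for v in values:
        text = nan_rep if v is None else str(v)
        cells.append(text.encode("utf-8")[:itemsize])
    return cells


first_append = ["apple", "kiwi"]

# Without min_itemsize the width is the longest string in the first append.
narrow = column_width(first_append)
print(store_fixed_width(["banana", None], narrow))  # [b'banan', b'nan']

# With a larger minimum, the later and longer string survives intact.
wide = column_width(first_append, min_itemsize=50)
print(store_fixed_width(["banana", None], wide))  # [b'banana', b'nan']
```

In this model the width is decided once, at the first append, which is why the documentation suggests passing ``min_itemsize`` up front when longer strings are expected later.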
@@ -1268,11 +1263,11 @@ To retrieve the *unique* values of an indexable or data column, use the method `
 
 **Table Object**
 
-If you want to inspect the table object, retrieve via ``get_table``. You could use this progamatically to say get the number of rows in the table.
+If you want to inspect the stored object, retrieve via ``get_storer``. You could use this progamatically to say get the number of rows in an object.
 
 .. ipython:: python
 
-store.get_table('df_dc').nrows
+store.get_storer('df_dc').nrows
 
 Multiple Table Queries
 ~~~~~~~~~~~~~~~~~~~~~~
@@ -1348,7 +1343,7 @@ Or on-the-fly compression (this only applies to tables). You can turn off file c
 
 - ``ptrepack --chunkshape=auto --propindexes --complevel=9 --complib=blosc in.h5 out.h5``
 
-Furthermore ``ptrepack in.h5 out.h5`` will *repack* the file to allow you to reuse previously deleted space (alternatively, one can simply remove the file and write again).
+Furthermore ``ptrepack in.h5 out.h5`` will *repack* the file to allow you to reuse previously deleted space. Aalternatively, one can simply remove the file and write again, or use the ``copy`` method.
 
 Notes & Caveats
 ~~~~~~~~~~~~~~~
@@ -1372,10 +1367,28 @@ Notes & Caveats
 Compatibility
 ~~~~~~~~~~~~~
 
-0.10 of ``HDFStore`` is backwards compatible for reading tables created in a prior version of pandas,
-however, query terms using the prior (undocumented) methodology are unsupported. ``HDFStore`` will issue a warning if you try to use a prior-version format file. You must read in the entire
-file and write it out using the new format to take advantage of the updates. The group attribute ``pandas_version`` contains the version information.
+0.10.1 of ``HDFStore`` is backwards compatible for reading tables created in a prior version of pandas however, query terms using the prior (undocumented) methodology are unsupported. ``HDFStore`` will issue a warning if you try to use a prior-version format file. You must read in the entire file and write it out using the new format, using the method ``copy`` to take advantage of the updates. The group attribute ``pandas_version`` contains the version information. ``copy`` takes a number of options, please see the docstring.
+
+
+.. ipython:: python
+
+# a legacy store
+import os
+legacy_store = HDFStore('legacy_0.10.h5', 'r')
+legacy_store
 
+# copy (and return the new handle)
+new_store = legacy_store.copy('store_new.h5')
+new_store
+new_store.close()
+
+.. ipython:: python
+:suppress:
+
+legacy_store.close()
+import os
+os.remove('store_new.h5')
+
 
 Performance
 ~~~~~~~~~~~
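Note on the ``copy`` workflow added in the Compatibility hunk above: a minimal sketch of wrapping the same open, copy, close sequence in a helper, assuming a pandas installation with PyTables available. The function name ``upgrade_store`` and the choice to return the copied keys are illustrative assumptions, not part of this commit.

```python
def upgrade_store(src_path, dst_path):
    """Copy every object in the store at src_path into a new store at
    dst_path and return the keys that were copied (hypothetical helper)."""
    # Imported lazily: HDFStore needs both pandas and PyTables at runtime.
    import pandas as pd

    legacy = pd.HDFStore(src_path, mode="r")
    try:
        # copy() writes a new file and returns an open handle to it.
        new_store = legacy.copy(dst_path)
        try:
            return sorted(new_store.keys())
        finally:
            new_store.close()
    finally:
        legacy.close()
```

Closing both handles in ``finally`` blocks mirrors the explicit ``close()`` calls in the documentation example and avoids leaving either file open if the copy fails partway.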

doc/source/v0.10.1.txt

Lines changed: 3 additions & 0 deletions
@@ -17,6 +17,9 @@ New features
 HDFStore
 ~~~~~~~~
 
+You may need to upgrade your existing data files. Please visit the **compatibility** section in the main docs.
+
+
 .. ipython:: python
 :suppress:
 :okexcept:
