BUG: df.groupby(TimeGrouper(freq='48min', closed='right')) #6085

Closed
@shura-v

related: #4197, #5694

I'm encountering a bug in pandas 0.13.0:

import pandas
from pandas import TimeGrouper

df = pandas.read_hdf('d:\\dropbox\\s.h5', 'data')
df.groupby(TimeGrouper(freq='48min', closed='right'))
Traceback (most recent call last):
  File "D:/Dropbox/hf/tests/hdf.py", line 94, in <module>
    main()
  File "D:/Dropbox/hf/tests/hdf.py", line 74, in load
    df.groupby(TimeGrouper(freq='48min', closed='right'))
  File "C:\Anaconda\lib\site-packages\pandas\core\generic.py", line 2434, in groupby
    sort=sort, group_keys=group_keys, squeeze=squeeze)
  File "C:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 789, in groupby
    return klass(obj, by, **kwds)
  File "C:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 238, in __init__
    level=level, sort=sort)
  File "C:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 1563, in _get_grouper
    gpr = key.get_grouper(obj)
  File "C:\Anaconda\lib\site-packages\pandas\tseries\resample.py", line 109, in get_grouper
    return self._get_time_grouper(obj)[1]
  File "C:\Anaconda\lib\site-packages\pandas\tseries\resample.py", line 115, in _get_time_grouper
    binner, bins, binlabels = self._get_time_bins(axis)
  File "C:\Anaconda\lib\site-packages\pandas\tseries\resample.py", line 150, in _get_time_bins
    bins = lib.generate_bins_dt64(ax_values, bin_edges, self.closed)
  File "lib.pyx", line 935, in pandas.lib.generate_bins_dt64 (pandas\lib.c:14968)
ValueError: Values falls after last bin
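
For readers unfamiliar with the message, the following is a rough pure-Python sketch of the kind of bounds check that can produce it when values are assigned to right-closed bins. It is an illustration only, not the pandas Cython code: the function name `generate_bins_right_closed` and the sample timestamps are made up for this example, and the real `generate_bins_dt64` works on int64 arrays.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def generate_bins_right_closed(values, edges):
    """Cumulative count of values that are <= each right bin edge.

    Hypothetical helper for illustration. Both inputs must be sorted.
    """
    if values[0] < edges[0]:
        raise ValueError("Values falls before first bin")
    if values[-1] > edges[-1]:
        # The last value lies beyond the last edge, so no bin can hold it.
        raise ValueError("Values falls after last bin")
    return [bisect_right(values, right) for right in edges[1:]]


start = datetime(2014, 1, 1)
# Three edges, 48 minutes apart: 00:00, 00:48, 01:36.
edges = [start + timedelta(minutes=48 * k) for k in range(3)]
values = [
    start + timedelta(minutes=10),
    start + timedelta(minutes=48),  # sits exactly on a right edge
    start + timedelta(minutes=60),
]

bins = generate_bins_right_closed(values, edges)
print(bins)

# A value past the last edge (01:40 > 01:36) triggers the error.
try:
    generate_bins_right_closed(values + [start + timedelta(minutes=100)], edges)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If the real check is similar, the error would mean that the generated bin edges stop before the last timestamp in the index.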

The call works as long as the frame is truncated to at most 23748 rows:

df = pandas.read_hdf('d:\\dropbox\\s.h5', 'data')
df = df[:23748]
df.groupby(TimeGrouper(freq='48min', closed='right'))

It also works when I group with TimeGrouper(freq='48min'), that is, without closed='right'.
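
For anyone without access to the HDF file, here is a minimal sketch that runs the same two groupings on synthetic data. It makes several assumptions: the index is an invented minute-frequency range and is not the index from `s.h5`, so it may well not trigger the ValueError. It also uses `pd.Grouper`, which replaced `TimeGrouper` in later pandas releases, rather than the 0.13.0 spelling from the report.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the reporter's data (assumption: the real index
# in s.h5 is not reproduced here).
index = pd.date_range("2014-01-01 09:00", periods=1000, freq="min")
df = pd.DataFrame({"value": np.arange(len(index))}, index=index)

# Default (left-closed) bins, the variant reported as working.
left_sizes = df.groupby(pd.Grouper(freq="48min")).size()

# Right-closed bins, the variant reported as failing on the real data.
right_sizes = df.groupby(pd.Grouper(freq="48min", closed="right")).size()

print(left_sizes.head())
print(right_sizes.head())
```

Comparing the first and last bin labels of the two results against `df.index[0]` and `df.index[-1]` is one way to see how the edges move when `closed` changes.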

Here is the data frame:
https://www.dropbox.com/s/ezrjralhb2yordm/s.h5

INSTALLED VERSIONS
------------------
Python: 2.7.5.final.0
OS: Windows
Release: 8
Processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.13.0
Cython: 0.19.2
Numpy: 1.7.1
Scipy: 0.13.2
statsmodels: 0.5.0
    patsy: 0.2.1
scikits.timeseries: Not installed
dateutil: 1.5
pytz: 2013b
bottleneck: Not installed
PyTables: 3.0.0
    numexpr: 2.2.2
matplotlib: 1.3.1
openpyxl: 1.6.2
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: Not installed
sqlalchemy: 0.8.3
lxml: 3.2.3
bs4: 4.3.1
html5lib: Not installed
bigquery: Not installed
apiclient: Not installed

Labels: Bug, Needs Info, Resample, Timezones
