-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
!I fix for BUG: resample with tz-aware: Values falls after last bin #15549 #18337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 18 commits
68a02aa
47b99ac
04ce929
a43bb19
44fc3f2
2a1d24c
d28ff98
7c770c0
effa650
aa8c0af
2db6576
2744ad3
fb4d405
93842d0
fc795c6
514fe24
777a12a
037c9dc
d5ac67e
61e84d6
e049112
e252b20
1888355
f4ed7c0
9b18f4b
5b035f0
0ec3bd0
b523915
a1b59c4
8cd36c9
aec765f
2e77f72
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1130,6 +1130,9 @@ def _get_time_bins(self, ax): | |
closed=self.closed, | ||
base=self.base) | ||
tz = ax.tz | ||
values_present = isinstance(getattr(self, 'obj', None), | ||
(pd.DataFrame, pd.Series)) | ||
|
||
# GH #12037 | ||
# use first/last directly instead of call replace() on them | ||
# because replace() will swallow the nanosecond part | ||
|
@@ -1139,7 +1142,8 @@ def _get_time_bins(self, ax): | |
start=first, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what exactly are you trying to do here? what is the purpose of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the case is described here #15549 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so in case if we are resampling series I believe an extra bin should be generated to handle this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. purpose of values_present is to be able to say which case is that, that we are creating datetimeindex for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hope that makes sense There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand the case. My point is that you can detect what you need and simply generate different bins at that point, rather than intrude to the implementation of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the soln that you posted in the issue looks promising. did you try that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, last time I made a PR with it you said that it is not right to resample by UTC a tz aware series There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah that might be a bigger change, ok (and maybe not what we want semantically) |
||
end=last, | ||
tz=tz, | ||
name=ax.name) | ||
name=ax.name, | ||
values_present=values_present) | ||
|
||
# a little hack | ||
trimmed = False | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2719,6 +2719,12 @@ def test_resample_weekly_bug_1726(self): | |
# it works! | ||
df.resample('W-MON', closed='left', label='left').first() | ||
|
||
def test_resample_tz_aware_bug_15549(self): | ||
index = pd.DatetimeIndex([1450137600000000000, 1474059600000000000], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. move the issue number to a comment (and out of the title) |
||
tz='UTC').tz_convert('America/Chicago') | ||
df = pd.DataFrame([1, 2], index=index) | ||
df.resample('12h', closed='right', label='right').last().ffill() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add an expected and compare using assert_frame_equal |
||
|
||
def test_resample_bms_2752(self): | ||
# GH2753 | ||
foo = Series(index=pd.bdate_range('20000101', '20000201')) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what the heck is all of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, what do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is a change to fix a problem with resampling for tz-aware series in one of the edge cases
I hit this problem often unfortunately and from my understanding of the design I created a solution that doesn't break other functionality but fixes the problem I have
sorry if the solution doesn't look elegant, if you think there is other one that we can use, please let me know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really sure what you are doing. Your changes should be restriced to pandas/core/resample.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you mean that I should not change anything in pandas/core/resample.py then unfortunately I don't know how to avoid that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the opposite. There's where the change should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can implement a solution with changes in
pandas/core/resample.py
only I thinklet me try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think this is a specific change that resample needs, not generically for date_ranges.