Skip to content

use liboffsets to_offset function #21693

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 2, 2018
Merged

Conversation

jbrockmendel
Copy link
Member

No description provided.

@codecov
Copy link

codecov bot commented Jul 1, 2018

Codecov Report

Merging #21693 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21693   +/-   ##
=======================================
  Coverage    91.9%    91.9%           
=======================================
  Files         154      154           
  Lines       49657    49657           
=======================================
  Hits        45638    45638           
  Misses       4019     4019
Flag Coverage Δ
#multiple 90.28% <ø> (ø) ⬆️
#single 41.9% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc45fba...b2fa7cb. Read the comment docs.

@jreback jreback added Frequency DateOffsets Clean labels Jul 2, 2018
@jreback jreback added this to the 0.24.0 milestone Jul 2, 2018
@jreback jreback merged commit d105ce2 into pandas-dev:master Jul 2, 2018
@jreback
Copy link
Contributor

jreback commented Jul 2, 2018

thanks!

@jbrockmendel jbrockmendel deleted the to_offset branch July 2, 2018 23:37
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
@qwhelan
Copy link
Contributor

qwhelan commented Nov 7, 2018

While investigating general Period slowness, an asv find bisection identified this PR as introducing a ~56x slowdown on Windows 10:

asv find v0.23.4..24ab22f75a groupby.Datelike.time_sum
[...]
·· For pandas commit dc45fbaf <master~690>:
·· Uninstalling from conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building dc45fbaf <master~690> for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Installing dc45fbaf <master~690> into conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
··· Running (groupby.Datelike.time_sum--).
··· groupby.Datelike.time_sum                                                                                                                                                   ok
··· =============== ==========
        grouper
    --------------- ----------
      period_range   46.9±8ms
       date_range    3.12±0ms
     date_range_tz   3.91±1ms
    =============== ==========

· Testing ------------O-----------------------------------------------
·· For pandas commit d105ce26 <master~689>:
·· Uninstalling from conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building d105ce26 <master~689> for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Installing d105ce26 <master~689> into conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
··· Running (groupby.Datelike.time_sum--).
··· groupby.Datelike.time_sum                                                                                                                                                   ok
··· =============== ===========
        grouper
    --------------- -----------
      period_range   2.63±0.3s
       date_range     3.12±0ms
     date_range_tz    3.12±0ms
    =============== ===========

· Greatest regression found: d105ce26 <master~689>

My suspicion is the moving of from pandas.tseries.frequencies import to_offset from period.pyx module scope into offset.pyx in function scope is leading to this slowdown on Windows.

@jbrockmendel
Copy link
Member Author

Thanks for tracking this down. A couple questions: is there anything weird about your Windows environment, e.g. running on top of cygwin or something? Or more generally since I don't use Windows, any other reason why your system might be special? Long-shot since you're already doing a lot here: can you reproduce these results on non-Windows?

If you re-run the same asv a few times, are the results pretty consistent? Often asv results can have a lot of noise.

@qwhelan
Copy link
Contributor

qwhelan commented Nov 7, 2018

  • Nothing unusual from a Windows perspective: relatively clean Windows 10 install with benchmarks run under PowerShell. Only standard antivirus, with benchmarks excluded from AV checks.
  • None that come to mind, actually did a fresh install a few months back due to upgrading the SSD and strictly used this machine for remote desktop until recently (so very little installed on it).
  • Actually, it also looks like the trend is visible on my Ubuntu 18.04 machine, so seems like it might be a cross-platform issue. Haven't run a bisection yet, but could this evening.

And yes, sadly consistent.

@@ -538,7 +538,8 @@ def pxd(name):
'_libs/tslibs/ccalendar',
'_libs/tslibs/timedeltas',
'_libs/tslibs/timezones',
'_libs/tslibs/nattype'],
'_libs/tslibs/nattype',
'_libs/tslibs/offsets'],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I thought we had made the "pxdfiles" obsolete when moving to cythonize. Am I remembering wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, thought I was commenting on an open PR, not an old one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants