Skip to content

PERF: Remove unnecessary asof join functions #46943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 6, 2022

Conversation

wlach
Copy link
Contributor

@wlach wlach commented May 4, 2022

Several functions in the join cython are module are basically just calling others, and can probably be removed. This wouldn't
be a big deal inside python, but in this case it cuts down considerably on the amount of generated cython code.

The tests pass for me locally when I do this and there seems to be significant space savings.

@pep8speaks
Copy link

pep8speaks commented May 4, 2022

Hello @wlach! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-05-05 11:37:26 UTC

@wlach wlach force-pushed the consolidate-asof-functions branch from 5c27aa2 to 181f9e4 Compare May 4, 2022 20:12
@wlach wlach changed the title consolidate asof functions PERF: Remove unnecessary(?) asof join functions May 4, 2022
@jbrockmendel
Copy link
Member

significant space savings

cool, how much?

Several functions in the join cython are module are basically
just calling others, and can probably be removed. This wouldn't
be a big deal inside python, but in this case it cuts down
considerably on the number of generated cython code.

The tests pass for me locally when I do this and there seems
to be significant space savings. Will provide more detail after
I see CI passing (want to make sure I'm not missing something
first)
@wlach wlach force-pushed the consolidate-asof-functions branch from 181f9e4 to 34d69ff Compare May 4, 2022 20:16
@wlach
Copy link
Contributor Author

wlach commented May 4, 2022

significant space savings

cool, how much?

I think I was seeing about a megabyte on join.pyx, although I think there might be the potential for further gains.

Unfortunately I wasn't running the tests properly so I didn't notice that my PR didn't actually work. 😦 Need to dig in a bit further.

@wlach
Copy link
Contributor Author

wlach commented May 5, 2022

Ok, think I got this working now. It's not as big an improvement as I thought (about 100k or ~4% off of the generated .so for join, not sure why I got it wrong) but I don't think it impacts readability (arguably it helps) so it seems worth doing.

Before

After

  • 2508880 bytes
  • nm: 2873 symbols

@wlach
Copy link
Contributor Author

wlach commented May 5, 2022

Ok, I think this is ready for a look!

docstring and typing information is giving an error but I don't see how it could be related to the code changes here (a very similar pull request from last night was passing):

Error: /home/runner/work/pandas/pandas/pandas/core/generic.py:2529:PR03:pandas.DataFrame.to_hdf:Wrong parameters order. Actual: ('path_or_buf', 'key', 'mode', 'complevel', 'complib', 'append', 'format', 'index', 'min_itemsize', 'nan_rep', 'dropna', 'data_columns', 'errors', 'encoding'). Documented: ('path_or_buf', 'key', 'mode', 'complevel', 'complib', 'append', 'format', 'index', 'errors', 'encoding', 'min_itemsize', 'nan_rep', 'dropna', 'data_columns')

@wlach wlach marked this pull request as ready for review May 5, 2022 12:54
@jreback
Copy link
Contributor

jreback commented May 5, 2022

Ok, I think this is ready for a look!

docstring and typing information is giving an error but I don't see how it could be related to the code changes here (a very similar pull request from last night was passing):

Error: /home/runner/work/pandas/pandas/pandas/core/generic.py:2529:PR03:pandas.DataFrame.to_hdf:Wrong parameters order. Actual: ('path_or_buf', 'key', 'mode', 'complevel', 'complib', 'append', 'format', 'index', 'min_itemsize', 'nan_rep', 'dropna', 'data_columns', 'errors', 'encoding'). Documented: ('path_or_buf', 'key', 'mode', 'complevel', 'complib', 'append', 'format', 'index', 'errors', 'encoding', 'min_itemsize', 'nan_rep', 'dropna', 'data_columns')

this is unrelated

@jreback jreback added Performance Memory or execution speed performance Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels May 6, 2022
@jreback jreback added this to the 1.5 milestone May 6, 2022
@jreback jreback merged commit b19c203 into pandas-dev:main May 6, 2022
@jreback
Copy link
Contributor

jreback commented May 6, 2022

thanks @wlach

@wlach wlach deleted the consolidate-asof-functions branch May 6, 2022 21:37
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
@wlach wlach changed the title PERF: Remove unnecessary(?) asof join functions PERF: Remove unnecessary asof join functions Sep 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants