Skip to content

Dropna argument is now respected when false in pivot_table #25738

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 12, 2019

Conversation

alexander-ponomaroff
Copy link
Contributor

@codecov
Copy link

codecov bot commented Mar 15, 2019

Codecov Report

Merging #25738 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25738      +/-   ##
==========================================
- Coverage   91.25%   91.24%   -0.01%     
==========================================
  Files         172      172              
  Lines       52973    52973              
==========================================
- Hits        48338    48337       -1     
- Misses       4635     4636       +1
Flag Coverage Δ
#multiple 89.82% <ø> (ø) ⬆️
#single 41.75% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/reshape/pivot.py 96.55% <ø> (ø) ⬆️
pandas/util/testing.py 88.98% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9eec9b8...1e9ec27. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 15, 2019

Codecov Report

Merging #25738 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25738      +/-   ##
==========================================
- Coverage   91.89%   91.89%   -0.01%     
==========================================
  Files         175      175              
  Lines       52509    52509              
==========================================
- Hits        48255    48252       -3     
- Misses       4254     4257       +3
Flag Coverage Δ
#multiple 90.45% <ø> (ø) ⬆️
#single 40.74% <ø> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/core/reshape/pivot.py 96.54% <ø> (ø) ⬆️
pandas/io/gbq.py 75% <0%> (-12.5%) ⬇️
pandas/core/frame.py 96.79% <0%> (-0.12%) ⬇️
pandas/util/testing.py 90.72% <0%> (+0.1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 90a85e0...dab0395. Read the comment docs.

@gfyoung gfyoung added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Mar 15, 2019
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify the meaning of dropna? From the docstring:

        dropna : boolean, default True
            Do not include columns whose entries are all NaN

It's not clear (to me) what "columns" is referring to. Input columns? Columns after some internal reshaping? Columns at the very end?

@pep8speaks
Copy link

pep8speaks commented Mar 18, 2019

Hello @alexander-ponomaroff! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-04-12 17:46:52 UTC

@alexander-ponomaroff
Copy link
Contributor Author

Can we clarify the meaning of dropna? From the docstring:

        dropna : boolean, default True
            Do not include columns whose entries are all NaN

It's not clear (to me) what "columns" is referring to. Input columns? Columns after some internal reshaping? Columns at the very end?

The resulting pivot table will not contain any columns with NaN values if the dropna argument is True. I believe would be the answer. I can look into it in more detail if this answer doesn't clarify it or is not specific enough.

@alexander-ponomaroff
Copy link
Contributor Author

@jreback @TomAugspurger Do you have any other requests and/or feedback for this PR?

@jreback
Copy link
Contributor

jreback commented Apr 5, 2019

can you merge master and update

@alexander-ponomaroff
Copy link
Contributor Author

@jreback @WillAyd Fixed requested changes, all green now.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm @jreback

@WillAyd WillAyd added this to the 0.25.0 milestone Apr 6, 2019
True, False
])
def test_pivot_table_aggfunc_dropna(self, dropna):
# GH 22159
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a sufficient test when aggfunc is just a scalar? if not can you add

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I'm not sure what you mean by this. How can aggfunc be a scalar? Could you please elaborate on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexander-ponomaroff for example when aggfunc=np.sum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd Thank you, it doesn't seem like there is a test with dropna and scalar aggfunc together. So I will add one right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a second test, where aggfunc=np.mean. Please take a look and let me know if it's sufficient.

@jreback jreback merged commit 5a15069 into pandas-dev:master Apr 12, 2019
@jreback
Copy link
Contributor

jreback commented Apr 12, 2019

thanks @alexander-ponomaroff

yhaque1213 pushed a commit to yhaque1213/pandas that referenced this pull request Apr 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pivot_table drops columns for aggregate functions that return all None, even if dropna=False
6 participants