DEPR: Change numeric_only to False in various groupby ops #49892

Merged

rhshadrach
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@rhshadrach rhshadrach added the Groupby, Deprecate (Functionality to remove in pandas), and Nuisance Columns (Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply) labels Nov 24, 2022
Comment on lines -242 to -245
warn = FutureWarning if func == "std" else None
msg = "The default value of numeric_only"
with tm.assert_produces_warning(warn, match=msg):
    result = df.groupby(level=1, axis=1).agg(func)
Member Author

In 1.5.x we are currently warning in a few erroneous cases with axis=1. Is this worth fixing? I think the PR would have to go straight into 1.5.x because of the enforced deprecations.

Member

I would say that if we haven't gotten an issue about this yet, it's okay not to fix it for now.

@@ -572,7 +572,7 @@ Removal of prior version deprecations/changes
- Changed default of ``numeric_only`` to ``False`` in all DataFrame methods with that argument (:issue:`46096`, :issue:`46906`)
- Changed default of ``numeric_only`` to ``False`` in :meth:`Series.rank` (:issue:`47561`)
- Enforced deprecation of silently dropping nuisance columns in groupby and resample operations when ``numeric_only=False`` (:issue:`41475`)
- Changed default of ``numeric_only`` to ``False`` in :meth:`.DataFrameGroupBy.sum` and :meth:`.DataFrameGroupBy.mean` (:issue:`46072`)
- Changed default of ``numeric_only`` to ``False`` in various :class:`.DataFrameGroupBy` methods (:issue:`46072`)
Member Author

@rhshadrach rhshadrach Nov 24, 2022

There are still some ops left on which to enforce numeric_only=False, mostly those that use _op_via_apply. Will be changing this to

The default of ``numeric_only`` is now ``False`` across all :class:`.DataFrameGroupBy` methods

once all ops are done.
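
For readers following the whatsnew entries, here is a minimal sketch of what the changed default means in practice. The frame and column names are made up for illustration, and the behavior noted for `numeric_only=False` is an assumption based on the enforced deprecation described in this PR (older releases may warn and drop the column instead of raising).

```python
import pandas as pd

# Hypothetical example data: one numeric column and one object-dtype
# "nuisance" column that cannot be averaged.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b"],
        "value": [1.0, 3.0, 5.0],
        "label": pd.Series(["x", "y", "z"], dtype=object),
    }
)

# Passing numeric_only explicitly does not depend on the default:
# only the numeric column is aggregated.
numeric_means = df.groupby("key").mean(numeric_only=True)
print(numeric_means)

# With numeric_only=False the object column is part of the operation.
# Assumption: once the deprecation is enforced this raises TypeError
# rather than silently dropping the column.
try:
    df.groupby("key").mean(numeric_only=False)
except TypeError as err:
    print(f"TypeError: {err}")
```

Code that relied on the old behavior can keep it by passing `numeric_only=True` or by selecting the numeric columns before aggregating.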

@rhshadrach rhshadrach added this to the 2.0 milestone Nov 24, 2022
@jbrockmendel
Member

does this finish off the nuisance column stuff?

@rhshadrach
Member Author

@jbrockmendel

does this finish off the nuisance column stuff?

No - we still have some groupby ops (those that go through _op_via_apply) and then resampler / window.

…rce_numeric_only_groupby_false

# Conflicts:
#	doc/source/whatsnew/v2.0.0.rst
#	pandas/tests/resample/test_resample_api.py
Member

@MarcoGorelli MarcoGorelli left a comment

Looks fine to me

      raise TypeError(
          f"{type(self).__name__}.sem called with "
          f"numeric_only={numeric_only} and dtype {self.obj.dtype}"
      )
- result = self.std(ddof=ddof, numeric_only=numeric_only_bool)
+ result = self.std(ddof=ddof, numeric_only=numeric_only)
  self._maybe_warn_numeric_only_depr("sem", result, numeric_only)
Member

is self._maybe_warn_numeric_only_depr still needed?

Member Author

Maybe not, but I have another branch built on this that removes it (and should be the final PR for nuisance deprecations).
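
As background for the `sem` hunk under discussion, the following is a rough, standard-library-only sketch of the pattern it shows: reject unsupported input with a `TypeError`, then derive the standard error from the standard deviation. It is not the pandas implementation. `sem_checked` is a hypothetical helper, and the numeric check is an assumption, since the hunk does not show the condition that guards the `raise`.

```python
import math
from numbers import Real


def sem_checked(values, ddof=1):
    """Standard error of the mean: std(ddof) / sqrt(n).

    Hypothetical helper for illustration only.
    """
    values = list(values)
    # Assumed guard: fail loudly on non-numeric input instead of
    # silently dropping it (bool is excluded even though it is a Real).
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        raise TypeError("sem_checked called with non-numeric values")
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof observations")
    mean = sum(values) / n
    # Sample variance with the requested delta degrees of freedom.
    variance = sum((v - mean) ** 2 for v in values) / (n - ddof)
    return math.sqrt(variance) / math.sqrt(n)


print(sem_checked([1.0, 3.0]))
```

Raising up front keeps the failure close to the caller's mistake, which is the same motivation as enforcing the nuisance-column deprecation.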

@jbrockmendel
Member

small question, otherwise lgtm

@mroeschke mroeschke merged commit dd13032 into pandas-dev:main Nov 28, 2022
@mroeschke
Member

Thanks @rhshadrach

@rhshadrach rhshadrach deleted the enforce_numeric_only_groupby_false branch November 28, 2022 23:47
4 participants