Skip to content

Update dplyr functions to use non-standard evaluation #1800

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from
Closed

Update dplyr functions to use non-standard evaluation #1800

wants to merge 10 commits into from

Conversation

julistanley
Copy link
Contributor

Replaced all instances of depreciated standard evaluation dplyr verbs with their NSE equivalents.

Fixes associated depreciation warnings, so closes #1783. For example,

Before:

> p <- ggplot(txhousing, aes(x = date, y = median, group = city)) +
+   geom_line(alpha = 0.3)
> ggplotly(p) %>% filter(city == "Houston")
Warning message:
`filter_()` is deprecated as of dplyr 0.7.0.
Please use `filter()` instead.
See vignette('programming') for more help

After:

> p <- ggplot(txhousing, aes(x = date, y = median, group = city)) +
+   geom_line(alpha = 0.3)
> ggplotly(p) %>% filter(city == "Houston")

This PR introduces a util function called as_quosures_explicit. If/when issue 1009 in r-lang is resolved, just replace all instances of as_quosures_explicit with rlang::as_quosures.

julianstanley and others added 3 commits June 22, 2020 20:10
For some reason, all tests passed without catching this mistake, but I think I should have had a !! here.
@julistanley
Copy link
Contributor Author

julistanley commented Jun 23, 2020

Build 4813.1 and 4813.2 passed (devel and oldrel). 4813.3 failed:

══ Failed ══════════════════════════════════════════════════════════════════���═══
── 1. Error: Complex example works (@test-plotly.R#179)  ───────────────────────
Column `.add` can't be modified because it's a grouping variable
Backtrace:
  1. plotly::plot_ly(., x = ~date, y = ~median)
  1. dplyr::group_by(., city)
  1. plotly::add_lines(., alpha = 0.2, name = "Texan Cities", hoverinfo = "none")
  8. dplyr::group_by(., date)
 11. plotly::group_by_.plotly(...)
 13. dplyr:::group_by.data.frame(d, !!!additional_args, .add = add)
 14. dplyr::group_by_prepare(.data, ..., add = add)
 15. dplyr:::add_computed_columns(.data, new_groups)
 17. dplyr:::mutate.tbl_df(.data, !!!mutate_vars)
 18. dplyr:::mutate_impl(.data, dots, caller_env())

Related to this StackOverflow post, maybe? Edit: or maybe just because I missed an add-->.add :o 580cc76.

Co-authored-by: Carson Sievert <cpsievert1@gmail.com>
As is done here https://github.com/r-lib/rlang/blob/64df8e3f/R/compat-lazyeval.R#L51

Co-authored-by: Carson Sievert <cpsievert1@gmail.com>
@cpsievert
Copy link
Collaborator

It'd also be great to have a few unit tests to make sure these cases are covered, something like:

expect_warning(
  plot_ly(txhousing) %>% 
    group_by_("city") %>% 
    add_lines(x = ~date, y = ~median),
  NA
)

@julistanley
Copy link
Contributor Author

julistanley commented Jun 23, 2020

Weirdly, I am still getting a warning. Thinking it might have been something weird with my system, I switched to a blank RStudio Cloud instance. (also using lifecycle_verbosity = "warning" but wasn't before--which is why I think I didn't see a warning in the first PR comment)

I clone my fork, then switch to the patch-1 branch, and install all dependencies. Then:

> devtools::load_all(".")
Loading plotly
Visual testing is not enabled.
> options(lifecycle_verbosity = "warning") 
> txhousing %>% plot_ly(x = ~date, y = ~median) %>% group_by(city)
No trace type specified:
  Based on info supplied, a 'scatter' trace seems appropriate.
  Read more about this trace type -> https://plot.ly/r/reference/#scatter
No scatter mode specifed:
  Setting the mode to markers
  Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
Warning messages:
1: `group_by_()` is deprecated as of dplyr 0.7.0.
Please use `group_by()` instead.
See vignette('programming') for more help 
2: Ignoring 616 observations 

In the terminal, I don't see any instances of group_by_( in the R directory:

rstudio-user@application-2493362-deployment-6489105-2hz46:/cloud/project/plotly/R$ grep -r "group_by_("
plotly_data.R:    #dat <- dplyr::group_by_(dat, crosstalk_key(), add = TRUE)
rstudio-user@application-2493362-deployment-6489105-2hz46:/cloud/project/plotly/R$

There are a few in tests, but I don't think those should interfere:

rstudio-user@application-2493362-deployment-6489105-2hz46:/cloud/project/plotly/tests$ grep -r "group_by_("
testthat/test-plotly-group.R:  dplyr::group_by_("rowname") %>%
testthat/test-ggplot-errorbar.R:    dplyr::group_by_(mtcars, "cyl"),
testthat/test-plotly-data.R:  d <- plotly_data(group_by_(plot_ly(mtcars), c("vs", "am")))

@cpsievert I'm glad that you suggested writing some tests since they caught this. Do you know what might be happening here?

Edit: Made the instance public, if that helps: https://rstudio.cloud/project/1409602

@julistanley
Copy link
Contributor Author

Oh cool, here's a much more minimal and illustrative example:

> txhousing %>% group_by(city) %>% print(n=3)
# A tibble: 8,602 x 9
# Groups:   city [46]
  city     year month sales  volume median listings inventory  date
  <chr>   <int> <int> <dbl>   <dbl>  <dbl>    <dbl>     <dbl> <dbl>
1 Abilene  2000     1    72 5380000  71400      701       6.3 2000 
2 Abilene  2000     2    98 6505000  58700      746       6.6 2000.
3 Abilene  2000     3   130 9285000  58100      784       6.8 2000.
# … with 8,599 more rows

> txhousing %>% plot_ly() %>% group_by(city) %>% print(n=3)
Error: (converted from warning) `group_by_()` is deprecated as of dplyr 0.7.0.
Please use `group_by()` instead.
See vignette('programming') for more help

> txhousing %>% plot_ly() %>% plotly_data() %>% group_by(city) %>% print(n=3)
# A tibble: 8,602 x 9
# Groups:   city [46]
  city     year month sales  volume median listings inventory  date
  <chr>   <int> <int> <dbl>   <dbl>  <dbl>    <dbl>     <dbl> <dbl>
1 Abilene  2000     1    72 5380000  71400      701       6.3 2000 
2 Abilene  2000     2    98 6505000  58700      746       6.6 2000.
3 Abilene  2000     3   130 9285000  58100      784       6.8 2000.
# … with 8,599 more rows

What happens to a plot_ly object when it's passed to dplyr::group_by? How is it different than passing to plotly_data() and then to dplyr::group_by()?

Related to dplyr #5083, I think? tidyverse/dplyr#5083

@cpsievert
Copy link
Collaborator

cpsievert commented Jul 16, 2020

The difference in txhousing %>% plot_ly() %>% group_by(city) and txhousing %>% plot_ly() %>% plotly_data() %>% group_by(city) is that the former winds up calling group_by_.plotly() (since the input's class is plotly).

@cpsievert
Copy link
Collaborator

Ah, I see why txhousing %>% plot_ly() %>% group_by(city) still throws a warning. It's because group_by(plot_ly()) currently winds up calling group_by_.plotly(), and by that time, the warning has been thrown in the generic group_by_()

> library(dplyr)
> methods(group_by)
[1] group_by.data.frame* group_by.default*   
see '?methods' for accessing help and source code

> getS3method("group_by", "default")
function (.data, ..., add = FALSE, .drop = group_by_drop_default(.data)) 
{
    group_by_(.data, .dots = compat_as_lazy_dots(...), add = add)
}

> group_by_
function (.data, ..., .dots = list(), add = FALSE) 
{
    lazy_deprec("group_by")
    UseMethod("group_by_")
}

This means we'll have to define methods for the non-underscore generic functions found in R/plotly_data.R (i.e., we should be defining/registering group_by.plotly()). Do you feel up to that task @julianstanley?

@julistanley
Copy link
Contributor Author

julistanley commented Jul 16, 2020

Thanks for following-up on this @cpsievert!

I think that makes sense. If I understand correctly, group_by (generally, [dplyr-verb]) methods are defined on data.frame ([dplyr-verb].data.frame ), but default otherwise ([dplyr-verb].default). So Plotly objects trigger [dplyr-verb].default, leading to a warning, but they'll trigger [dplyr-verb].plotly if we define that. So we just need to define each NSE dplyr method for plotly objects ([dplyr-verb].plotly).

Sure thing, this sounds simple enough (famous last words?). I'll give this a shot!

@cpsievert
Copy link
Collaborator

Thanks for jump starting this effort @julianstanley, I'll take it over the finish line in #1821

@cpsievert cpsievert closed this Jul 21, 2020
@julistanley julistanley deleted the patch-1 branch July 21, 2020 23:12
@julistanley
Copy link
Contributor Author

julistanley commented Jul 21, 2020

Sounds good, thanks, @cpsievert!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warning: arrange_() is deprecated as of dplyr 0.7.0. Please use arrange() instead.
2 participants