Some dplyr functions, such as dplyr::summarize(), remove groups from their output. When the output of a function should keep as many of the original groups as are still available after the function is applied, use with_retain_groups(). It applies a function to a grouped data frame and restores the original groups as far as possible.
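The behaviour described above can be approximated in a few lines. This is only a sketch under stated assumptions, not the package's implementation: `retain_groups_sketch()` is a hypothetical name, and the use of `purrr::as_mapper()` and `rlang::syms()` is an assumption made for illustration.

```r
library(dplyr)

# Hypothetical re-implementation for illustration only; the real
# with_retain_groups() may be written differently.
retain_groups_sketch <- function(.data, .f, ...) {
  f <- purrr::as_mapper(.f)                 # accept a function or a formula
  original <- dplyr::group_vars(.data)      # grouping columns before the call
  out <- dplyr::ungroup(f(.data, ...))      # apply the function, clear grouping
  kept <- intersect(original, names(out))   # groups that survive as columns
  dropped <- setdiff(original, kept)
  if (length(dropped) > 0) {
    warning("groups were implicitly dropped: ", paste(dropped, collapse = ", "))
  }
  # Regroup by the surviving columns; with none left, the result is ungrouped
  dplyr::group_by(out, !!!rlang::syms(kept))
}

# Small made-up data frame, used only to exercise the sketch
df <- data.frame(g = c("a", "a", "b"), h = c(1, 2, 1), x = c(1, 2, 3))

res <- df %>%
  group_by(g, h) %>%
  retain_groups_sketch(~ summarise(.x, x = sum(x), .groups = "drop"))

group_vars(res)
```

Here summarise() is asked to drop all grouping, yet the result is grouped by `g` and `h` again because both columns are still present in the output.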

with_retain_groups(.data, .f, ...)

Arguments

.data

A grouped tbl, tibble, or data.frame

.f

A function, formula, or vector (not necessarily atomic).

If a function, it is used as is.

If a formula, e.g. ~ .x + 2, it is converted to a function. There are three ways to refer to the arguments:

  • For a single argument function, use .

  • For a two argument function, use .x and .y

  • For more arguments, use ..1, ..2, ..3 etc

This syntax allows you to create very compact anonymous functions.

If a character vector, numeric vector, or list, it is converted to an extractor function. Character vectors index by name and numeric vectors index by position; use a list to index by position and name at different levels. If a component is not present, the value of .default will be returned.
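The conversion rules for .f match those of purrr::as_mapper(), so they can be tried directly with that function. Whether with_retain_groups() calls as_mapper() internally is an assumption; the variable names below are hypothetical.

```r
library(purrr)

# Formula: converted to an anonymous function; .x (or .) is the first argument
add_two <- as_mapper(~ .x + 2)
add_two(1)

# Character vector: converted to an extractor that indexes by name
get_name <- as_mapper("name", .default = NA)
get_name(list(name = "x"))
get_name(list(other = 1))   # component not present, so .default is returned

# List: index by position first, then by name one level down
second_level <- as_mapper(list(1, "b"))
second_level(list(list(a = 1, b = 2)))
```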

...

Additional arguments passed on to methods.

See also

Other Group Utilities: group_drop, with_ungroup

Examples

# with_retain_groups() applies inner function to grouped data frame
# and restores grouping on output
tidyr::table1 %>%
  dplyr::group_by(country, year) %>%
  with_retain_groups(~ dplyr::summarize(., cases = sum(cases)))
#> `summarise()` regrouping output by 'country' (override with `.groups` argument)
#> # A tibble: 6 x 3
#> # Groups:   country, year [6]
#>   country      year  cases
#>   <chr>       <int>  <int>
#> 1 Afghanistan  1999    745
#> 2 Afghanistan  2000   2666
#> 3 Brazil       1999  37737
#> 4 Brazil       2000  80488
#> 5 China        1999 212258
#> 6 China        2000 213766

# Groups that "disappear" are implicitly dropped, with a warning
tidyr::table1 %>%
  dplyr::group_by(country, year) %>%
  with_retain_groups(~ {
    dplyr::summarize(., r = cases / population) %>%
      dplyr::summarize(r = mean(r))
  })
#> `summarise()` regrouping output by 'country' (override with `.groups` argument)
#> `summarise()` ungrouping output (override with `.groups` argument)
#> Warning: groups were implicitly dropped: `year`
#> # A tibble: 3 x 2
#> # Groups:   country [3]
#>   country             r
#>   <chr>           <dbl>
#> 1 Afghanistan 0.0000834
#> 2 Brazil      0.000340
#> 3 China       0.000167

# Works like "normal" if no groupings are present
tidyr::table1 %>%
  with_retain_groups(~ dplyr::mutate(., r = cases / population))
#> # A tibble: 6 x 5
#>   country      year  cases population         r
#>   <chr>       <int>  <int>      <int>     <dbl>
#> 1 Afghanistan  1999    745   19987071 0.0000373
#> 2 Afghanistan  2000   2666   20595360 0.000129
#> 3 Brazil       1999  37737  172006362 0.000219
#> 4 Brazil       2000  80488  174504898 0.000461
#> 5 China        1999 212258 1272915272 0.000167
#> 6 China        2000 213766 1280428583 0.000167