Some dplyr functions, such as dplyr::summarize(), remove groups from their output. When the output of a function should keep as many of the original groups as are still available after the function is applied, with_retain_groups() applies the function to a grouped data frame and restores the original grouping as far as possible.
with_retain_groups(.data, .f, ...)
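The behaviour described above can be pictured with a small sketch. This is not the package's source: `retain_groups_sketch()` is a hypothetical stand-in, and the use of `rlang::as_function()` to accept formulas is an assumption. It records the grouping columns, applies the function, and regroups by whichever of those columns survive in the output.

```r
library(dplyr)

# Hypothetical illustration only; the real with_retain_groups() may differ.
retain_groups_sketch <- function(.data, .f, ...) {
  # Assumption: formulas such as ~ summarize(., ...) are converted via rlang
  .f <- rlang::as_function(.f)
  # Grouping columns before the inner function runs
  original <- dplyr::group_vars(.data)
  out <- .f(.data, ...)
  # Only groups whose columns still exist can be restored
  kept <- intersect(original, names(out))
  dropped <- setdiff(original, kept)
  if (length(dropped) > 0) {
    warning(
      "groups were implicitly dropped: ",
      paste0("`", dropped, "`", collapse = ", ")
    )
  }
  dplyr::group_by(dplyr::ungroup(out), !!!rlang::syms(kept))
}

df <- data.frame(g = c("a", "a", "b"), h = c(1, 2, 1), x = c(1, 2, 3))

# summarize() is told to drop all groups; the sketch puts them back
res <- df %>%
  group_by(g, h) %>%
  retain_groups_sketch(~ summarize(., x = sum(x), .groups = "drop"))

group_vars(res)
```

Regrouping by the intersection of names is the simplest rule consistent with the description; it does not try to detect a grouping column that was overwritten with different values.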
.data | A grouped tbl, tibble, or data.frame |
---|---|
.f | A function, formula, or vector (not necessarily atomic). If a function, it is used as is. If a formula, e.g. ~ dplyr::summarize(., cases = sum(cases)), it is converted to a function. This syntax allows you to create very compact anonymous functions. If character vector, numeric vector, or list, it is converted to an extractor function. Character vectors index by name and numeric vectors index by position; use a list to index by position and name at different levels. If a component is not present, the value of |
... | Additional arguments passed on to methods. |
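The description of `.f` follows the conventions of `purrr::as_mapper()`. Assuming that is the conversion in use (the page does not say so), the three accepted forms behave roughly as in this sketch; the names `add_two`, `by_name`, and `by_pos` are made up for illustration.

```r
library(purrr)

# Assumption: `.f` is converted with purrr::as_mapper(); not confirmed by the source.
add_two <- as_mapper(~ .x + 2)  # formula becomes an anonymous function
by_name <- as_mapper("a")       # character vector becomes an extractor by name
by_pos  <- as_mapper(2)         # numeric vector becomes an extractor by position

add_two(1)
by_name(list(a = 10, b = 20))
by_pos(list(a = 10, b = 20))
```

For a grouped data frame, the function and formula forms are the ones used in the examples on this page.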
Other Group Utilities: group_drop, with_ungroup
    # with_retain_groups() applies inner function to grouped data frame
    # and restores grouping on output
    tidyr::table1 %>%
      dplyr::group_by(country, year) %>%
      with_retain_groups(~ dplyr::summarize(., cases = sum(cases)))
    #> # A tibble: 6 x 3
    #> # Groups:   country, year [6]
    #>   country      year  cases
    #>   <chr>       <int>  <int>
    #> 1 Afghanistan  1999    745
    #> 2 Afghanistan  2000   2666
    #> 3 Brazil       1999  37737
    #> 4 Brazil       2000  80488
    #> 5 China        1999 212258
    #> 6 China        2000 213766

    # Groups that "disappear" are implicitly dropped, with a warning
    tidyr::table1 %>%
      dplyr::group_by(country, year) %>%
      with_retain_groups(~ {
        dplyr::summarize(., r = cases / population) %>%
          dplyr::summarize(r = mean(r))
      })
    #> Warning: groups were implicitly dropped: `year`
    #> # A tibble: 3 x 2
    #> # Groups:   country [3]
    #>   country             r
    #>   <chr>           <dbl>
    #> 1 Afghanistan 0.0000834
    #> 2 Brazil      0.000340
    #> 3 China       0.000167

    # Works like "normal" if no groupings are present
    tidyr::table1 %>%
      with_retain_groups(~ dplyr::mutate(., r = cases / population))
    #> # A tibble: 6 x 5
    #>   country      year  cases population         r
    #>   <chr>       <int>  <int>      <int>     <dbl>
    #> 1 Afghanistan  1999    745   19987071 0.0000373
    #> 2 Afghanistan  2000   2666   20595360 0.000129
    #> 3 Brazil       1999  37737  172006362 0.000219
    #> 4 Brazil       2000  80488  174504898 0.000461
    #> 5 China        1999 212258 1272915272 0.000167
    #> 6 China        2000 213766 1280428583 0.000167