This function formats the `age_min` and `age_max` columns into a new column, by default called `age_group`, containing a formatted age group label derived from the age boundaries.
format_age_groups(data, age_min = age_min, age_max = age_max, into = "age_group", missing_age_group = "Unknown", skip_overlap_check = FALSE)
Argument | Description |
---|---|
data | A data frame containing the columns indicated by |
age_min | The minimum age (inclusive) of the age group. May be |
age_max | The maximum age (inclusive) of the age group. May be |
into | The column name as a character string where the age group labels will be stored in the input data frame. |
missing_age_group | The value for any age group with a missing boundary. |
skip_overlap_check | If |
A data frame with an additional column containing the age group labels, named according to `into`.
Ages are formatted according to the following rules.
Ages range from `0` to `Inf`. Any age below `0` will be set to `0`.
Setting the lower age boundary to `-Inf` is a reasonable way of denoting all ages under `age_max`, but will result in an age group starting with `0`.
Age groups with an upper boundary of `Inf` will become `"{age_min}+"`. For example, if `age_min` is 85, then the age group label is `85+`. Use `Inf` to denote all ages greater than or equal to `age_min`.
Single age groups are allowed, but there cannot otherwise be overlap between the age boundaries.
Age boundaries are whole integers. Any fractional age is rounded down to the nearest integer (for example, 19.9 becomes 19).
A missing age boundary on either side results in an age group label of `missing_age_group`, which is `"Unknown"` by default.
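The rules above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: `label_age_group()` is a hypothetical helper, and the source does not say how a single-age group is labelled, so the sketch simply keeps the `min - max` form for it.

```r
# Hypothetical helper (not part of the package): a base R sketch of the
# labelling rules described above.
label_age_group <- function(age_min, age_max, missing_age_group = "Unknown") {
  # Fractional ages are rounded down; floor() leaves Inf, -Inf and NA as-is.
  lo <- floor(age_min)
  hi <- floor(age_max)

  # Ages below 0 (including -Inf) are set to 0; NA stays NA.
  lo <- pmax(lo, 0)
  hi <- pmax(hi, 0)

  # An upper boundary of Inf gives "{age_min}+", otherwise "min - max".
  # Assumption: single-age groups keep the "min - max" form.
  label <- ifelse(is.infinite(hi), paste0(lo, "+"), paste(lo, "-", hi))

  # A missing boundary on either side yields the missing-group label.
  label[is.na(age_min) | is.na(age_max)] <- missing_age_group
  label
}

# Returns c("0 - 19", "20 - 84", "85+")
label_age_group(c(-Inf, 20, 85), c(19, 84, Inf))
```

The overlap check controlled by `skip_overlap_check` is not covered by this sketch.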
Other age processors: `complete_age_groups`, `filter_age_groups`, `recode_age_groups`, `separate_age_groups`, `standardize_age_groups`
d_age_group <- dplyr::tibble(
  age_min = c(-Inf, 20, 85),
  age_max = c(19, 84, Inf)
)

format_age_groups(d_age_group)
#> # A tibble: 3 x 3
#>   age_min age_max age_group
#>     <dbl>   <dbl> <chr>
#> 1    -Inf      19 0 - 19
#> 2      20      84 20 - 84
#> 3      85     Inf 85+

# format_age_groups() is the inverse of separate_age_groups()
d_age_group %>%
  format_age_groups() %>%
  dplyr::select(age_group) %>%
  separate_age_groups()
#> # A tibble: 3 x 3
#>   age_group age_min age_max
#>   <chr>       <dbl>   <dbl>
#> 1 0 - 19          0      19
#> 2 20 - 84        20      84
#> 3 85+            85     Inf
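To make the inverse relationship concrete, the following base R sketch turns labels back into boundaries. It is an assumption-laden illustration rather than how `separate_age_groups()` works: `parse_age_label()` is a hypothetical helper, and it only handles the `min - max` and `min+` label forms that appear in the example output, returning `NA` boundaries for anything else (such as `"Unknown"`).

```r
# Hypothetical helper (not part of the package): recover age boundaries
# from labels of the form "min - max" or "min+".
parse_age_label <- function(label) {
  # Labels ending in "+" are open-ended, i.e. their upper boundary is Inf.
  open_ended <- grepl("\\+$", label)

  # Extract every run of digits from each label.
  nums <- regmatches(label, gregexpr("[0-9]+", label))

  # First number is the lower boundary, last number the upper boundary;
  # labels without any digits get NA for both.
  bounds <- function(x) {
    if (length(x) == 0) {
      c(NA_real_, NA_real_)
    } else {
      as.numeric(c(x[1], x[length(x)]))
    }
  }
  b <- vapply(nums, bounds, numeric(2))

  age_min <- b[1, ]
  age_max <- b[2, ]
  age_max[open_ended] <- Inf

  data.frame(
    age_group = label,
    age_min = age_min,
    age_max = age_max,
    stringsAsFactors = FALSE
  )
}

parsed <- parse_age_label(c("0 - 19", "20 - 84", "85+"))
```

Here `parsed$age_min` is `c(0, 20, 85)` and `parsed$age_max` is `c(19, 84, Inf)`, matching the boundaries in the round trip above. As in that output, the original `-Inf` lower boundary comes back as `0`.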