This function formats the age_min and age_max columns into a new column, by default called age_group, containing the age group label denoted by the age boundaries.

format_age_groups(data, age_min = age_min, age_max = age_max,
  into = "age_group", missing_age_group = "Unknown",
  skip_overlap_check = FALSE)

Arguments

data

A data frame containing the columns indicated by age_min and age_max.

age_min

The minimum age (inclusive) of the age group. May be -Inf or 0.

age_max

The maximum age (inclusive) of the age group. May be Inf.

into

The name, as a character string, of the column in the input data frame where the age group labels will be stored.

missing_age_group

The value for any age group with a missing boundary.

skip_overlap_check

If TRUE, skips the check that ensures age group upper boundaries do not match age group lower boundaries for any age group other than those that are singular ages.
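
One way to picture the check that skip_overlap_check disables is the base R sketch below. The helper name has_boundary_overlap() is hypothetical, and the logic is an assumed reading of the description above rather than the package's own code: it flags any upper boundary that also appears as a lower boundary, ignoring groups that are singular ages.

```r
# Hypothetical helper; an assumed reading of the check, not the package's code.
has_boundary_overlap <- function(age_min, age_max) {
  # Singular-age groups (age_min equals age_max) are exempt from the check.
  singular <- (!is.na(age_min)) & (!is.na(age_max)) & (age_min == age_max)
  lower <- age_min[!singular]
  upper <- age_max[!singular]
  # Overlap: an upper boundary that is also used as a lower boundary.
  any((upper %in% lower[!is.na(lower)]) & (!is.na(upper)))
}

has_boundary_overlap(c(0, 20, 40), c(20, 40, Inf))  # TRUE: 20 and 40 are shared
has_boundary_overlap(c(0, 20, 85), c(19, 84, Inf))  # FALSE: boundaries are disjoint
has_boundary_overlap(c(0, 20, 21), c(19, 20, 84))   # FALSE: 20 is a singular age
```

Under this reading, groups such as 0 - 20 and 20 - 40 would be rejected because age 20 falls in both, while 0 - 19 and 20 - 84 pass.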

Value

A data frame with an additional column containing the age group labels, named according to into.

Age Formatting Guidelines

Ages are formatted according to the following rules.

  1. Ages range from 0 to Inf. Any age below 0 will be set to 0. Setting the age_min boundary to -Inf is a reasonable way of denoting all ages under age_max, but will result in age groups starting with 0.

  2. Age groups with an upper boundary of Inf will become "{age_min}+". For example, if age_min is 85, then the age group label is 85+. Use Inf to denote all ages greater than or equal to age_min.

  3. Single-age groups (where age_min equals age_max) are allowed, but otherwise the age boundaries of different groups cannot overlap.

  4. Age boundaries are whole integers. Any partial ages are rounded down to the nearest whole number.

  5. Missing age boundaries on either side result in an age group label of missing_age_group, which is by default "Unknown".
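
The rules above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name label_age_group() is hypothetical, and the label used for a single-age group (the age on its own) is an assumption, since the rules do not state it. The "0 - 19" and "85+" formats follow the examples on this page. The overlap check from rule 3 is left out.

```r
# Hypothetical helper illustrating the formatting rules; not the package's code.
label_age_group <- function(age_min, age_max, missing_age_group = "Unknown") {
  # Rule 4: partial ages are rounded down to a whole number.
  age_min <- floor(age_min)
  age_max <- floor(age_max)

  # Rule 1: ages below 0 (including -Inf) are set to 0.
  age_min <- pmax(age_min, 0)
  age_max <- pmax(age_max, 0)

  label <- ifelse(
    is.infinite(age_max),
    paste0(age_min, "+"),        # Rule 2: open-ended upper boundary
    ifelse(
      age_min == age_max,
      as.character(age_min),     # Single age: label format is an assumption
      paste(age_min, "-", age_max)
    )
  )

  # Rule 5: a missing boundary on either side yields the missing label.
  label[is.na(age_min) | is.na(age_max)] <- missing_age_group
  label
}

label_age_group(c(-Inf, 20, 85, 5.7, NA), c(19.9, 84, Inf, 5, 40))
# returns: "0 - 19", "20 - 84", "85+", "5", "Unknown"
```

Rounding happens before clamping, so a boundary such as -0.5 first becomes -1 and is then set to 0.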

See also

separate_age_groups(), which performs the inverse operation.

Examples

d_age_group <- dplyr::tibble(
  age_min = c(-Inf, 20, 85),
  age_max = c(19, 84, Inf)
)

format_age_groups(d_age_group)
#> # A tibble: 3 x 3
#>   age_min age_max age_group
#>     <dbl>   <dbl> <chr>
#> 1    -Inf      19 0 - 19
#> 2      20      84 20 - 84
#> 3      85     Inf 85+

# format_age_groups() is the inverse of separate_age_groups()
d_age_group %>%
  format_age_groups() %>%
  dplyr::select(age_group) %>%
  separate_age_groups()
#> # A tibble: 3 x 3
#>   age_group age_min age_max
#>   <chr>       <dbl>   <dbl>
#> 1 0 - 19          0      19
#> 2 20 - 84        20      84
#> 3 85+            85     Inf