
age_adjust() with keep_age should try harder? #45

@gadenbuie


If the age groups in the data don't match the age groups in the standard population, what should age_adjust() do?

library(tidyverse)
library(fcds)

fcds <- fcds_load()

# work with random subsample
fcds <- fcds %>% group_by(!!!rlang::syms(fcds_vars("demo"))) %>% sample_n(1) %>% ungroup()

If we do the regrouping first, age_adjust() will ultimately fail.

fcds_regrouped <- 
  fcds %>% 
  separate_age_groups() %>% 
  mutate(
    age_group = case_when(
      age_high < 20 ~ "< 20",
      age_high < 50 ~ "20 - 49",
      age_high < 65 ~ "50 - 64",
      age_high < 85 ~ "65 - 84",
      TRUE ~ "85 +"
    ),
    age_group = fct_reorder(age_group, age_low)
  )

fcds_vars(.data = fcds_regrouped, "demo")
#> # A tibble: 14,815 x 8
#>    age_group race  sex   origin marital_status birth_country birth_state
#>    <fct>     <fct> <fct> <fct>  <fct>          <fct>         <fct>      
#>  1 < 20      White Male  Non-H… Married; Unma… US States an… Florida    
#>  2 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  3 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  4 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  5 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  6 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  7 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  8 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#>  9 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#> 10 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#> # … with 14,805 more rows, and 1 more variable: primary_payer <fct>

fcds_regrouped %>% 
  count_fcds() %>% 
  age_adjust(keep_age = TRUE)
#> The age groups in `data` do not match any age groups in
#> `population_standard`.

The current way around this is to do the re-grouping after the age adjustment.

fcds %>% 
  count_fcds() %>% 
  age_adjust(keep_age = TRUE) %>% 
  separate_age_groups() %>%
  group_drop(age_group) %>% 
  mutate(
    age_group = case_when(
      age_high < 20 ~ "< 20",
      age_high < 50 ~ "20 - 49",
      age_high < 65 ~ "50 - 64",
      age_high < 85 ~ "65 - 84",
      TRUE ~ "85 +"
    ),
    age_group = fct_reorder(age_group, age_low)
  ) %>% 
  group_by(age_group, add = TRUE) 
#> # A tibble: 126 x 9
#> # Groups:   year, year_mid, age_group [35]
#>    year  year_mid age_group age_low age_high     n population std_pop
#>    <fct> <chr>    <fct>       <dbl>    <dbl> <int>      <dbl>   <dbl>
#>  1 1981… 1983     < 20            0        4    20     672372  1.90e7
#>  2 1981… 1983     < 20            5        9    17     605665  1.99e7
#>  3 1981… 1983     < 20           10       14    17     712639  2.01e7
#>  4 1981… 1983     < 20           15       19    28     789181  1.98e7
#>  5 1981… 1983     20 - 49        20       24    48     890738  1.83e7
#>  6 1981… 1983     20 - 49        25       29    41     892078  1.77e7
#>  7 1981… 1983     20 - 49        30       34    41     793533  1.95e7
#>  8 1981… 1983     20 - 49        35       39    36     686575  2.22e7
#>  9 1981… 1983     20 - 49        40       44    46     581196  2.25e7
#> 10 1981… 1983     20 - 49        45       49    50     514395  1.98e7
#> # … with 116 more rows, and 1 more variable: w <dbl>

But each re-grouped age range is made up of whole standard age groups, so age_adjust() could have called standardize_age_groups() on the population data, relative to the input data, to do this for us.
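
To make the idea concrete, here is a rough sketch of the collapsing step in base R. It is not how fcds does it: collapse_std_pop() is a hypothetical helper, the column names age_low, age_high and std_pop are borrowed from the output above, and the population values are placeholders, not real standard population counts. It assumes every standard age group nests inside exactly one custom group and errors otherwise.

```r
# Hypothetical helper (not part of fcds): collapse a standard population
# into coarser age groups, assuming each standard group nests in one custom group.
collapse_std_pop <- function(std, groups) {
  # For each standard age band, find the custom group that fully contains it
  idx <- vapply(seq_len(nrow(std)), function(i) {
    hit <- which(
      groups$age_low <= std$age_low[i] & std$age_high[i] <= groups$age_high
    )
    if (length(hit) != 1L) {
      stop(
        "Standard age group starting at ", std$age_low[i],
        " does not nest in exactly one custom group"
      )
    }
    hit
  }, integer(1))

  # Sum the standard population within each custom group, keeping group order
  grp <- factor(groups$age_group[idx], levels = groups$age_group)
  totals <- tapply(std$std_pop, grp, sum)

  data.frame(
    age_group = names(totals),
    std_pop = as.numeric(totals),
    stringsAsFactors = FALSE
  )
}

# Placeholder standard population: 18 five-year bands, 0-4 through 85+
std <- data.frame(
  age_low  = seq(0, 85, 5),
  age_high = c(seq(4, 84, 5), Inf),
  std_pop  = rep(100, 18)  # illustrative values only
)

# The custom groups used in this issue
groups <- data.frame(
  age_group = c("< 20", "20 - 49", "50 - 64", "65 - 84", "85 +"),
  age_low   = c(0, 20, 50, 65, 85),
  age_high  = c(19, 49, 64, 84, Inf),
  stringsAsFactors = FALSE
)

res <- collapse_std_pop(std, groups)
print(res)
```

If the custom groups cut through a standard band (say, a break at age 62), the helper stops with an error, which is the case where age_adjust() would still need to fail.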
