a07_split_cohorts
For this example we’ll use the Eunomia synthetic data from the CDMConnector package.
library(CDMConnector)
library(PatientProfiles)
library(CohortConstructor)
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomia_dir())
cdm <- cdm_from_con(con, cdm_schema = "main",
                    write_schema = c(prefix = "my_study_", schema = "main"))
Let’s start by creating two drug cohorts, one for users of diclofenac and another for users of acetaminophen.
cdm$medications <- conceptCohort(cdm = cdm,
                                 conceptSet = list("diclofenac" = 1124300,
                                                   "acetaminophen" = 1127433),
                                 name = "medications")
cohortCount(cdm$medications)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1           9365            2580
#> 2                    2            830             830
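The counts above are keyed only by cohort_definition_id. As a small illustration of reading them alongside cohort names, the sketch below joins the two pieces in base R. The data frames are typed in by hand from the output shown in this document (the id-to-name mapping is taken from the target_cohort_id column printed further down). In a real session the same columns would come from settings(cdm$medications) and cohortCount(cdm$medications).

```r
# Hand-copied values from the printed output; illustrative only
counts <- data.frame(
  cohort_definition_id = 1:2,
  number_records = c(9365L, 830L),
  number_subjects = c(2580L, 830L)
)
cohort_names <- data.frame(
  cohort_definition_id = 1:2,
  cohort_name = c("acetaminophen", "diclofenac")
)

# Join names onto counts by the shared id column
named_counts <- merge(cohort_names, counts, by = "cohort_definition_id")
named_counts
```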
We can stratify cohorts based on specified columns using the function stratifyCohorts(). In this example, let’s stratify the medications cohort by age and sex.
cdm$stratified <- cdm$medications |>
  addAge(ageGroup = list("Child" = c(0, 17), "18 to 65" = c(18, 64), "65 and Over" = c(65, Inf))) |>
  addSex(name = "stratified") |>
  stratifyCohorts(strata = list("sex", "age_group", c("sex", "age_group")), name = "stratified")
settings(cdm$stratified)
#> # A tibble: 22 × 10
#>    cohort_definition_id cohort_name          target_cohort_id target_cohort_name
#>                   <int> <chr>                           <int> <chr>
#>  1                    1 acetaminophen_female                1 acetaminophen
#>  2                    2 acetaminophen_male                  1 acetaminophen
#>  3                    3 diclofenac_female                   2 diclofenac
#>  4                    4 diclofenac_male                     2 diclofenac
#>  5                    5 acetaminophen_18_to…                1 acetaminophen
#>  6                    6 acetaminophen_65_an…                1 acetaminophen
#>  7                    7 acetaminophen_child                 1 acetaminophen
#>  8                    8 diclofenac_18_to_65                 2 diclofenac
#>  9                    9 diclofenac_65_and_o…                2 diclofenac
#> 10                   10 diclofenac_child                    2 diclofenac
#> # ℹ 12 more rows
#> # ℹ 6 more variables: cdm_version <chr>, vocabulary_version <chr>,
#> #   target_cohort_table_name <chr>, strata_columns <chr>, sex <chr>,
#> #   age_group <chr>
The age and sex columns are added using functions from the PatientProfiles package. The ‘stratified’ table includes 22 cohorts: for each of the two drugs, 2 by sex, 3 by age group, and 6 by the combination of sex and age group.
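The total of 22 can be checked with a little arithmetic. The sketch below is plain base R, not a CohortConstructor call, and it assumes every sex and age-group level actually occurs in the data for both drugs, which the settings output above suggests is the case here. The names n_levels, per_stratum and n_cohorts are made up for illustration.

```r
# Number of distinct levels in each stratification column (assumed all present)
n_levels <- c(sex = 2, age_group = 3)

# Same strata specification as passed to stratifyCohorts()
strata <- list("sex", "age_group", c("sex", "age_group"))

# Two target cohorts: acetaminophen and diclofenac
n_targets <- 2

# Each strata entry yields one cohort per combination of its columns' levels
per_stratum <- vapply(strata, function(cols) prod(n_levels[cols]), numeric(1))
per_stratum

# Total stratified cohorts across both targets
n_cohorts <- n_targets * sum(per_stratum)
n_cohorts
```

If some combination had no records (for example, no children using one of the drugs), the real table could contain fewer cohorts than this count.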
We can also split cohorts for specified years using the function yearCohorts().
cdm$years <- cdm$medications |>
  yearCohorts(years = 2005:2010, name = "years")
settings(cdm$years)
#> # A tibble: 12 × 7
#>    cohort_definition_id cohort_name        target_cohort_definitio…¹ cdm_version
#>                   <int> <chr>                                  <int> <chr>
#>  1                    1 acetaminophen_2005                         1 5.3
#>  2                    2 diclofenac_2005                            2 5.3
#>  3                    3 acetaminophen_2006                         1 5.3
#>  4                    4 diclofenac_2006                            2 5.3
#>  5                    5 acetaminophen_2007                         1 5.3
#>  6                    6 diclofenac_2007                            2 5.3
#>  7                    7 acetaminophen_2008                         1 5.3
#>  8                    8 diclofenac_2008                            2 5.3
#>  9                    9 acetaminophen_2009                         1 5.3
#> 10                   10 diclofenac_2009                            2 5.3
#> 11                   11 acetaminophen_2010                         1 5.3
#> 12                   12 diclofenac_2010                            2 5.3
#> # ℹ abbreviated name: ¹target_cohort_definition_id
#> # ℹ 3 more variables: vocabulary_version <chr>, year <int>,
#> #   target_cohort_name <chr>
The ‘years’ table includes 12 cohorts, one for each combination of the two drugs and the six years from 2005 to 2010.
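To give a feel for what splitting by year means for a single record, here is a minimal base R sketch. It assumes that a year cohort keeps only the part of each cohort entry that falls inside that calendar year. This is an illustration of the idea, not the package's implementation, and clip_to_years() is a hypothetical helper that does not exist in CohortConstructor.

```r
# Hypothetical helper: clip one cohort record to each requested calendar year
clip_to_years <- function(start, end, years) {
  year_start <- as.Date(paste0(years, "-01-01"))
  year_end <- as.Date(paste0(years, "-12-31"))
  out <- data.frame(
    year = years,
    # Entry cannot start before 1 January or end after 31 December of the year
    cohort_start_date = pmax(start, year_start),
    cohort_end_date = pmin(end, year_end)
  )
  # Keep only the years the record actually overlaps
  out[out$cohort_start_date <= out$cohort_end_date, ]
}

# A made-up record running from late 2006 to early 2008
clipped <- clip_to_years(as.Date("2006-11-15"), as.Date("2008-02-10"), 2005:2010)
clipped
```

Under this assumption the example record contributes to the 2006, 2007 and 2008 cohorts only, with its dates trimmed to each year's boundaries.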