Applying cohort restrictions • CohortConstructor

For this example we’ll use the Eunomia synthetic data from the CDMConnector package.

library(CDMConnector)
library(CohortConstructor)
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomia_dir())
cdm <- cdm_from_con(con, cdm_schema = "main", 
                    write_schema = c(prefix = "my_study_", schema = "main"))

Let’s start by creating two drug cohorts, one for users of diclofenac and another for users of acetaminophen.

cdm$medications <- conceptCohort(cdm = cdm, 
                                 conceptSet = list("diclofenac" = 1124300,
                                                   "acetaminophen" = 1127433), 
                                 name = "medications")
cohortCount(cdm$medications)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1           9365            2580
#> 2                    2            830             830

As well as our medication cohorts, let’s also make another cohort containing individuals with a record of a GI bleed. Later we’ll use this cohort when specifying inclusion/exclusion criteria.

cdm$gi_bleed <- conceptCohort(cdm = cdm,  
                              conceptSet = list("gi_bleed" = 192671),
                              name = "gi_bleed")

Keep only the first record per person

Individuals can contribute multiple records per cohort. Here we’ll keep only each person’s earliest cohort entry using requireIsFirstEntry() from CohortConstructor. After this we have one record per person in each cohort.

cdm$medications <- cdm$medications %>% 
  requireIsFirstEntry()

cohortCount(cdm$medications)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1           2580            2580
#> 2                    2            830             830

Note that applying this criterion after other criteria, rather than before them, would give a different result. Here we’re requiring that individuals meet the inclusion criteria at the time of their first use of diclofenac or acetaminophen.
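To see why the order matters, here is a minimal base R sketch on a made-up cohort table. It is not how CohortConstructor implements these steps; the helper functions `keep_first_entry()` and `keep_in_range()` and all of the data are hypothetical, and the date range is assumed to include both end points.

```r
# Toy cohort table (hypothetical data, not taken from Eunomia)
cohort <- data.frame(
  cohort_definition_id = c(1, 1, 1, 2),
  subject_id           = c(10, 10, 11, 10),
  cohort_start_date    = as.Date(c("1998-03-01", "2004-06-15",
                                   "2001-01-10", "2003-09-30"))
)

# Hypothetical helper: keep the earliest record per person within each cohort
keep_first_entry <- function(x) {
  x <- x[order(x$cohort_definition_id, x$subject_id, x$cohort_start_date), ]
  x[!duplicated(x[, c("cohort_definition_id", "subject_id")]), ]
}

# Hypothetical helper: keep records starting inside a closed date range
keep_in_range <- function(x, from, to) {
  x[x$cohort_start_date >= from & x$cohort_start_date <= to, ]
}

from <- as.Date("2000-01-01")
to   <- as.Date("2015-01-01")

# First entry, then date range: subject 10's first entry in cohort 1 (1998)
# falls outside the range, so that person drops out of cohort 1 entirely
first_then_range <- keep_in_range(keep_first_entry(cohort), from, to)

# Date range, then first entry: subject 10 stays in cohort 1
# through their 2004 record
range_then_first <- keep_first_entry(keep_in_range(cohort, from, to))

nrow(first_then_range)
nrow(range_then_first)
```

In this toy data the first ordering leaves two records and the second leaves three, because the second ordering selects the first record within the date range rather than the first record overall.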

Applying restrictions on patient demographics

Using requireDemographics() we’ll require that individuals in our medications cohort are female and, relative to their cohort start date, are aged between 18 and 85 with at least 30 days of prior observation time in the database.

cdm$medications <- cdm$medications %>% 
  requireDemographics(indexDate = "cohort_start_date", 
                      ageRange = list(c(18, 85)),
                      sex = "Female", 
                      minPriorObservation = 30)
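The logic behind these requirements can be sketched in base R on a small made-up table. This is only an illustration and not the package’s implementation: the `people` table, its column names, and the `age_at()` helper are hypothetical, and the age bounds are assumed to be inclusive.

```r
# Toy person-level table (hypothetical data and column names)
people <- data.frame(
  subject_id        = 1:4,
  sex               = c("Female", "Male", "Female", "Female"),
  birth_date        = as.Date(c("1970-05-20", "1965-01-01",
                                "1990-12-31", "1950-07-04")),
  observation_start = as.Date(c("2000-01-01", "2000-01-01",
                                "2008-06-01", "2008-05-25")),
  cohort_start_date = as.Date(rep("2008-06-10", 4))
)

# Hypothetical helper: age in completed years at the index date
age_at <- function(birth, index) {
  b <- as.POSIXlt(birth)
  i <- as.POSIXlt(index)
  # Subtract one year if the birthday has not yet occurred by the index date
  not_yet <- (i$mon < b$mon) | (i$mon == b$mon & i$mday < b$mday)
  (i$year - b$year) - as.integer(not_yet)
}

age       <- age_at(people$birth_date, people$cohort_start_date)
prior_obs <- as.integer(people$cohort_start_date - people$observation_start)

# All three requirements are evaluated relative to cohort_start_date
keep <- people$sex == "Female" &
  age >= 18 & age <= 85 &
  prior_obs >= 30

eligible <- people[keep, ]
eligible$subject_id
```

Only subject 1 satisfies all three conditions here: subject 2 is male, subject 3 is 17 at the index date, and subject 4 has just 16 days of prior observation.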

We can then see how many people have been excluded based on these demographic requirements.

cohort_attrition(cdm$medications) %>% 
  dplyr::filter(reason == "Demographic requirements") %>% 
  dplyr::glimpse()
#> Rows: 0
#> Columns: 7
#> $ cohort_definition_id <int> 
#> $ number_records       <int> 
#> $ number_subjects      <int> 
#> $ reason_id            <int> 
#> $ reason               <chr> 
#> $ excluded_records     <int> 
#> $ excluded_subjects    <int>
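A filter on `reason` returns zero rows whenever no attrition row carries exactly that text, which is what the empty output above shows. The sketch below uses a made-up attrition table (the reason labels are hypothetical and may not match what the package records) to show how listing the distinct reasons, or matching on part of the label, can help locate the rows of interest.

```r
# Toy attrition table; the reason labels are hypothetical
attrition <- data.frame(
  cohort_definition_id = c(1, 1, 1),
  reason_id            = 1:3,
  reason               = c("Initial qualifying events",
                           "Restricted to first entry",
                           "Sex requirement: Female"),
  stringsAsFactors     = FALSE
)

# An exact match on a label that is not present gives an empty result
exact <- attrition[attrition$reason == "Demographic requirements", ]
nrow(exact)

# Listing the labels that are actually present shows what to filter on
unique(attrition$reason)

# A partial, case-insensitive match is more forgiving of wording differences
partial <- attrition[grepl("requirement", attrition$reason, ignore.case = TRUE), ]
partial$reason_id
```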

Restrictions on calendar dates

Next we can use requireInDateRange() to keep only those records where cohort entry fell within a particular date range.

cdm$medications <- cdm$medications %>% 
  requireInDateRange(indexDate = "cohort_start_date", 
                     dateRange = as.Date(c("2000-01-01", "2015-01-01")))

Again, we can track cohort attrition.

cohort_attrition(cdm$medications) %>% 
  dplyr::filter(reason == "cohort_start_date between 2000-01-01 and 2015-01-01") %>% 
  dplyr::glimpse()
#> Rows: 0
#> Columns: 7
#> $ cohort_definition_id <int> 
#> $ number_records       <int> 
#> $ number_subjects      <int> 
#> $ reason_id            <int> 
#> $ reason               <chr> 
#> $ excluded_records     <int> 
#> $ excluded_subjects    <int>

Restrictions on cohort presence

We could require that individuals in our medication cohorts have a history of GI bleed. To do this we can use the requireCohortIntersect() function, requiring that individuals have one or more intersections with the GI bleed cohort.

cdm$medications_gi_bleed <- cdm$medications %>%
  requireCohortIntersect(intersections = c(1, Inf),
                         targetCohortTable = "gi_bleed", 
                         targetCohortId = 1,
                         indexDate = "cohort_start_date", 
                         window = c(-Inf, 0), 
                         name = "medications_gi_bleed")
cohort_count(cdm$medications_gi_bleed)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1              0               0
#> 2                    2              0               0

Instead of requiring that individuals have a history of GI bleed, we could require that they don’t have any history of it. In this case we again use the requireCohortIntersect() function, but this time set the intersections argument to 0 to require individuals’ absence from this other cohort rather than their presence in it.

cdm$medications_no_gi_bleed <- cdm$medications %>%
  requireCohortIntersect(intersections = 0,
                         targetCohortTable = "gi_bleed", 
                         targetCohortId = 1,
                         indexDate = "cohort_start_date", 
                         window = c(-Inf, 0), 
                         name = "medications_no_gi_bleed") 
cohort_count(cdm$medications_no_gi_bleed)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1            101             101
#> 2                    2            179             179
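The presence and absence requirements above can be sketched in base R on made-up data. This is a simplification rather than the package’s implementation: it looks only at the start date of each target record, and the `count_intersections()` helper and both toy tables are hypothetical.

```r
# Toy index and target cohorts (hypothetical data)
medications <- data.frame(
  subject_id        = c(1, 2, 3),
  cohort_start_date = as.Date(c("2005-03-01", "2006-07-15", "2010-01-20"))
)
gi_bleed <- data.frame(
  subject_id        = c(1, 3, 3),
  cohort_start_date = as.Date(c("2001-11-05", "2012-02-02", "2013-08-09"))
)

# Hypothetical helper: for each index record, count the same person's target
# records that start between window[1] and window[2] days from the index date
count_intersections <- function(index, target, window) {
  vapply(seq_len(nrow(index)), function(i) {
    same_person <- target$subject_id == index$subject_id[i]
    offset <- as.numeric(target$cohort_start_date - index$cohort_start_date[i])
    sum(same_person & offset >= window[1] & offset <= window[2])
  }, integer(1))
}

# window = c(-Inf, 0): any time up to and including the index date
n <- count_intersections(medications, gi_bleed, window = c(-Inf, 0))

with_history    <- medications[n >= 1, ]  # like intersections = c(1, Inf)
without_history <- medications[n == 0, ]  # like intersections = 0

with_history$subject_id
without_history$subject_id
```

Subject 3 lands in the group without history even though they have GI bleed records, because both of those records start after the index date and so fall outside the window.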