Introduction
Let’s first create a cdm reference to the Eunomia synthetic data.
library(CDMConnector)
library(CodelistGenerator)
library(PatientProfiles)
library(CohortConstructor)
library(dplyr)
con <- DBI::dbConnect(duckdb::duckdb(),
                      dbdir = eunomia_dir())
cdm <- cdm_from_con(con, cdm_schema = "main",
                    write_schema = c(prefix = "my_study_", schema = "main"))
Concept based cohort creation
One way to define base cohorts is to identify clinical records with codes from a pre-specified list. Here, for example, we first find codes for five drug ingredients: acetaminophen, amoxicillin, diclofenac, simvastatin, and warfarin.
drug_codes <- getDrugIngredientCodes(cdm,
                                     name = c("acetaminophen",
                                              "amoxicillin",
                                              "diclofenac",
                                              "simvastatin",
                                              "warfarin"))
drug_codes
#>
#> - 11289_warfarin (2 codes)
#> - 161_acetaminophen (7 codes)
#> - 3355_diclofenac (1 codes)
#> - 36567_simvastatin (2 codes)
#> - 723_amoxicillin (4 codes)
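The printed codelist can be thought of as a named list with one element per ingredient, each holding a vector of concept IDs. Below is a minimal base R sketch of that shape. The object `toy_codes`, its element names, and the concept IDs are all made up for illustration and are not the Eunomia codes.

```r
# Toy stand-in for a codelist: a named list of concept ID vectors.
# The names and IDs are placeholders, not real Eunomia concepts.
toy_codes <- list(
  ingredient_a = c(101L, 102L, 103L),
  ingredient_b = c(201L)
)

# Names identify each concept set; lengths give the number of codes in each.
names(toy_codes)
code_counts <- lengths(toy_codes)
code_counts

# Keep only selected concept sets by name, e.g. before building cohorts.
subset_codes <- toy_codes["ingredient_b"]
subset_codes
```

Assuming the real object keeps this list structure, the same idioms (for example `names(drug_codes)` or `drug_codes[["161_acetaminophen"]]`) should work on it as well.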
Now that we have our codes of interest, we'll make a cohort for each of them, with cohort exit defined as the event end date (which for these records is the drug exposure end date).
cdm$drugs <- conceptCohort(cdm,
                           conceptSet = drug_codes,
                           exit = "event_end_date",
                           name = "drugs")
settings(cdm$drugs)
#> # A tibble: 5 × 4
#> cohort_definition_id cohort_name cdm_version vocabulary_version
#> <int> <chr> <chr> <chr>
#> 1 1 11289_warfarin 5.3 v5.0 18-JAN-19
#> 2 2 161_acetaminophen 5.3 v5.0 18-JAN-19
#> 3 3 3355_diclofenac 5.3 v5.0 18-JAN-19
#> 4 4 36567_simvastatin 5.3 v5.0 18-JAN-19
#> 5 5 723_amoxicillin 5.3 v5.0 18-JAN-19
cohortCount(cdm$drugs)
#> # A tibble: 5 × 3
#> cohort_definition_id number_records number_subjects
#> <int> <int> <int>
#> 1 1 137 137
#> 2 2 13908 2679
#> 3 3 830 830
#> 4 4 182 182
#> 5 5 4307 2130
attrition(cdm$drugs)
#> # A tibble: 20 × 7
#> cohort_definition_id number_records number_subjects reason_id reason
#> <int> <int> <int> <int> <chr>
#> 1 1 137 137 1 Initial qualif…
#> 2 1 137 137 2 Record start <…
#> 3 1 137 137 3 Record in obse…
#> 4 1 137 137 4 Collapse overl…
#> 5 2 14205 2679 1 Initial qualif…
#> 6 2 14205 2679 2 Record start <…
#> 7 2 14205 2679 3 Record in obse…
#> 8 2 13908 2679 4 Collapse overl…
#> 9 3 850 850 1 Initial qualif…
#> 10 3 850 850 2 Record start <…
#> 11 3 830 830 3 Record in obse…
#> 12 3 830 830 4 Collapse overl…
#> 13 4 182 182 1 Initial qualif…
#> 14 4 182 182 2 Record start <…
#> 15 4 182 182 3 Record in obse…
#> 16 4 182 182 4 Collapse overl…
#> 17 5 4309 2130 1 Initial qualif…
#> 18 5 4309 2130 2 Record start <…
#> 19 5 4309 2130 3 Record in obse…
#> 20 5 4307 2130 4 Collapse overl…
#> # ℹ 2 more variables: excluded_records <int>, excluded_subjects <int>
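In the attrition above, the final step reduces the number of records for some cohorts (for example from 14205 to 13908 for acetaminophen) while leaving the number of subjects unchanged, which is what we would expect if overlapping records for the same person are merged into one. The following base R sketch shows one way such a merge can work for a single person. It is an illustration only, not CohortConstructor's internal implementation; the function `collapse_overlaps()` and the dates are hypothetical, and at least one record is assumed.

```r
# Merge overlapping date intervals for one subject (illustrative only).
collapse_overlaps <- function(start, end) {
  ord <- order(start)
  s <- as.numeric(start[ord])
  e <- as.numeric(end[ord])
  n <- length(s)
  # Running maximum of the end dates over all earlier records.
  prev_end <- c(-Inf, cummax(e)[-n])
  # A new interval begins when a record starts after every earlier one has ended.
  era <- cumsum(s > prev_end)
  data.frame(
    cohort_start_date = as.Date(as.numeric(tapply(s, era, min)),
                                origin = "1970-01-01"),
    cohort_end_date = as.Date(as.numeric(tapply(e, era, max)),
                              origin = "1970-01-01")
  )
}

# Three exposure records; the first two overlap.
starts <- as.Date(c("2020-01-01", "2020-01-05", "2020-03-01"))
ends <- as.Date(c("2020-01-10", "2020-01-20", "2020-03-15"))

collapsed <- collapse_overlaps(starts, ends)
collapsed
```

Here three records become two, while the person is still counted once, mirroring the pattern of fewer records but the same number of subjects.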
Demographic based cohort creation
One base cohort we can create is based on patient demographics. Here, for example, we create a cohort where people enter on their 18th birthday and leave on the day before their 66th birthday.
cdm$working_age_cohort <- demographicsCohort(cdm = cdm,
                                             ageRange = c(18, 65),
                                             name = "working_age_cohort")
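To make the entry and exit rule concrete, the sketch below works out the two dates for a single hypothetical person in base R. The helper `add_years()` and the birth date are assumptions made for illustration. The sketch ignores any further restriction the package may apply, such as limiting follow-up to a person's observation period.

```r
# Shift a date forward by a whole number of years (base R only).
# Assumes the date is not 29 February, which needs separate handling.
add_years <- function(date, years) {
  seq(date, by = paste(years, "years"), length.out = 2)[2]
}

date_of_birth <- as.Date("1980-05-17")  # hypothetical person

# Enter on the 18th birthday; leave the day before the 66th birthday,
# so the person is aged 18 to 65 inclusive for the whole interval.
entry_date <- add_years(date_of_birth, 18)
exit_date <- add_years(date_of_birth, 66) - 1

entry_date
exit_date
```

This is why `ageRange = c(18, 65)` corresponds to leaving the day before the 66th birthday: the upper bound is inclusive, so the person remains in the cohort for the whole year in which they are 65. The resulting cohort can be inspected with `settings()`, `cohortCount()`, and `attrition()` in the same way as `cdm$drugs`.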