Generate a new matched cohort
matchCohorts() generates a new cohort matched to individuals in an
existing cohort. Individuals can be matched on year of birth and sex.
Usage
matchCohorts(
  cohort,
  cohortId = NULL,
  matchSex = TRUE,
  matchYearOfBirth = TRUE,
  ratio = 1,
  name = tableName(cohort)
)
Arguments
- cohort
A cohort table in a cdm reference.
- cohortId
Vector identifying which cohorts to include (cohort_definition_id or cohort_name). Cohorts not included will be removed from the cohort set.
- matchSex
Whether to match on sex.
- matchYearOfBirth
Whether to match on year of birth.
- ratio
Maximum number of matches allowed per individual in the target cohort.
- name
Name of the new cohort table created in the cdm object.
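To make the roles of matchSex, matchYearOfBirth and ratio more concrete, the following base R sketch performs exact matching on sex and year of birth for two small data frames. It is an illustration only and not the CohortConstructor implementation: the helper match_exact(), the column names, and the choice to use each control at most once are all assumptions made for this example.

```r
# Illustrative sketch only: exact matching on sex and year of birth.
# match_exact() and the column names are hypothetical, not part of CohortConstructor.
match_exact <- function(target, pool, ratio = 1) {
  matched <- list()
  for (i in seq_len(nrow(target))) {
    # Candidates share sex and year of birth with the target individual
    idx <- which(
      pool$sex == target$sex[i] &
        pool$year_of_birth == target$year_of_birth[i]
    )
    # Keep at most `ratio` candidates per target individual
    idx <- head(idx, ratio)
    if (length(idx) > 0) {
      matched[[length(matched) + 1]] <- data.frame(
        target_id = target$person_id[i],
        control_id = pool$person_id[idx]
      )
      # Assumption: matching without replacement, so used controls leave the pool
      pool <- pool[-idx, , drop = FALSE]
    }
  }
  if (length(matched) == 0) {
    return(data.frame(target_id = integer(), control_id = integer()))
  }
  do.call(rbind, matched)
}

target <- data.frame(
  person_id = c(1L, 2L),
  sex = c("F", "M"),
  year_of_birth = c(1980L, 1975L)
)
pool <- data.frame(
  person_id = c(10L, 11L, 12L, 13L),
  sex = c("F", "F", "M", "M"),
  year_of_birth = c(1980L, 1980L, 1975L, 1990L)
)

pairs_1 <- match_exact(target, pool, ratio = 1)
pairs_2 <- match_exact(target, pool, ratio = 2)
```

With ratio = 1 each target individual receives at most one control, while ratio = 2 lets the first target individual keep both eligible controls. Person 13 is never selected because no target individual shares that year of birth.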
Examples
# \donttest{
library(CohortConstructor)
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
cdm <- mockCohortConstructor(nPerson = 200)
cdm$new_matched_cohort <- cdm$cohort2 |>
  matchCohorts(
    name = "new_matched_cohort",
    cohortId = 2,
    matchSex = TRUE,
    matchYearOfBirth = TRUE,
    ratio = 1)
#> Starting matching
#> Warning: Multiple records per person detected. The matchCohorts() function is designed
#> to operate under the assumption that there is only one record per person within
#> each cohort. If this assumption is not met, each record will be treated
#> independently. As a result, the same individual may be matched multiple times,
#> leading to inconsistent and potentially misleading results.
#> ℹ Creating copy of target cohort.
#> • 1 cohort to be matched.
#> ℹ Creating controls cohorts.
#> ℹ Excluding cases from controls
#> • Matching by gender_concept_id and year_of_birth
#> • Removing controls that were not in observation at index date
#> • Excluding target records whose pair is not in observation
#> • Adjusting ratio
#> Binding both cohorts
#> ✔ Done
cdm$new_matched_cohort
#> # Source: table<main.new_matched_cohort> [?? x 5]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1/:memory:]
#>    cohort_definition_id subject_id cohort_start_date cohort_end_date cluster_id
#>                   <int>      <int> <date>            <date>               <dbl>
#>  1                    1         19 2003-06-22        2005-07-21              36
#>  2                    1         84 2000-04-13        2006-06-11              42
#>  3                    1         81 2009-08-19        2010-12-05              47
#>  4                    1        113 2015-09-05        2016-01-10              50
#>  5                    1         94 2017-11-01        2018-07-05              59
#>  6                    1        119 1982-09-08        1989-01-29              67
#>  7                    1         62 2017-04-27        2018-07-04              74
#>  8                    1        120 2009-02-07        2009-12-14              78
#>  9                    1          8 1990-10-16        1999-07-20              93
#> 10                    1        105 2001-10-24        2002-04-20             105
#> # ℹ more rows
# }