matchCohorts() generates a new cohort matched to individuals in an existing cohort. Individuals can be matched based on year of birth and sex.

Usage

matchCohorts(
  cohort,
  cohortId = NULL,
  matchSex = TRUE,
  matchYearOfBirth = TRUE,
  ratio = 1,
  keepOriginalCohorts = FALSE,
  name = tableName(cohort)
)

Arguments

cohort

A cohort table in a cdm reference.

cohortId

Vector identifying which cohorts to include (cohort_definition_id or cohort_name). Cohorts not included will be removed from the cohort set.

matchSex

Whether to match on sex.

matchYearOfBirth

Whether to match on year of birth.

ratio

Number of allowed matches per individual in the target cohort.

keepOriginalCohorts

If TRUE, the original cohorts will be returned together with the new ones. If FALSE, only the new cohorts will be returned.

name

Name of the new cohort table created in the cdm object.

Value

A cohort table.

Examples

# \donttest{
library(CohortConstructor)
library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
cdm <- mockCohortConstructor(nPerson = 200)
cdm$new_matched_cohort <- cdm$cohort2 |>
  matchCohorts(
    name = "new_matched_cohort",
    cohortId = 2,
    matchSex = TRUE,
    matchYearOfBirth = TRUE,
    ratio = 1)
#> Starting matching
#> Warning: Multiple records per person detected. The matchCohorts() function is designed
#> to operate under the assumption that there is only one record per person within
#> each cohort. If this assumption is not met, each record will be treated
#> independently. As a result, the same individual may be matched multiple times,
#> leading to inconsistent and potentially misleading results.
#>  Creating copy of target cohort.
#>  1 cohort to be matched.
#>  Creating controls cohorts.
#>  Excluding cases from controls
#>  Matching by gender_concept_id and year_of_birth
#>  Removing controls that were not in observation at index date
#>  Excluding target records whose pair is not in observation
#>  Adjusting ratio
#> Binding cohorts
#>  Done
cdm$new_matched_cohort
#> # Source:   table<main.new_matched_cohort> [?? x 5]
#> # Database: DuckDB v1.1.2 [unknown@Linux 6.5.0-1025-azure:R 4.4.2/:memory:]
#>    cohort_definition_id subject_id cohort_start_date cohort_end_date cluster_id
#>                   <int>      <int> <date>            <date>               <dbl>
#>  1                    1        110 2005-10-01        2006-06-12              99
#>  2                    1         19 2015-04-24        2015-09-01             108
#>  3                    1        166 2017-05-16        2017-09-25              59
#>  4                    1         33 1993-05-10        1997-04-01              14
#>  5                    1         16 2007-05-18        2007-10-08             102
#>  6                    1         54 2015-03-29        2016-03-31              18
#>  7                    1         91 1995-05-16        2002-02-02              42
#>  8                    1         30 2008-04-20        2010-01-04              27
#>  9                    1         47 1994-03-23        1997-08-27              89
#> 10                    1         62 2008-04-08        2008-07-26              45
#> # ℹ more rows
# }
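
The example above uses ratio = 1 and drops the original cohort. As a hedged variation, the sketch below asks for up to two matches per target individual and keeps the original cohort alongside the new ones. The table name "matched_two" is an arbitrary choice for illustration, and settings() and cohortCount() are called from omopgenerics on the assumption that the returned cohort table carries the usual cohort attributes; the output is not shown because it depends on the mock data.

```r
library(CohortConstructor)

# Mock cdm, built the same way as in the example above
cdm <- mockCohortConstructor(nPerson = 200)

# Allow up to two matches per target individual and keep the original cohort.
# "matched_two" is a hypothetical table name chosen for this sketch.
cdm$matched_two <- cdm$cohort2 |>
  matchCohorts(
    name = "matched_two",
    cohortId = 2,
    matchSex = TRUE,
    matchYearOfBirth = TRUE,
    ratio = 2,
    keepOriginalCohorts = TRUE
  )

# Inspect which cohorts ended up in the table and how many records each holds
omopgenerics::settings(cdm$matched_two)
omopgenerics::cohortCount(cdm$matched_two)
```

Comparing the counts of the target and matched cohorts is a quick way to see how many target records found a match, since records without a match in observation are excluded during matching.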