Runs PhenotypeR diagnostics on the cohort. The diagnostics include:

* A summary of age groups and sex.
* A summary of visits for everyone in the cohort, using the visit_occurrence table.
* A summary of the age and sex density of the cohort.
* Attrition of the cohorts.
* Overlap between cohorts (if more than one cohort is being used).

Usage

cohortDiagnostics(cohort, survival = FALSE, match = TRUE, matchedSample = 1000)

Arguments

cohort

Cohort table in a cdm reference

survival

Boolean variable. Whether to conduct survival analysis (TRUE) or not (FALSE).

match

Boolean variable. Whether to conduct the analysis for the matched cohorts (TRUE) or not (FALSE).

matchedSample

Only used if match = TRUE. The number of people to randomly sample for matching. If NULL, no sampling is performed.
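
As a rough illustration of how these arguments combine, the sketch below calls the function with non-default values on the mock data used in the Examples section. The object names `resultNoMatch` and `resultAllMatched` are hypothetical, and the argument values are illustrative choices rather than recommendations.

```r
library(PhenotypeR)

# Mock cdm reference, the same helper used in the Examples section
cdm <- mockPhenotypeR()

# Skip the matched-cohort analysis; matchedSample is then not used
resultNoMatch <- cohortDiagnostics(cdm$my_cohort, match = FALSE)

# Run the matched analysis without sampling first
# (matchedSample = NULL means no sampling is performed)
resultAllMatched <- cohortDiagnostics(cdm$my_cohort,
                                      match = TRUE,
                                      matchedSample = NULL)

CDMConnector::cdmDisconnect(cdm = cdm)
```

On a large cohort, keeping a finite matchedSample is likely to reduce the cost of the matching step, since fewer people need to be matched.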

Value

A summarised result
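
One possible way to look at what the returned object contains is sketched below. It assumes the result follows the omopgenerics summarised result format, so that `omopgenerics::settings()` can be applied to it and the settings table carries a `result_type` column; treat both as assumptions to check against your installed versions.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

# match = FALSE keeps this sketch quick by skipping the matched analysis
result <- cohortDiagnostics(cdm$my_cohort, match = FALSE)

# Assumption: the result is an omopgenerics summarised result, so its
# settings table describes each set of results it contains
resultSettings <- omopgenerics::settings(result)

# Assumption: result_type names the analysis that produced each set
unique(resultSettings$result_type)

CDMConnector::cdmDisconnect(cdm = cdm)
```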

Examples

# \donttest{
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- cohortDiagnostics(cdm$my_cohort,
                            match = TRUE)
#>  Starting Cohort Diagnostics
#> → Getting cohort attrition
#> → Getting cohort count
#>  summarising data
#>  summarising cohort cohort_1
#>  summarising cohort cohort_2
#>  summariseCharacteristics finished!
#> → Getting cohort overlap
#> → Getting cohort timing
#>  The following estimates will be computed:
#>  days_between_cohort_entries: median, q25, q75, min, max, density
#> ! Table is collected to memory as not all requested estimates are supported on
#>   the database side
#> → Start summary of data, at 2025-07-04 09:54:31.839686
#>  Summary finished, at 2025-07-04 09:54:31.970216
#> → Creating matching cohorts
#> → Sampling cohort `my_cohort`
#> Returning entry cohort as the size of the cohorts to be sampled is equal or
#> smaller than `n`.
#>  Generating an age and sex matched cohort for cohort_1
#> Starting matching
#>  Creating copy of target cohort.
#>  1 cohort to be matched.
#>  Creating controls cohorts.
#>  Excluding cases from controls
#>  Matching by gender_concept_id and year_of_birth
#>  Removing controls that were not in observation at index date
#>  Excluding target records whose pair is not in observation
#>  Adjusting ratio
#> Binding cohorts
#>  Done
#> → Sampling cohort `my_cohort`
#> Returning entry cohort as the size of the cohorts to be sampled is equal or
#> smaller than `n`.
#>  Generating an age and sex matched cohort for cohort_2
#> Starting matching
#>  Creating copy of target cohort.
#>  1 cohort to be matched.
#>  Creating controls cohorts.
#>  Excluding cases from controls
#>  Matching by gender_concept_id and year_of_birth
#>  Removing controls that were not in observation at index date
#>  Excluding target records whose pair is not in observation
#>  Adjusting ratio
#> Binding cohorts
#>  Done
#> → Getting cohorts and indexes
#> → Summarising cohort characteristics
#>  adding demographics columns
#>  adding tableIntersectCount 1/1
#> window names casted to snake_case:
#>  `-365 to -1` -> `365_to_1`
#>  summarising data
#>  summarising cohort cohort_1
#>  summarising cohort cohort_2
#>  summarising cohort cohort_1_sampled
#>  summarising cohort cohort_1_matched
#>  summarising cohort cohort_2_sampled
#>  summarising cohort cohort_2_matched
#>  summariseCharacteristics finished!
#> → Calculating age density
#>  The following estimates will be computed:
#>  age: density
#> → Start summary of data, at 2025-07-04 09:54:56.235118
#>  Summary finished, at 2025-07-04 09:54:56.564309
#> → Run large scale characteristics (including source and standard codes)
#>  Summarising large scale characteristics 
#>  - getting characteristics from table condition_occurrence (1 of 6)
#>  - getting characteristics from table visit_occurrence (2 of 6)
#>  - getting characteristics from table measurement (3 of 6)
#>  - getting characteristics from table procedure_occurrence (4 of 6)
#>  - getting characteristics from table observation (5 of 6)
#>  - getting characteristics from table drug_exposure (6 of 6)
#> Formatting result
#>  Summarising large scale characteristics
#> → Run large scale characteristics (including only standard codes)
#>  Summarising large scale characteristics 
#>  - getting characteristics from table condition_occurrence (1 of 6)
#>  - getting characteristics from table visit_occurrence (2 of 6)
#>  - getting characteristics from table measurement (3 of 6)
#>  - getting characteristics from table procedure_occurrence (4 of 6)
#>  - getting characteristics from table observation (5 of 6)
#>  - getting characteristics from table drug_exposure (6 of 6)
#> Formatting result
#>  Summarising large scale characteristics

CDMConnector::cdmDisconnect(cdm = cdm)
# }