
Run cohort-level diagnostics
Runs PhenotypeR diagnostics on the cohort. The diagnostics include:
* A summary of age groups and sex.
* A summary of visits for everyone in the cohort, using the visit_occurrence table.
* A summary of the age and sex density of the cohort.
* Attrition of the cohorts.
* Overlap between cohorts (if more than one cohort is being used).
Arguments
- cohort
Cohort table in a cdm reference
- survival
Boolean variable. Whether to conduct survival analysis (TRUE) or not (FALSE).
- match
Boolean variable. Whether to conduct the analysis for the matched cohorts (TRUE) or not (FALSE).
- matchedSample
Only used if match = TRUE. The number of people to randomly sample before matching. If NULL, no sampling is performed.
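As a rough sketch of how the arguments above combine, the call below names each one explicitly. It assumes the mock data returned by mockPhenotypeR() and its my_cohort table; the value 1000 for matchedSample is an arbitrary illustration, not a documented default.

```r
library(PhenotypeR)

# Mock cdm reference shipped with the package (small, in-memory data)
cdm <- mockPhenotypeR()

# Skip survival analysis, run the matched-cohort analysis, and cap the
# number of people sampled for matching at 1000 (illustrative value only)
result <- cohortDiagnostics(
  cohort = cdm$my_cohort,
  survival = FALSE,
  match = TRUE,
  matchedSample = 1000
)

# Close the connection once the results have been collected
CDMConnector::cdmDisconnect(cdm = cdm)
```

Setting matchedSample = NULL would skip the sampling step and match on the full cohort, which may be slow on large databases.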
Examples
# \donttest{
library(PhenotypeR)
cdm <- mockPhenotypeR()
result <- cohortDiagnostics(cdm$my_cohort, match = TRUE)
#> • Starting Cohort Diagnostics
#> → Getting cohort attrition
#> → Getting cohort count
#> ℹ summarising data
#> ℹ summarising cohort cohort_1
#> ℹ summarising cohort cohort_2
#> ✔ summariseCharacteristics finished!
#> → Getting cohort overlap
#> → Getting cohort timing
#> ℹ The following estimates will be computed:
#> • days_between_cohort_entries: median, q25, q75, min, max, density
#> ! Table is collected to memory as not all requested estimates are supported on
#> the database side
#> → Start summary of data, at 2025-07-04 09:54:31.839686
#> ✔ Summary finished, at 2025-07-04 09:54:31.970216
#> → Creating matching cohorts
#> → Sampling cohort `my_cohort`
#> Returning entry cohort as the size of the cohorts to be sampled is equal or
#> smaller than `n`.
#> • Generating an age and sex matched cohort for cohort_1
#> Starting matching
#> ℹ Creating copy of target cohort.
#> • 1 cohort to be matched.
#> ℹ Creating controls cohorts.
#> ℹ Excluding cases from controls
#> • Matching by gender_concept_id and year_of_birth
#> • Removing controls that were not in observation at index date
#> • Excluding target records whose pair is not in observation
#> • Adjusting ratio
#> Binding cohorts
#> ✔ Done
#> → Sampling cohort `my_cohort`
#> Returning entry cohort as the size of the cohorts to be sampled is equal or
#> smaller than `n`.
#> • Generating an age and sex matched cohort for cohort_2
#> Starting matching
#> ℹ Creating copy of target cohort.
#> • 1 cohort to be matched.
#> ℹ Creating controls cohorts.
#> ℹ Excluding cases from controls
#> • Matching by gender_concept_id and year_of_birth
#> • Removing controls that were not in observation at index date
#> • Excluding target records whose pair is not in observation
#> • Adjusting ratio
#> Binding cohorts
#> ✔ Done
#> → Getting cohorts and indexes
#> → Summarising cohort characteristics
#> ℹ adding demographics columns
#> ℹ adding tableIntersectCount 1/1
#> window names casted to snake_case:
#> • `-365 to -1` -> `365_to_1`
#> ℹ summarising data
#> ℹ summarising cohort cohort_1
#> ℹ summarising cohort cohort_2
#> ℹ summarising cohort cohort_1_sampled
#> ℹ summarising cohort cohort_1_matched
#> ℹ summarising cohort cohort_2_sampled
#> ℹ summarising cohort cohort_2_matched
#> ✔ summariseCharacteristics finished!
#> → Calculating age density
#> ℹ The following estimates will be computed:
#> • age: density
#> → Start summary of data, at 2025-07-04 09:54:56.235118
#> ✔ Summary finished, at 2025-07-04 09:54:56.564309
#> → Run large scale characteristics (including source and standard codes)
#> ℹ Summarising large scale characteristics
#> - getting characteristics from table condition_occurrence (1 of 6)
#> - getting characteristics from table visit_occurrence (2 of 6)
#> - getting characteristics from table measurement (3 of 6)
#> - getting characteristics from table procedure_occurrence (4 of 6)
#> - getting characteristics from table observation (5 of 6)
#> - getting characteristics from table drug_exposure (6 of 6)
#> Formatting result
#> ✔ Summarising large scale characteristics
#> → Run large scale characteristics (including only standard codes)
#> ℹ Summarising large scale characteristics
#> - getting characteristics from table condition_occurrence (1 of 6)
#> - getting characteristics from table visit_occurrence (2 of 6)
#> - getting characteristics from table measurement (3 of 6)
#> - getting characteristics from table procedure_occurrence (4 of 6)
#> - getting characteristics from table observation (5 of 6)
#> - getting characteristics from table drug_exposure (6 of 6)
#> Formatting result
#> ✔ Summarising large scale characteristics
CDMConnector::cdmDisconnect(cdm = cdm)
# }
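The "Returning entry cohort as the size of the cohorts to be sampled is equal or smaller than `n`" messages in the output above suggest that sampling is skipped when the cohort already has at most `n` people. The base R sketch below illustrates that rule only; sample_people() is a hypothetical helper written for this example and is not part of PhenotypeR or its actual implementation.

```r
# Hypothetical helper (not a PhenotypeR function): return at most `n`
# distinct people, or everyone when `n` is NULL or the cohort is small.
sample_people <- function(person_ids, n = NULL) {
  ids <- unique(person_ids)
  # No sampling requested, or the cohort is already small enough
  if (is.null(n) || length(ids) <= n) {
    return(ids)
  }
  # Draw `n` people without replacement
  ids[sample.int(length(ids), size = n)]
}

set.seed(1)
small <- sample_people(1:5, n = 10)        # 5 people, n = 10: returned as is
large <- sample_people(1:100, n = 10)      # 100 people, n = 10: 10 sampled
unsampled <- sample_people(1:100, n = NULL) # NULL: no sampling performed
```

In the mock example the cohorts are small, which would explain why both cohorts are returned unsampled before matching.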