Phenotype a cohort
phenotypeDiagnostics.Rd
This function runs all the diagnostics offered in this package, which include:
* Diagnostics on the database, via `databaseDiagnostics`.
* Diagnostics on the cohort_codelist attribute of the cohort, via `codelistDiagnostics`.
* Diagnostics on the cohort, via `cohortDiagnostics`.
* Diagnostics on the population, via `populationDiagnostics`.
* Diagnostics on the matched cohort, via `matchedDiagnostics`.
Usage
phenotypeDiagnostics(
cohort,
databaseDiagnostics = TRUE,
codelistDiagnostics = TRUE,
cohortDiagnostics = TRUE,
populationDiagnostics = TRUE,
populationSample = 1e+06,
matchedDiagnostics = TRUE,
matchedSample = 1000
)
Arguments
- cohort
A cohort table in a cdm reference.
- databaseDiagnostics
If TRUE, database diagnostics will be run.
- codelistDiagnostics
If TRUE, codelist diagnostics will be run.
- cohortDiagnostics
If TRUE, cohort diagnostics will be run.
- populationDiagnostics
If TRUE, population diagnostics will be run.
- populationSample
Number of people to sample from the cdm. If NULL, no sampling will be performed.
- matchedDiagnostics
If TRUE, cohort to population diagnostics (the diagnostics on the matched cohort) will be run.
- matchedSample
Number of people to randomly sample from the cohort for matching. If NULL, no sampling will be performed.
Examples
# \donttest{
cdm_local <- omock::mockCdmReference() |>
  omock::mockPerson(nPerson = 100) |>
  omock::mockObservationPeriod() |>
  omock::mockConditionOccurrence() |>
  omock::mockDrugExposure() |>
  omock::mockObservation() |>
  omock::mockMeasurement() |>
  omock::mockCohort(name = "my_cohort", numberCohorts = 2)
cdm_local$visit_occurrence <- dplyr::tibble(
  person_id = 1L,
  visit_occurrence_id = 1L,
  visit_concept_id = 1L,
  visit_start_date = as.Date("2000-01-01"),
  visit_end_date = as.Date("2000-01-01"),
  visit_type_concept_id = 1L
)
cdm_local$procedure_occurrence <- dplyr::tibble(
  person_id = 1L,
  procedure_occurrence_id = 1L,
  procedure_concept_id = 1L,
  procedure_date = as.Date("2000-01-01"),
  procedure_type_concept_id = 1L
)
db <- DBI::dbConnect(duckdb::duckdb())
cdm <- CDMConnector::copyCdmTo(
  con = db,
  cdm = cdm_local,
  schema = "main",
  overwrite = TRUE
)
phenotypeDiagnostics(cdm$my_cohort)
#>
#>
#> • Getting codelists from cohorts
#> Warning: No codelists found for the specified cohorts
#> Warning: No codelists found for the specified cohorts
#> Warning: Empty cohort_codelist attribute for cohort
#> ℹ Returning an empty summarised result
#>
#> • Getting cohort summary
#> ℹ adding demographics columns
#> ℹ adding tableIntersectCount 1/1
#> ℹ summarising data
#> ✔ summariseCharacteristics finished!
#> • Getting age density
#> • Getting cohort attrition
#> • Getting cohort overlap
#> • Getting cohort timing
#> ℹ The following estimates will be computed:
#> • days_between_cohort_entries: density
#> → Start summary of data, at 2024-11-21 11:01:21.093467
#> ✔ Summary finished, at 2024-11-21 11:01:21.167939
#>
#> • Creating denominator for incidence and prevalence
#> • Sampling person table to 1e+06
#> ℹ Creating denominator cohorts
#> ✔ Cohorts created in 0 min and 7 sec
#> • Estimating incidence
#> Getting incidence for analysis 1 of 24
#> Getting incidence for analysis 2 of 24
#> Getting incidence for analysis 3 of 24
#> Getting incidence for analysis 4 of 24
#> Getting incidence for analysis 5 of 24
#> Getting incidence for analysis 6 of 24
#> Getting incidence for analysis 7 of 24
#> Getting incidence for analysis 8 of 24
#> Getting incidence for analysis 9 of 24
#> Getting incidence for analysis 10 of 24
#> Getting incidence for analysis 11 of 24
#> Getting incidence for analysis 12 of 24
#> Getting incidence for analysis 13 of 24
#> Getting incidence for analysis 14 of 24
#> Getting incidence for analysis 15 of 24
#> Getting incidence for analysis 16 of 24
#> Getting incidence for analysis 17 of 24
#> Getting incidence for analysis 18 of 24
#> Getting incidence for analysis 19 of 24
#> Getting incidence for analysis 20 of 24
#> Getting incidence for analysis 21 of 24
#> Getting incidence for analysis 22 of 24
#> Getting incidence for analysis 23 of 24
#> Getting incidence for analysis 24 of 24
#> Overall time taken: 0 mins and 18 secs
#> • Estimating prevalence
#> Getting prevalence for analysis 1 of 12
#> Getting prevalence for analysis 2 of 12
#> Getting prevalence for analysis 3 of 12
#> Getting prevalence for analysis 4 of 12
#> Getting prevalence for analysis 5 of 12
#> Getting prevalence for analysis 6 of 12
#> Getting prevalence for analysis 7 of 12
#> Getting prevalence for analysis 8 of 12
#> Getting prevalence for analysis 9 of 12
#> Getting prevalence for analysis 10 of 12
#> Getting prevalence for analysis 11 of 12
#> Getting prevalence for analysis 12 of 12
#> Time taken: 0 mins and 7 secs
#>
#> • Taking 1000 person sample of cohorts
#> • Generating a age and sex matched cohorts
#> Starting matching
#> Warning: Multiple records per person detected. The matchCohorts() function is designed
#> to operate under the assumption that there is only one record per person within
#> each cohort. If this assumption is not met, each record will be treated
#> independently. As a result, the same individual may be matched multiple times,
#> leading to inconsistent and potentially misleading results.
#> ℹ Creating copy of target cohort.
#> • 2 cohorts to be matched.
#> ℹ Creating controls cohorts.
#> ℹ Excluding cases from controls
#> • Matching by gender_concept_id and year_of_birth
#> • Removing controls that were not in observation at index date
#> • Excluding target records whose pair is not in observation
#> • Adjusting ratio
#> Binding cohorts
#> ✔ Done
#> ℹ adding demographics columns
#> ℹ adding tableIntersectCount 1/1
#> ℹ summarising data
#> ✔ summariseCharacteristics finished!
#> • Running large scale characterisation
#> ℹ Summarising large scale characteristics
#>
#> - getting characteristics from table condition_occurrence (1 of 6)
#> - getting characteristics from table visit_occurrence (2 of 6)
#> - getting characteristics from table measurement (3 of 6)
#> - getting characteristics from table procedure_occurrence (4 of 6)
#> - getting characteristics from table observation (5 of 6)
#> - getting characteristics from table drug_exposure (6 of 6)
#>
#> # A tibble: 21,598 × 13
#> result_id cdm_name group_name group_level strata_name strata_level
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 mock database overall overall overall overall
#> 2 1 mock database overall overall overall overall
#> 3 1 mock database overall overall overall overall
#> 4 1 mock database overall overall overall overall
#> 5 1 mock database overall overall overall overall
#> 6 1 mock database overall overall overall overall
#> 7 1 mock database overall overall overall overall
#> 8 1 mock database overall overall overall overall
#> 9 1 mock database overall overall overall overall
#> 10 1 mock database overall overall overall overall
#> # ℹ 21,588 more rows
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> # additional_name <chr>, additional_level <chr>
CDMConnector::cdm_disconnect(cdm)
# }
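Each diagnostic is switched on or off by its own logical argument, so a single component can be run in isolation. The following is a minimal sketch, not part of the original example: it rebuilds the same mock cdm and runs only the cohort diagnostics. It assumes the function is exported by a package named PhenotypeR, that the result carries the `summarised_result` class, and that `CDMConnector::cdmDisconnect()` is available in the installed CDMConnector version.

```r
# Rebuild the mock cdm used in the example above.
cdm_local <- omock::mockCdmReference() |>
  omock::mockPerson(nPerson = 100) |>
  omock::mockObservationPeriod() |>
  omock::mockConditionOccurrence() |>
  omock::mockDrugExposure() |>
  omock::mockObservation() |>
  omock::mockMeasurement() |>
  omock::mockCohort(name = "my_cohort", numberCohorts = 2)
cdm_local$visit_occurrence <- dplyr::tibble(
  person_id = 1L,
  visit_occurrence_id = 1L,
  visit_concept_id = 1L,
  visit_start_date = as.Date("2000-01-01"),
  visit_end_date = as.Date("2000-01-01"),
  visit_type_concept_id = 1L
)
cdm_local$procedure_occurrence <- dplyr::tibble(
  person_id = 1L,
  procedure_occurrence_id = 1L,
  procedure_concept_id = 1L,
  procedure_date = as.Date("2000-01-01"),
  procedure_type_concept_id = 1L
)
db <- DBI::dbConnect(duckdb::duckdb())
cdm <- CDMConnector::copyCdmTo(
  con = db,
  cdm = cdm_local,
  schema = "main",
  overwrite = TRUE
)

# Run only the cohort diagnostics; every other component is disabled.
# Assumption: the package namespace is PhenotypeR.
result <- PhenotypeR::phenotypeDiagnostics(
  cohort = cdm$my_cohort,
  databaseDiagnostics = FALSE,
  codelistDiagnostics = FALSE,
  cohortDiagnostics = TRUE,
  populationDiagnostics = FALSE,
  matchedDiagnostics = FALSE
)

# Assumption: cdmDisconnect() exists in the installed CDMConnector version.
CDMConnector::cdmDisconnect(cdm)
```

Skipping the population and matched diagnostics avoids the incidence, prevalence, and matching steps, which account for most of the running time in the log above.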