
Runs all the diagnostics offered by this package:

* Database diagnostics via `databaseDiagnostics`.
* Diagnostics on the cohort_codelist attribute of the cohort via `codelistDiagnostics`.
* Cohort diagnostics via `cohortDiagnostics`.
* Population diagnostics via `populationDiagnostics`.
* Matched cohort diagnostics via `matchedDiagnostics`.

Usage

phenotypeDiagnostics(
  cohort,
  databaseDiagnostics = TRUE,
  codelistDiagnostics = TRUE,
  cohortDiagnostics = TRUE,
  populationDiagnostics = TRUE,
  populationSample = 1e+06,
  matchedDiagnostics = TRUE,
  matchedSample = 1000
)

Arguments

cohort

Cohort table in a cdm reference.

databaseDiagnostics

If TRUE, database diagnostics will be run.

codelistDiagnostics

If TRUE, codelist diagnostics will be run.

cohortDiagnostics

If TRUE, cohort diagnostics will be run.

populationDiagnostics

If TRUE, population diagnostics will be run.

populationSample

Number of people from the cdm to sample. If NULL, no sampling will be performed.

matchedDiagnostics

If TRUE, matched cohort diagnostics will be run, comparing the cohort to a matched sample of the population.

matchedSample

Number of people to randomly sample from the cohort for matching. If NULL, no sampling will be performed.
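
Because each diagnostic is switched on or off by its own logical argument, a call can be limited to a subset of them. The following is a minimal sketch, assuming the package is attached so that `phenotypeDiagnostics()` is available; the wrapper name `runCohortOnlyDiagnostics` is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): run only the codelist and
# cohort diagnostics and skip the database, population and matched steps.
# Assumes phenotypeDiagnostics() is available because the package is attached.
runCohortOnlyDiagnostics <- function(cohort) {
  phenotypeDiagnostics(
    cohort,
    databaseDiagnostics = FALSE,
    codelistDiagnostics = TRUE,
    cohortDiagnostics = TRUE,
    populationDiagnostics = FALSE,
    matchedDiagnostics = FALSE
  )
}
```

With a cohort table such as `cdm$my_cohort` from the Examples, this would be called as `runCohortOnlyDiagnostics(cdm$my_cohort)`. The sampling arguments are left at their defaults here because they only apply to the diagnostics that are switched off.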

Value

A summarised result
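
The returned object prints as a tibble with columns such as `result_id`, `cdm_name` and `variable_name` (see the output in the Examples). As a rough sketch of a first look at it, the helper below counts rows per `result_id` using base R only; the name `countRowsByResultId` is hypothetical, and it assumes only that the input has a `result_id` column.

```r
# Hypothetical helper: count how many rows each result_id contributes.
# Works on any data frame or tibble that has a `result_id` column.
countRowsByResultId <- function(result) {
  stopifnot("result_id" %in% names(result))
  counts <- table(result$result_id)
  data.frame(
    result_id = as.integer(names(counts)),
    n_rows = as.integer(counts)
  )
}
```

For example, after `result <- phenotypeDiagnostics(cdm$my_cohort)`, calling `countRowsByResultId(result)` gives one row per `result_id`. To see which analysis each `result_id` corresponds to, `omopgenerics::settings(result)` is likely the more informative next step, assuming the omopgenerics package is installed.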

Examples

# \donttest{
  cdm_local <- omock::mockCdmReference() |>
    omock::mockPerson(nPerson = 100) |>
    omock::mockObservationPeriod() |>
    omock::mockConditionOccurrence() |>
    omock::mockDrugExposure() |>
    omock::mockObservation() |>
    omock::mockMeasurement() |>
    omock::mockCohort(name = "my_cohort",
                      numberCohorts = 2)
  cdm_local$visit_occurrence <- dplyr::tibble(
    person_id = 1L,
    visit_occurrence_id = 1L,
    visit_concept_id = 1L,
    visit_start_date = as.Date("2000-01-01"),
    visit_end_date = as.Date("2000-01-01"),
    visit_type_concept_id = 1L)
  cdm_local$procedure_occurrence <- dplyr::tibble(
    person_id = 1L,
    procedure_occurrence_id = 1L,
    procedure_concept_id = 1L,
    procedure_date = as.Date("2000-01-01"),
    procedure_type_concept_id = 1L)

  db <- DBI::dbConnect(duckdb::duckdb())
  cdm <- CDMConnector::copyCdmTo(con = db,
                                 cdm = cdm_local,
                                 schema = "main",
                                 overwrite = TRUE)
  phenotypeDiagnostics(cdm$my_cohort)
#> 
#> 
#>  Getting codelists from cohorts
#> Warning: No codelists found for the specified cohorts
#> Warning: No codelists found for the specified cohorts
#> Warning: Empty cohort_codelist attribute for cohort
#>  Returning an empty summarised result
#> 
#>  Getting cohort summary
#>  adding demographics columns
#>  adding tableIntersectCount 1/1
#>  summarising data
#>  summariseCharacteristics finished!
#>  Getting age density
#>  Getting cohort attrition
#>  Getting cohort overlap
#>  Getting cohort timing
#>  The following estimates will be computed:
#>  days_between_cohort_entries: density
#> → Start summary of data, at 2024-11-21 11:01:21.093467
#>  Summary finished, at 2024-11-21 11:01:21.167939
#> 
#>  Creating denominator for incidence and prevalence
#>  Sampling person table to 1e+06
#>  Creating denominator cohorts
#>  Cohorts created in 0 min and 7 sec
#>  Estimating incidence
#> Getting incidence for analysis 1 of 24
#> Getting incidence for analysis 2 of 24
#> Getting incidence for analysis 3 of 24
#> Getting incidence for analysis 4 of 24
#> Getting incidence for analysis 5 of 24
#> Getting incidence for analysis 6 of 24
#> Getting incidence for analysis 7 of 24
#> Getting incidence for analysis 8 of 24
#> Getting incidence for analysis 9 of 24
#> Getting incidence for analysis 10 of 24
#> Getting incidence for analysis 11 of 24
#> Getting incidence for analysis 12 of 24
#> Getting incidence for analysis 13 of 24
#> Getting incidence for analysis 14 of 24
#> Getting incidence for analysis 15 of 24
#> Getting incidence for analysis 16 of 24
#> Getting incidence for analysis 17 of 24
#> Getting incidence for analysis 18 of 24
#> Getting incidence for analysis 19 of 24
#> Getting incidence for analysis 20 of 24
#> Getting incidence for analysis 21 of 24
#> Getting incidence for analysis 22 of 24
#> Getting incidence for analysis 23 of 24
#> Getting incidence for analysis 24 of 24
#> Overall time taken: 0 mins and 18 secs
#>  Estimating prevalence
#> Getting prevalence for analysis 1 of 12
#> Getting prevalence for analysis 2 of 12
#> Getting prevalence for analysis 3 of 12
#> Getting prevalence for analysis 4 of 12
#> Getting prevalence for analysis 5 of 12
#> Getting prevalence for analysis 6 of 12
#> Getting prevalence for analysis 7 of 12
#> Getting prevalence for analysis 8 of 12
#> Getting prevalence for analysis 9 of 12
#> Getting prevalence for analysis 10 of 12
#> Getting prevalence for analysis 11 of 12
#> Getting prevalence for analysis 12 of 12
#> Time taken: 0 mins and 7 secs
#> 
#>  Taking 1000 person sample of cohorts
#>  Generating a age and sex matched cohorts
#> Starting matching
#> Warning: Multiple records per person detected. The matchCohorts() function is designed
#> to operate under the assumption that there is only one record per person within
#> each cohort. If this assumption is not met, each record will be treated
#> independently. As a result, the same individual may be matched multiple times,
#> leading to inconsistent and potentially misleading results.
#>  Creating copy of target cohort.
#>  2 cohorts to be matched.
#>  Creating controls cohorts.
#>  Excluding cases from controls
#>  Matching by gender_concept_id and year_of_birth
#>  Removing controls that were not in observation at index date
#>  Excluding target records whose pair is not in observation
#>  Adjusting ratio
#> Binding cohorts
#>  Done
#>  adding demographics columns
#>  adding tableIntersectCount 1/1
#>  summarising data
#>  summariseCharacteristics finished!
#>  Running large scale characterisation
#>  Summarising large scale characteristics 
#> 
#>  - getting characteristics from table condition_occurrence (1 of 6)
#>  - getting characteristics from table visit_occurrence (2 of 6)
#>  - getting characteristics from table measurement (3 of 6)
#>  - getting characteristics from table procedure_occurrence (4 of 6)
#>  - getting characteristics from table observation (5 of 6)
#>  - getting characteristics from table drug_exposure (6 of 6)
#> 
#> # A tibble: 21,598 × 13
#>    result_id cdm_name      group_name group_level strata_name strata_level
#>        <int> <chr>         <chr>      <chr>       <chr>       <chr>       
#>  1         1 mock database overall    overall     overall     overall     
#>  2         1 mock database overall    overall     overall     overall     
#>  3         1 mock database overall    overall     overall     overall     
#>  4         1 mock database overall    overall     overall     overall     
#>  5         1 mock database overall    overall     overall     overall     
#>  6         1 mock database overall    overall     overall     overall     
#>  7         1 mock database overall    overall     overall     overall     
#>  8         1 mock database overall    overall     overall     overall     
#>  9         1 mock database overall    overall     overall     overall     
#> 10         1 mock database overall    overall     overall     overall     
#> # ℹ 21,588 more rows
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> #   estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> #   additional_name <chr>, additional_level <chr>
  CDMConnector::cdm_disconnect(cdm)
# }