Step 2. Obtain the sequence ratios
Introduction
In this vignette we explore the functionality and arguments of the
summariseSequenceRatios() function, which generates the sequence ratios
of the sequence symmetry analysis (SSA). Because this function takes the
output of generateSequenceCohortSet() (explained in detail in the
vignette Step 1. Generate a sequence cohort), we pick up where the
previous vignette left off.
Recall that in the previous vignette (Step 1. Generate a sequence
cohort) we generated cdm$aspirin and cdm$acetaminophen. From these two
cohort tables we can generate cdm$intersect like so:
# Generate a sequence cohort
cdm <- generateSequenceCohortSet(
  cdm = cdm,
  indexTable = "aspirin",
  markerTable = "acetaminophen",
  name = "intersect",
  combinationWindow = c(0, Inf)
)
Obtain sequence ratios
The crude and adjusted sequence ratios, together with their
corresponding confidence intervals, can be obtained with the
summariseSequenceRatios() function:
summariseSequenceRatios(
  cohort = cdm$intersect
) |>
  dplyr::glimpse()
#> Rows: 10
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ cdm_name <chr> "Synthea synthetic health database", "Synthea synthet…
#> $ group_name <chr> "index_cohort_name &&& marker_cohort_name", "index_co…
#> $ group_level <chr> "1191_aspirin &&& 161_acetaminophen", "1191_aspirin &…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "crude", "adjusted", "crude", "crude", "adjusted", "a…
#> $ variable_level <chr> "sequence_ratio", "sequence_ratio", "sequence_ratio",…
#> $ estimate_name <chr> "point_estimate", "point_estimate", "lower_CI", "uppe…
#> $ estimate_type <chr> "numeric", "numeric", "numeric", "numeric", "numeric"…
#> $ estimate_value <chr> "1.8108504398827", "1218.23655491138", "1.64970963817…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
The output is in the summarised result format. A later vignette (Step 3. Visualise results) shows how to visualise these results in a more intuitive way.
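Because the result is in long format and estimate_value is stored as character, a little reshaping is needed before the numbers can be used directly. The sketch below works on a hand-built toy table that mimics only three of the columns shown above; the values are hypothetical, and base R is used so that it runs without a database connection.

```r
# Toy stand-in for a summarised result (hypothetical values, not real output)
toy_result <- data.frame(
  variable_name  = c("crude", "crude", "crude",
                     "adjusted", "adjusted", "adjusted"),
  estimate_name  = c("point_estimate", "lower_CI", "upper_CI",
                     "point_estimate", "lower_CI", "upper_CI"),
  estimate_value = c("1.50", "1.20", "1.90", "1.40", "1.10", "1.80"),
  stringsAsFactors = FALSE
)

# estimate_value is character, so convert it before doing arithmetic
toy_result$estimate_value <- as.numeric(toy_result$estimate_value)

# Keep only the point estimates
point_estimates <- toy_result[toy_result$estimate_name == "point_estimate", ]

# One row per variable_name, one column per estimate_name
wide <- reshape(
  toy_result,
  idvar     = "variable_name",
  timevar   = "estimate_name",
  direction = "wide"
)
print(wide)
```

The same filtering could be done with dplyr on the real output; base R is used here only to keep the sketch dependency-free.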
Modify the cohort based on cohort_definition_id
The cohortId argument subsets the cohort table passed to
summariseSequenceRatios(). For example, to include only
cohort_definition_id = 1 from cdm$intersect, one could do the
following:
summariseSequenceRatios(
  cohort = cdm$intersect,
  cohortId = 1
) |>
  dplyr::glimpse()
#> Rows: 10
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ cdm_name <chr> "Synthea synthetic health database", "Synthea synthet…
#> $ group_name <chr> "index_cohort_name &&& marker_cohort_name", "index_co…
#> $ group_level <chr> "1191_aspirin &&& 161_acetaminophen", "1191_aspirin &…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "crude", "adjusted", "crude", "crude", "adjusted", "a…
#> $ variable_level <chr> "sequence_ratio", "sequence_ratio", "sequence_ratio",…
#> $ estimate_name <chr> "point_estimate", "point_estimate", "lower_CI", "uppe…
#> $ estimate_type <chr> "numeric", "numeric", "numeric", "numeric", "numeric"…
#> $ estimate_value <chr> "1.8108504398827", "1218.23655491138", "1.64970963817…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
Of course, in this case the argument changes nothing because every
entry in cdm$intersect has cohort_definition_id = 1.
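The subsetting idea can be sketched on a plain data frame. The helper subset_by_id() and the toy table below are hypothetical and only illustrate why filtering on an identifier shared by every row leaves the table unchanged; they are not the package's implementation.

```r
# Hypothetical cohort table in which every row has cohort_definition_id 1
toy_cohort <- data.frame(
  cohort_definition_id = c(1L, 1L, 1L),
  subject_id           = c(10L, 11L, 12L)
)

# Hypothetical helper: keep rows whose id is in cohortId (NULL keeps all)
subset_by_id <- function(cohort, cohortId = NULL) {
  if (is.null(cohortId)) {
    return(cohort)
  }
  cohort[cohort$cohort_definition_id %in% cohortId, , drop = FALSE]
}

kept_1 <- subset_by_id(toy_cohort, cohortId = 1)
kept_2 <- subset_by_id(toy_cohort, cohortId = 2)

nrow(kept_1)  # same number of rows as toy_cohort
nrow(kept_2)  # no rows, because no entry has id 2
```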
Modify confidenceInterval
By default, the summariseSequenceRatios() function uses a 95%
(two-sided) confidence interval. If another level is desired, for
example a 99% confidence interval, use the confidenceInterval argument:
summariseSequenceRatios(
  cohort = cdm$intersect,
  confidenceInterval = 99
) |>
  dplyr::glimpse()
#> Rows: 10
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ cdm_name <chr> "Synthea synthetic health database", "Synthea synthet…
#> $ group_name <chr> "index_cohort_name &&& marker_cohort_name", "index_co…
#> $ group_level <chr> "1191_aspirin &&& 161_acetaminophen", "1191_aspirin &…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "crude", "adjusted", "crude", "crude", "adjusted", "a…
#> $ variable_level <chr> "sequence_ratio", "sequence_ratio", "sequence_ratio",…
#> $ estimate_name <chr> "point_estimate", "point_estimate", "lower_CI", "uppe…
#> $ estimate_type <chr> "numeric", "numeric", "numeric", "numeric", "numeric"…
#> $ estimate_value <chr> "1.8108504398827", "1218.23655491138", "1.60240541369…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
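To see why the interval bounds move while the point estimates stay the same, here is a minimal sketch of one common way to put an interval around a ratio of two counts: take an exact binomial interval for the proportion of pairs with the index drug first, then map it to a ratio. The counts and the helper ratio_ci() are hypothetical, and this is not necessarily the method the package uses internally.

```r
# Hypothetical counts: index drug first vs marker drug first
index_first  <- 120
marker_first <- 80

# Hypothetical helper: binomial-based interval for index_first / marker_first
ratio_ci <- function(index_first, marker_first, confidenceInterval = 95) {
  # Exact binomial interval for the proportion with the index drug first
  p_ci <- binom.test(
    x          = index_first,
    n          = index_first + marker_first,
    conf.level = confidenceInterval / 100
  )$conf.int
  # Map the proportion p to the ratio p / (1 - p)
  ci <- p_ci / (1 - p_ci)
  c(lower_CI = ci[1], upper_CI = ci[2])
}

crude <- index_first / marker_first  # the point estimate does not depend on the level
ci_95 <- ratio_ci(index_first, marker_first, confidenceInterval = 95)
ci_99 <- ratio_ci(index_first, marker_first, confidenceInterval = 99)

# The 99% interval is wider than the 95% interval
rbind(ci_95, ci_99)
```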
Modify movingAverageRestriction
The moving average restriction is needed only for the null sequence
ratio calculation. Please refer to Lai et al. (2017) for more details
on this parameter (parameter d when calculating P on page 578).
Following Tsiropoulos et al. (2009), the movingAverageRestriction
argument defaults to 548 days (18 months).
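The role of the restriction can be sketched with toy data. The function below is a hypothetical, simplified reading of the null sequence ratio as it is commonly stated in the SSA literature: for each day with index initiations, marker initiations in the following d days are weighed against those in the d days on either side, giving a probability P and a null ratio P / (1 - P). The daily counts, the function name and the value of d are assumptions made for illustration; this is not the package's code.

```r
# index_n, marker_n: daily counts of first initiations (same length)
# d: moving average restriction, in days
null_sequence_ratio <- function(index_n, marker_n, d) {
  days <- seq_along(index_n)
  forward  <- 0  # marker starts within d days after an index start
  backward <- 0  # marker starts within d days before an index start
  for (m in days) {
    forward  <- forward  + index_n[m] * sum(marker_n[days > m & days <= m + d])
    backward <- backward + index_n[m] * sum(marker_n[days < m & days >= m - d])
  }
  p <- forward / (forward + backward)
  p / (1 - p)
}

index_n       <- rep(1, 30)  # hypothetical: one index start per day
flat_marker   <- rep(1, 30)  # no prescribing trend
rising_marker <- 1:30        # marker use increasing over time

nsr_flat   <- null_sequence_ratio(index_n, flat_marker, d = 10)
nsr_rising <- null_sequence_ratio(index_n, rising_marker, d = 10)

# The adjusted ratio divides the crude ratio by the null ratio
crude    <- 1.5  # hypothetical crude sequence ratio
adjusted <- crude / nsr_rising
```

With no trend the null ratio is 1 and the adjustment has no effect; with rising marker use the null ratio exceeds 1, so the adjusted ratio is smaller than the crude one.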
Modify minCellCount
By default, the minimum number of events to be reported is 5; results based on fewer events are obscured. Setting minCellCount = 0 reports all results:
summariseSequenceRatios(
  cohort = cdm$intersect,
  minCellCount = 0
) |>
  dplyr::glimpse()
#> Rows: 10
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ cdm_name <chr> "Synthea synthetic health database", "Synthea synthet…
#> $ group_name <chr> "index_cohort_name &&& marker_cohort_name", "index_co…
#> $ group_level <chr> "1191_aspirin &&& 161_acetaminophen", "1191_aspirin &…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "crude", "adjusted", "crude", "crude", "adjusted", "a…
#> $ variable_level <chr> "sequence_ratio", "sequence_ratio", "sequence_ratio",…
#> $ estimate_name <chr> "point_estimate", "point_estimate", "lower_CI", "uppe…
#> $ estimate_type <chr> "numeric", "numeric", "numeric", "numeric", "numeric"…
#> $ estimate_value <chr> "1.8108504398827", "1218.23655491138", "1.64970963817…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
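The effect of the threshold can be sketched on a plain vector of counts. The helper suppress_small_counts() is hypothetical; in particular, leaving zero counts visible and using NA as the obscured value are assumptions made for this illustration rather than a description of what the package does.

```r
# Hypothetical helper: obscure non-zero counts below minCellCount
suppress_small_counts <- function(counts, minCellCount = 5) {
  counts <- as.numeric(counts)
  # Assumption: zeros stay visible, small positive counts become NA
  counts[counts > 0 & counts < minCellCount] <- NA
  counts
}

toy_counts <- c(0, 3, 5, 12)  # hypothetical event counts

default_out <- suppress_small_counts(toy_counts)                    # 3 is obscured
all_out     <- suppress_small_counts(toy_counts, minCellCount = 0)  # nothing is obscured
```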
CDMConnector::cdmDisconnect(cdm = cdm)