Introduction
This document describes the data model of the output of the
SelfControlledCaseSeries (SCCS) package, as generated by the
exportToCsv() function. This vignette assumes you are already familiar
with the SelfControlledCaseSeries package and have read all other
vignettes.
Exposures, covariates of interest, and controls
As described in the ‘Single studies using the SelfControlledCaseSeries
package’ vignette, eras are cohorts or drug eras extracted from the
database. Covariates can either be splines, for example representing age
or season, or era covariates, derived from eras. When defining covariates
using the createEraCovariateSettings() function, we can either use
verbatim era IDs (e.g. cohort IDs) or reference a variable (typically
called ‘exposureId’). When defining exposures using the exposure()
function, we can assign different era IDs to this variable, thereby
reusing the same analysis settings for different exposures and outcomes.
For each exposure we can set the trueEffectSize if known. Any exposure
with a known true effect size is considered a control, and will be used
for empirical calibration. Some of our covariates can be marked as
covariates of interest by setting exposureOfInterest = TRUE when calling
createEraCovariateSettings(). This is especially relevant for the results
model, since these covariates will be reported in the sccs_result table.
Exposures-outcome-sets, analysis IDs and models
Using the createExposuresOutcome() function we can define an outcome with
one or more exposures, since an SCCS model can have multiple exposures
(e.g. we could have separate exposures for the first and second dose of a
vaccine). With the createSccsAnalysis() function we can create a set of
analysis settings describing which data to extract from the database, how
to transform that data (including which covariates to construct), and how
to fit the SCCS model. Each analysis setting has a unique analysis ID.
Each combination of an exposures-outcome-set and an analysis setting
corresponds to one SCCS model. A model can have multiple covariates, and
each covariate can be based on multiple eras.
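The pairing of exposures-outcome-sets with analysis settings can be sketched as a simple cross product. This is only an illustration in Python, not code from the package; the function name `enumerate_models` and the IDs are hypothetical.

```python
from itertools import product


def enumerate_models(exposures_outcome_set_ids, analysis_ids):
    """Pair every exposures-outcome-set with every analysis setting.

    Each returned dict stands for one SCCS model, identified by the two
    IDs that also appear as key fields in most result tables.
    """
    return [
        {"exposures_outcome_set_id": eos_id, "analysis_id": analysis_id}
        for eos_id, analysis_id in product(exposures_outcome_set_ids, analysis_ids)
    ]


# Hypothetical IDs, for illustration only.
models = enumerate_models([1, 2, 3], [10, 20])
print(len(models))  # 6: three exposures-outcome-sets times two analyses
```

Each model may then contribute several rows to the result tables, one per covariate.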
Fields with minimum values
Some fields contain patient counts or fractions that can easily be
converted to patient counts. To reduce the risk of identifying
individuals, these fields are subject to a minimum value. When the value
falls below this minimum, it is replaced with the negative value of the
minimum. For example, if
the minimum subject count is 5, and the actual count is 2, the value
stored in the data model will be -5, which could be represented as
‘<5’ to the user. Note that the value 0 is permissible, as it
identifies no persons. These fields are identified below as having Min.
count = ‘Yes’.
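The rule above can be sketched in a few lines of Python. This is a minimal illustration for integer counts only, not the package's own implementation; the function names `enforce_min_count` and `format_count` are hypothetical.

```python
def enforce_min_count(count: int, min_count: int) -> int:
    """Apply the minimum-value rule to a single count.

    Zero is kept as-is because it identifies no persons. Any other count
    below the minimum is replaced by the negative of the minimum.
    """
    if 0 < count < min_count:
        return -min_count
    return count


def format_count(value: int) -> str:
    """Render a stored value for display, showing censored values as '<N'."""
    if value < 0:
        return f"<{-value}"
    return str(value)


stored = enforce_min_count(2, 5)
print(stored, format_count(stored))  # -5 <5
```

When reading the exported files, a negative value in a field marked Min. count = ‘Yes’ should therefore be treated as "somewhere between 1 and the minimum", not as a real count.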
Tables
In this section you will find the list of tables and their
fields.
Table sccs_age_spanning
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
age_month | int | Yes | No | Age in months since birth. |
cover_before_after_subjects | int | No | Yes | Number of subjects whose observation period covers this month as well as the one before and after. |
Table sccs_analysis
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A unique identifier for an analysis. |
description | varchar | No | No | A description for an analysis, e.g. ‘Correcting for age and season’. |
definition | varchar | No | No | A JSON object specifying the analysis. |
Table sccs_attrition
Field | Type | Key | Min. count | Description |
sequence_number | int | Yes | No | The place in the sequence of steps defining the final analysis cohort. 1 indicates the original exposed population without any inclusion criteria. |
description | varchar | No | No | A description of the last restriction, e.g. “Removing persons with the outcome prior”. |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
covariate_id | int | Yes | No | A foreign key referencing the sccs_covariate table. The identifier for the covariate of interest. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
outcome_subjects | int | No | Yes | The number of subjects with at least one outcome. |
outcome_events | int | No | Yes | The number of outcome events. |
outcome_observation_periods | int | No | Yes | The number of observation periods containing at least one outcome. |
observed_days | bigint | No | Yes | The number of days subjects were observed. |
Table sccs_calendar_time_spanning
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
calendar_year | int | Yes | No | Calendar year (e.g. 2022). |
calendar_month | int | Yes | No | Calendar month (e.g. 1 is January). |
cover_before_after_subjects | int | No | Yes | Number of subjects whose observation period covers this month as well as the one before and after. |
Table sccs_censor_model
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
parameter_id | int | Yes | No | The parameter number in the censor model (starting at 1). |
parameter_value | float | No | No | The fitted parameter value. |
model_type | varchar | No | No | The type of censor model. Can be ‘Weibull-Age’, ‘Weibull-Interval’, ‘Gamma-Age’, or ‘Gamma-Interval’. |
Table sccs_covariate
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
covariate_id | int | Yes | No | A unique identifier for a covariate. |
covariate_name | varchar | No | No | A description for the covariate. |
era_id | int | No | No | A foreign key referencing the sccs_era table. |
covariate_analysis_id | int | No | No | A foreign key referencing the sccs_covariate_analysis table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
Table sccs_covariate_analysis
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
covariate_analysis_id | int | Yes | No | A unique identifier for a covariate analysis. |
covariate_analysis_name | varchar | No | No | A name for a covariate analysis, e.g. ‘Pre-exposure’. |
variable_of_interest | int | No | No | Is the variable of interest (1 = yes, 0 = no). |
Table sccs_covariate_result
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
covariate_id | int | Yes | No | The identifier for the covariate. |
rr | float | No | No | The estimated relative risk (i.e. the incidence rate ratio). |
ci_95_lb | float | No | No | The lower bound of the 95% confidence interval of the relative risk. |
ci_95_ub | float | No | No | The upper bound of the 95% confidence interval of the relative risk. |
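Because sccs_covariate_result and sccs_covariate share the same four key fields, covariate names can be attached to estimates with a lookup on the composite key. The sketch below uses plain Python dictionaries; the helper names and the example rows are hypothetical, and rows loaded from the exported CSV files would hold strings rather than numbers.

```python
KEY_FIELDS = (
    "analysis_id",
    "exposures_outcome_set_id",
    "covariate_id",
    "database_id",
)


def row_key(row):
    """Build the composite key shared by both tables."""
    return tuple(row[field] for field in KEY_FIELDS)


def add_covariate_names(result_rows, covariate_rows):
    """Return copies of the result rows with covariate_name attached.

    Rows without a matching covariate get covariate_name = None.
    """
    names = {row_key(row): row["covariate_name"] for row in covariate_rows}
    return [dict(row, covariate_name=names.get(row_key(row))) for row in result_rows]


# Hypothetical rows, for illustration only.
covariates = [
    {"analysis_id": 1, "exposures_outcome_set_id": 7, "covariate_id": 1000,
     "database_id": "db1", "covariate_name": "Main exposure"},
]
results = [
    {"analysis_id": 1, "exposures_outcome_set_id": 7, "covariate_id": 1000,
     "database_id": "db1", "rr": 1.5, "ci_95_lb": 1.1, "ci_95_ub": 2.0},
]
joined = add_covariate_names(results, covariates)
print(joined[0]["covariate_name"])  # Main exposure
```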
Table sccs_diagnostics_summary
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
covariate_id | int | Yes | No | The identifier for the covariate of interest. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
mdrr | float | No | No | The minimum detectable relative risk. |
ease | float | No | No | The expected absolute systematic error. |
time_trend_p | float | No | No | The family-wise p-value for whether the monthly outcome rate is equal to the mean. |
pre_exposure_p | float | No | No | One-sided p-value for whether the rate before exposure is higher than after, against the null of no difference. |
mdrr_diagnostic | varchar(20) | No | No | Pass / warning / fail / not evaluated classification of the MDRR diagnostic. |
ease_diagnostic | varchar(20) | No | No | Pass / warning / fail / not evaluated classification of the EASE diagnostic. |
time_trend_diagnostic | varchar(20) | No | No | Pass / warning / fail / not evaluated classification of the time trend (unstable months) diagnostic. |
pre_exposure_diagnostic | varchar(20) | No | No | Pass / warning / fail / not evaluated classification of the pre-exposure diagnostic. |
unblind | int | No | No | Is unblinding the result recommended? (1 = yes, 0 = no) |
Table sccs_era
Field | Type | Key | Min. count | Description |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
analysis_id | int | Yes | No | A unique identifier for an analysis. |
era_type | varchar | Yes | No | The type of era (e.g. ‘rx’ for drugs). |
era_id | int | Yes | No | A unique identifier, corresponding to the ID in the source table (e.g. cohort_definition_id in a cohort table, or the drug_concept_id in the drug_era table). |
era_name | varchar | No | No | A name for the era. Is NULL for eras derived from cohorts. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
Table sccs_event_dep_observation
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
months_to_end | int | Yes | No | Number of months until observation end. |
censored | int | Yes | No | Whether the observation is censored, meaning its end is not equal to the end of database time (1 = censored, 0 = not censored). |
outcomes | int | No | Yes | Number of outcomes observed during the month. |
Table sccs_exposure
Field | Type | Key | Min. count | Description |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
era_id | int | Yes | No | A foreign key referencing the sccs_era table. |
true_effect_size | float | No | No | If known, the true effect size. For negative controls this equals 1. |
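The true_effect_size field is what distinguishes controls from other exposures. A minimal Python sketch of that distinction follows; the function name and the returned labels are hypothetical, and the label for controls with an effect size other than 1 (often called positive controls) is an assumption not stated in this document.

```python
from typing import Optional


def classify_exposure(true_effect_size: Optional[float]) -> str:
    """Classify an exposure by its true_effect_size value.

    None stands for an unknown effect size (an empty field in the export).
    """
    if true_effect_size is None:
        return "not a control"
    if true_effect_size == 1:
        return "negative control"
    # Assumed label: a known effect size other than 1.
    return "control with non-null effect"


print(classify_exposure(1.0))   # negative control
print(classify_exposure(None))  # not a control
```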
Table sccs_exposures_outcome_set
Field | Type | Key | Min. count | Description |
exposures_outcome_set_id | int | Yes | No | A unique identifier for a set of exposures and an outcome. |
outcome_id | int | No | No | A cohort ID. |
Table sccs_likelihood_profile
Field | Type | Key | Min. count | Description |
log_rr | float | Yes | No | The log of the relative risk where the likelihood is sampled. |
log_likelihood | float | No | No | The normalized log likelihood. |
covariate_id | int | Yes | No | The identifier for the covariate of interest. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
Table sccs_result
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
covariate_id | int | Yes | No | A foreign key referencing the sccs_covariate table. The identifier for the covariate of interest. |
rr | float | No | No | The estimated relative risk (i.e. the incidence rate ratio). |
ci_95_lb | float | No | No | The lower bound of the 95% confidence interval of the relative risk. |
ci_95_ub | float | No | No | The upper bound of the 95% confidence interval of the relative risk. |
p | float | No | No | The two-sided p-value considering the null hypothesis of no effect. |
outcome_subjects | int | No | Yes | The number of subjects with at least one outcome. |
outcome_events | int | No | Yes | The number of outcome events. |
outcome_observation_periods | int | No | Yes | The number of observation periods containing at least one outcome. |
covariate_subjects | int | No | Yes | The number of subjects having the covariate. |
covariate_days | int | No | Yes | The total covariate time in days. |
covariate_eras | int | No | Yes | The number of continuous eras of the covariate. |
covariate_outcomes | int | No | Yes | The number of outcomes observed during the covariate time. |
observed_days | bigint | No | Yes | The number of days subjects were observed. |
log_rr | float | No | No | The log of the relative risk. |
se_log_rr | float | No | No | The standard error of the log of the relative risk. |
llr | float | No | No | The log of the likelihood ratio (of the MLE vs the null hypothesis of no effect). |
calibrated_rr | float | No | No | The calibrated relative risk. |
calibrated_ci_95_lb | float | No | No | The lower bound of the calibrated 95% confidence interval of the relative risk. |
calibrated_ci_95_ub | float | No | No | The upper bound of the calibrated 95% confidence interval of the relative risk. |
calibrated_p | float | No | No | The calibrated two-sided p-value. |
calibrated_log_rr | float | No | No | The log of the calibrated relative risk. |
calibrated_se_log_rr | float | No | No | The standard error of the log of the calibrated relative risk. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
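To see how log_rr and se_log_rr relate to rr, the confidence bounds, and p, the following Python sketch derives them under a normal (Wald) approximation. This is an assumption made for illustration: the package may compute its intervals differently (for example from the likelihood profile), so the exported values need not match these exactly. The function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(log_rr: float, se_log_rr: float) -> dict:
    """Derive rr, a 95% interval, and a two-sided p-value.

    Assumes log_rr is approximately normally distributed with standard
    error se_log_rr. This is not necessarily how the package computes
    the exported values.
    """
    if se_log_rr <= 0:
        raise ValueError("se_log_rr must be positive")
    z_975 = 1.959963984540054  # 97.5th percentile of the standard normal
    z = abs(log_rr / se_log_rr)
    return {
        "rr": math.exp(log_rr),
        "ci_95_lb": math.exp(log_rr - z_975 * se_log_rr),
        "ci_95_ub": math.exp(log_rr + z_975 * se_log_rr),
        # Two-sided tail probability: 2 * (1 - Phi(z)) = erfc(z / sqrt(2))
        "p": math.erfc(z / math.sqrt(2)),
    }


summary = wald_summary(math.log(2.0), 0.1)
print(round(summary["rr"], 3))
```

The same relationships would apply to the calibrated_* fields under the same assumption.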
Table sccs_spline
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
spline_type | varchar | Yes | No | Either ‘age’, ‘season’, or ‘calendar time’. |
knot_month | float | Yes | No | Location of the knot. For age, the month since birth. For season, the month of the year. For calendar time, the month since 1-1-1970. |
rr | float | No | No | The estimated relative risk (i.e. the incidence rate ratio). |
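For calendar-time splines, knot_month can be turned back into a calendar year and month. The Python sketch below takes the field description at face value and assumes that month 0 corresponds to January 1970; that zero-based convention is an assumption and should be checked against real output. The function name is hypothetical.

```python
import math
from typing import Tuple


def calendar_knot_to_year_month(knot_month: float) -> Tuple[int, int]:
    """Convert months since 1-1-1970 into (year, month of year).

    Assumes month 0 is January 1970. Fractional knots are floored to the
    month they fall in.
    """
    whole_months = math.floor(knot_month)
    return 1970 + whole_months // 12, whole_months % 12 + 1


print(calendar_knot_to_year_month(624))  # (2022, 1)
```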
Table sccs_time_to_event
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
era_id | int | Yes | No | A foreign key referencing the sccs_era table. The identifier for the era of interest. |
week | int | Yes | No | The number of the week relative to exposure. Week 0 starts on the day of exposure initiation. |
observed_subjects | int | No | Yes | The number of people observed during the week. |
outcomes | int | No | Yes | The number of outcomes observed during the week. |
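The week field can be read as a floor division of the day offset from exposure initiation by 7. The Python sketch below shows this; treating days before exposure as negative weeks (so that the 7 days before initiation form week -1) is an assumption, and the function name is hypothetical.

```python
def week_relative_to_exposure(days_since_exposure_start: int) -> int:
    """Map a day offset to a week number.

    Day 0 is the day of exposure initiation, so days 0 to 6 fall in week 0.
    Floor division sends days -7 to -1 to week -1 (assumed convention).
    """
    return days_since_exposure_start // 7


for day in (-8, -1, 0, 6, 7):
    print(day, week_relative_to_exposure(day))
```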
Table sccs_time_trend
Field | Type | Key | Min. count | Description |
analysis_id | int | Yes | No | A foreign key referencing the sccs_analysis table. |
exposures_outcome_set_id | int | Yes | No | A foreign key referencing the sccs_exposures_outcome_set table. |
database_id | varchar | Yes | No | Foreign key referencing the database. |
calendar_year | int | Yes | No | The calendar year (e.g. 2022). |
calendar_month | int | Yes | No | The calendar month (e.g. 1 for January). |
observed_subjects | int | No | Yes | Number of people observed during the month. |
outcome_rate | float | No | Yes | Number of outcomes divided by the number of subjects. |
adjusted_rate | float | No | Yes | The outcome rate, adjusted for age, season, or calendar time, as specified in the analysis. |
stable | int | No | No | Does the adjusted rate not deviate significantly from the mean? (1 = stable, 0 = unstable) |
p | float | No | No | The two-sided p-value against the null hypothesis that the rate is equal to the mean. |