Summarise missing data in OMOP tables
Usage
summariseMissingData(
  cdm,
  omopTableName,
  col = NULL,
  sex = FALSE,
  interval = "overall",
  ageGroup = NULL,
  sample = 1e+05,
  dateRange = NULL,
  year = lifecycle::deprecated()
)
Arguments
- cdm
A cdm_reference object. Use CDMConnector to create a reference to a database or omock to create a reference to synthetic data.
- omopTableName
A character vector of the names of the tables to summarise in the cdm object. Run clinicalTables() to check the available options.
- col
A character vector of column names to check for missing values. If NULL, all columns in the specified tables are checked. Default is NULL.
- sex
Logical; whether to stratify results by sex (TRUE) or not (FALSE).
- interval
Time interval to stratify by. One of "years", "quarters", "months" or "overall".
- ageGroup
A list of age groups to stratify the results by. Each element represents a specific age range. You can give them specific names, e.g. ageGroup = list(children = c(0, 17), adult = c(18, Inf)).
- sample
Either an integer or a character string.
If an integer (n > 0), the function first samples n distinct person_ids from the person table and then subsets the input tables to those subjects.
If a character string, it must be the name of a cohort in the cdm; in this case, the input tables are subset to the subjects (subject_id) belonging to that cohort.
Use NULL to disable subsetting. The default is 1e+05, as shown in Usage.
- dateRange
A vector of two dates defining the desired study period. Only the start_date column of the OMOP table is checked to ensure it falls within this range. If dateRange is NULL, no restriction is applied.
- year
Deprecated.
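The stratification arguments above can be combined in a single call. The following is a minimal sketch, assuming the GiBleed mock dataset used in the Examples section; the date range is an illustrative assumption, not a recommended study period. The call is guarded so that it only runs where the required packages are installed.

```r
# Stratification settings; the dates are illustrative assumptions.
ageGroup <- list(children = c(0, 17), adult = c(18, Inf))
dateRange <- as.Date(c("2000-01-01", "2019-12-31"))

# Run only where the packages used in the Examples section are installed.
pkgs <- c("OmopSketch", "omock", "duckdb")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  library(OmopSketch)
  library(omock)
  cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
  result <- summariseMissingData(
    cdm = cdm,
    omopTableName = "condition_occurrence",
    sex = TRUE,            # stratify by sex
    interval = "years",    # stratify by calendar year
    ageGroup = ageGroup,   # stratify by the named age ranges
    dateRange = dateRange, # restrict records by their start date
    sample = NULL          # no subsetting of subjects
  )
  cdmDisconnect(cdm = cdm)
}
```

Setting sample = NULL here keeps every subject; on a large database, an integer sample or a cohort name may be preferable.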
Examples
# \donttest{
library(OmopSketch)
library(omock)
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
#> ℹ Reading GiBleed tables.
#> ℹ Adding drug_strength table.
#> ℹ Creating local <cdm_reference> object.
#> ℹ Inserting <cdm_reference> into duckdb.
result <- summariseMissingData(
  cdm = cdm,
  omopTableName = c("condition_occurrence", "visit_occurrence"),
  sample = 10000
)
#> The person table has ≤ 10000 subjects; skipping sampling of the CDM.
tableMissingData(result = result)
#> Table: "Summary of missingness in condition_occurrence, visit_occurrence tables" (sections: visit_occurrence, condition_occurrence)
cdmDisconnect(cdm = cdm)
# }
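For intuition about what a missingness summary reports, the sketch below counts NA values per column of a small made-up table in base R. It is an illustration only and is not OmopSketch's implementation: the toy data, the variable names, and the output column names (na_count, na_percentage) are all assumptions.

```r
# Toy table standing in for an OMOP table (values are made up).
toy <- data.frame(
  person_id = c(1L, 2L, 3L, 4L),
  condition_end_date = as.Date(c("2010-01-05", NA, "2012-03-01", NA)),
  visit_occurrence_id = c(10L, 11L, NA, 13L)
)

# Count NA values per column and express them as a percentage of rows.
naCount <- vapply(toy, function(x) sum(is.na(x)), integer(1))
missing <- data.frame(
  column = names(naCount),
  na_count = unname(naCount),
  na_percentage = 100 * unname(naCount) / nrow(toy)
)
print(missing)
```

Restricting the columns, as the col argument does, would correspond to subsetting toy before counting, for example toy[, "condition_end_date", drop = FALSE].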
