
Summarise Database Characteristics for OMOP CDM
Source:R/databaseCharacteristics.R
Usage
databaseCharacteristics(
  cdm,
  omopTableName = c("visit_occurrence", "visit_detail", "condition_occurrence",
    "drug_exposure", "procedure_occurrence", "device_exposure", "measurement",
    "observation", "death"),
  sample = NULL,
  sex = FALSE,
  ageGroup = NULL,
  dateRange = NULL,
  interval = "overall",
  conceptIdCounts = FALSE,
  ...
)
Arguments
- cdm
A cdm_reference object. Use CDMConnector to create a reference to a database or omock to create a reference to synthetic data.
- omopTableName
A character vector of the names of the tables to summarise in the cdm object. Run clinicalTables() to check the available options.
- sample
Either an integer or a character string.
If an integer (n > 0), the function first samples n distinct person_ids from the person table and then subsets the input tables to those subjects.
If a character string, it must be the name of a cohort in the cdm; in this case, the input tables are subset to the subjects (subject_id) belonging to that cohort.
Use NULL to disable subsetting (default value).
- sex
Logical; whether to stratify results by sex (TRUE) or not (FALSE).
- ageGroup
A list of age groups to stratify the results by. Each element represents a specific age range. You can give them specific names, e.g. ageGroup = list(children = c(0, 17), adult = c(18, Inf)).
- dateRange
A vector of two dates defining the desired study period. Only the start_date column of the OMOP table is checked to ensure it falls within this range. If dateRange is NULL, no restriction is applied.
- interval
Time interval to stratify by. One of "years", "quarters", "months" or "overall".
- conceptIdCounts
Logical; whether to summarise concept ID counts (TRUE) or not (FALSE).
- ...
Additional arguments passed to the OmopSketch functions that are used internally.
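The subsetting rules described for sample and dateRange can be illustrated on toy data. The sketch below is base R only and is not how OmopSketch implements them internally: the two data frames are made-up stand-ins for the person and drug_exposure tables, the dates are arbitrary, and treating both dateRange bounds as inclusive is an assumption.

```r
# Toy stand-ins for the person and drug_exposure tables (made-up values).
person <- data.frame(person_id = 1:10)
drug_exposure <- data.frame(
  person_id = c(1, 1, 2, 5, 7, 7, 9),
  drug_exposure_start_date = as.Date(c(
    "2009-12-20", "2010-03-01", "2012-06-15", "2015-01-01",
    "2019-12-30", "2020-02-01", "2011-11-11"
  ))
)

# sample = n: draw n distinct person_ids, then keep only their records.
set.seed(1)
n <- 3
sampledIds <- sample(unique(person$person_id), size = n)
sampledRecords <- drug_exposure[drug_exposure$person_id %in% sampledIds, ]

# dateRange: only the start date is compared against the study period.
# Inclusive bounds are an assumption made for this illustration.
dateRange <- as.Date(c("2010-01-01", "2019-12-31"))
inRange <- drug_exposure$drug_exposure_start_date >= dateRange[1] &
  drug_exposure$drug_exposure_start_date <= dateRange[2]
recordsInRange <- drug_exposure[inRange, ]
```

Because only the start date is checked, a record that starts inside the study period is kept whatever its end date, while a record that starts before the period is dropped even if it is still ongoing when the period begins.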
Examples
# \donttest{
library(OmopSketch)
library(omock)
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
library(here)
#> here() starts at /home/runner/work/OmopSketch/OmopSketch
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
#> ℹ Reading GiBleed tables.
#> ℹ Adding drug_strength table.
#> ℹ Creating local <cdm_reference> object.
#> ℹ Inserting <cdm_reference> into duckdb.
result <- databaseCharacteristics(
  cdm = cdm,
  sample = 100,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  sex = TRUE,
  ageGroup = list(c(0, 50), c(51, 100)),
  interval = "years",
  conceptIdCounts = FALSE
)
#> The characterisation will focus on the following OMOP tables: drug_exposure and
#> condition_occurrence
#> The cdm is sampled to 100
#> → Getting cdm snapshot
#> → Getting population characteristics
#> ℹ Building new trimmed cohort
#> Adding demographics information
#> Creating initial cohort
#> Trim sex
#> ✔ Cohort trimmed
#> ℹ Building new trimmed cohort
#> Adding demographics information
#> Creating initial cohort
#> Trim sex
#> Trim age
#> ✔ Cohort trimmed
#> ℹ adding demographics columns
#> ℹ summarising data
#> ℹ summarising cohort general_population
#> ℹ summarising cohort age_group_0_50
#> ℹ summarising cohort age_group_51_100
#> ✔ summariseCharacteristics finished!
#> → Summarising person table
#> → Summarising clinical records
#> ℹ Adding variables of interest to drug_exposure.
#> ℹ Summarising records per person in drug_exposure.
#> ℹ Summarising subjects not in person table in drug_exposure.
#> ℹ Summarising records in observation in drug_exposure.
#> ℹ Summarising records with start before birth date in drug_exposure.
#> ℹ Summarising records with end date before start date in drug_exposure.
#> ℹ Summarising domains in drug_exposure.
#> ℹ Summarising standard concepts in drug_exposure.
#> ℹ Summarising source vocabularies in drug_exposure.
#> ℹ Summarising concept types in drug_exposure.
#> ℹ Summarising concept class in drug_exposure.
#> ℹ Summarising missing data in drug_exposure.
#> ℹ Adding variables of interest to condition_occurrence.
#> ℹ Summarising records per person in condition_occurrence.
#> ℹ Summarising subjects not in person table in condition_occurrence.
#> ℹ Summarising records in observation in condition_occurrence.
#> ℹ Summarising records with start before birth date in condition_occurrence.
#> ℹ Summarising records with end date before start date in condition_occurrence.
#> ℹ Summarising domains in condition_occurrence.
#> ℹ Summarising standard concepts in condition_occurrence.
#> ℹ Summarising source vocabularies in condition_occurrence.
#> ℹ Summarising concept types in condition_occurrence.
#> ℹ Summarising missing data in condition_occurrence.
#> → Summarising observation period
#> → Summarising trends: records, subjects, person-days, age and sex
#> → The number of person-days is not computed for event tables
#> ☺ Database characterisation finished. Code ran in 1 min and 4 sec
#> ℹ 1 table created: "person_sample".
result |>
  glimpse()
#> Rows: 73,945
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2,…
#> $ cdm_name <chr> "GiBleed", "GiBleed", "GiBleed", "GiBleed", "GiBleed"…
#> $ group_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ group_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "general", "general", "general", "cdm", "cdm", "cdm",…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "snapshot_date", "person_count", "vocabulary_version"…
#> $ estimate_type <chr> "date", "integer", "character", "character", "charact…
#> $ estimate_value <chr> "2025-11-21", "100", "v5.0 18-JAN-19", "Synthea synth…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
shinyCharacteristics(result = result, directory = here())
#> ℹ Creating shiny from provided results.
#> Warning: ! 2 packages are not installed: plotly and shinycssloaders.
cdmDisconnect(cdm = cdm)
# }
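As the glimpse() output above shows, every estimate is stored as text in estimate_value, with its intended type recorded in estimate_type. The following is a minimal base R sketch of reading individual estimates back into typed values. The toy data frame is a hand-built stand-in that copies only four columns and the first three rows from that output; it is not a real result object.

```r
# Hand-built stand-in for the first three rows of the result shown above.
toy <- data.frame(
  variable_name  = c("general", "general", "general"),
  estimate_name  = c("snapshot_date", "person_count", "vocabulary_version"),
  estimate_type  = c("date", "integer", "character"),
  estimate_value = c("2025-11-21", "100", "v5.0 18-JAN-19"),
  stringsAsFactors = FALSE
)

# Pick one estimate by name and convert it according to its estimate_type.
personCount <- as.integer(
  toy$estimate_value[toy$estimate_name == "person_count"]
)
snapshotDate <- as.Date(
  toy$estimate_value[toy$estimate_name == "snapshot_date"]
)
```

The same pattern should carry over to the full result, where a filter on variable_name, strata_name or group_level would normally come first to narrow the rows down to a single estimate.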