
Summarise database characteristics
Source: vignettes/database_characteristics.Rmd
Introduction
In this vignette, we explore how the OmopSketch functions
databaseCharacteristics() and
shinyCharacteristics() can serve as valuable tools for
characterising databases containing electronic health records mapped to
the OMOP Common Data Model.
Create a mock CDM
We begin by loading the necessary packages and creating a mock CDM
using the mockOmopSketch() function:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(OmopSketch)
cdm <- mockOmopSketch()
cdm
#>
#> ── # OMOP CDM reference (duckdb) of mockOmopSketch ─────────────────────────────
#> • omop tables: cdm_source, concept, concept_ancestor, concept_relationship,
#> concept_synonym, condition_occurrence, death, device_exposure, drug_exposure,
#> drug_strength, measurement, observation, observation_period, person,
#> procedure_occurrence, visit_occurrence, vocabulary
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -
Summarise database characteristics
The databaseCharacteristics() function provides a
comprehensive summary of the CDM, returning a summarised
result that includes:
- A general database snapshot, using summariseOmopSnapshot()
- A characterisation of the population in observation, built using the CohortConstructor and CohortCharacteristics packages
- A summary of the observation period table using summariseObservationPeriod() and summariseInObservation()
- A data quality assessment of the clinical tables using summariseMissingData()
- A characterisation of the clinical tables with summariseClinicalRecords() and summariseRecordCount()
result <- databaseCharacteristics(cdm)
Selecting tables to characterise
By default, the following OMOP tables are included in the characterisation: person, observation_period, visit_occurrence, condition_occurrence, drug_exposure, procedure_occurrence, device_exposure, measurement, observation, death.
You can customise which tables to include in the analysis by
specifying them with the omopTableName argument.
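Before restricting the characterisation, it can help to check that the tables you plan to request are present in the CDM reference. The following is a minimal base R sketch under stated assumptions: selectTables() is a hypothetical helper introduced here (it is not part of OmopSketch), and availableTables is a stand-in for the table names you would normally obtain with names(cdm).

```r
# Hypothetical helper (not part of OmopSketch): keep only the requested
# tables that exist in the CDM, and warn about the ones that do not.
selectTables <- function(requested, available) {
  missingTables <- setdiff(requested, available)
  if (length(missingTables) > 0) {
    warning("Tables not found in the CDM: ", paste(missingTables, collapse = ", "))
  }
  intersect(requested, available)
}

# Stand-in for names(cdm); with a real CDM reference you would use names(cdm).
availableTables <- c(
  "person", "observation_period", "drug_exposure", "condition_occurrence"
)

tablesToUse <- selectTables(
  requested = c("drug_exposure", "condition_occurrence"),
  available = availableTables
)
tablesToUse
```

The resulting character vector can then be supplied to the omopTableName argument.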
result <- databaseCharacteristics(cdm, omopTableName = c("drug_exposure", "condition_occurrence"))
Stratifying by Sex
To stratify the characterisation results by sex, set the
sex argument to TRUE:
result <- databaseCharacteristics(cdm,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  sex = TRUE
)
Stratifying by Age Group
To stratify the characterisation by age group, pass the ageGroup argument a list in which each element gives the lower and upper bound of one group.
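Writing the age group list by hand becomes error-prone with many groups, since adjacent ranges must not overlap. The sketch below builds such a list from the lower bounds alone. It is an illustration only: makeAgeGroups() and its maxAge parameter are hypothetical names introduced here, not part of OmopSketch, and it assumes that both bounds of each range are inclusive, as the c(0, 50), c(51, 100) pattern used in this vignette suggests.

```r
# Hypothetical helper (not part of OmopSketch): turn lower bounds into a list
# of contiguous, non-overlapping c(lower, upper) age ranges.
# Assumption: both bounds of each range are inclusive.
makeAgeGroups <- function(lowerBounds, maxAge = 150) {
  lowerBounds <- sort(unique(lowerBounds))
  stopifnot(all(lowerBounds >= 0), max(lowerBounds) <= maxAge)
  # Each group ends one year before the next one starts; the last ends at maxAge.
  upperBounds <- c(lowerBounds[-1] - 1, maxAge)
  Map(function(lower, upper) c(lower, upper), lowerBounds, upperBounds)
}

# Reproduces the two groups used in this vignette: 0 to 50 and 51 to 100.
ageGroups <- makeAgeGroups(c(0, 51), maxAge = 100)
str(ageGroups)
```

The resulting list can be passed directly as the ageGroup argument.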
result <- databaseCharacteristics(cdm,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  ageGroup = list(c(0, 50), c(51, 100))
)
Filtering by date range and time interval
Use the dateRange argument to limit the analysis to a
specific period. Combine it with the interval argument to
stratify results by time. Valid values for interval include “overall”
(default), “years”, “quarters”, and “months”:
result <- databaseCharacteristics(cdm,
  interval = "years",
  dateRange = as.Date(c("2010-01-01", "2018-12-31"))
)
Including Concept Counts
To include concept counts in the characterisation, set
conceptIdCounts = TRUE:
result <- databaseCharacteristics(cdm,
  conceptIdCounts = TRUE
)
Visualise the characterisation results
To explore the characterisation results interactively, you can use
the shinyCharacteristics() function. This function
generates a Shiny application in the specified directory,
allowing you to browse, filter, and visualise the results through an
intuitive user interface.
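Because the app is written to disk, it can be convenient to make sure the target directory exists before generating it. The following base R sketch is illustrative only: shinyDirectory and the folder name are hypothetical, a temporary folder stands in for your own project path, and this vignette does not say whether shinyCharacteristics() creates a missing directory itself.

```r
# Illustrative only: a temporary folder stands in for your own project path.
shinyDirectory <- file.path(tempdir(), "characterisation_app")

# Create the folder (and any missing parent folders) if it is not there yet.
if (!dir.exists(shinyDirectory)) {
  dir.create(shinyDirectory, recursive = TRUE)
}

dir.exists(shinyDirectory)
```

The resulting path can then be supplied as the directory argument.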
shinyCharacteristics(result = result, directory = "path/to/your/shiny")
Customise the Shiny App
You can customise the title, logo, and theme of the Shiny app by setting the following arguments:
- title: The title displayed at the top of the app
- logo: Path to a custom logo (must be in SVG format)
- theme: A custom Bootstrap theme (e.g., using bslib::bs_theme())
shinyCharacteristics(
  result = result, directory = "path/to/my/shiny",
  title = "Characterisation of my data",
  logo = "path/to/my/logo.svg",
  theme = "bslib::bs_theme(bootswatch = 'flatly')"
)
An example of the Shiny application generated by
shinyCharacteristics() can be explored here,
where the characterisation of several synthetic datasets is
available.