Introduction

This vignette demonstrates how to use the table and plotting functions provided by MeasurementDiagnostics to visualise results.

We use the package's mock data so that the examples are fully reproducible.

library(MeasurementDiagnostics)
library(dplyr)
library(omopgenerics) 
library(ggplot2)

cdm <- mockMeasurementDiagnostics()

# Example codelist we'll use in the examples
alkaline_phosphatase_codes <- list("alkaline_phosphatase" = c(3001467L, 45875977L))

Create diagnostics results

We call summariseMeasurementUse() once, supplying custom histogram bins for each numeric variable. The call returns a summarised_result containing all the diagnostic checks: summary estimates, plus density and histogram estimates for visualising the distributions of the numeric variables. All estimates are reported overall for the measurement codelist and stratified by sex.

result <- summariseMeasurementUse(
  cdm = cdm,
  codes = alkaline_phosphatase_codes,
  bySex = TRUE,
  byYear = FALSE,
  byConcept = FALSE,
  histogram = list(
    days_between_measurements = list(
      "0-30" = c(0, 30), "31-90" = c(31, 90), "91-365" = c(91, 365), "366+" = c(366, Inf)
    ),
    measurements_per_subject = list(
      "0" = c(0, 0), "1" = c(1, 1), "2-3" = c(2, 3), "4+" = c(4, 1000)
    ),
    value_as_number = list(
      "low" = c(0, 5.999), "mid" = c(6, 10.999), "high" = c(11, Inf)
    )
  )
)
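The histogram argument above gives, for each variable, a named list of c(lower, upper) ranges. As a rough illustration of how such bands partition values, the base R sketch below labels a few values with a hypothetical helper, assign_band(), which is not part of MeasurementDiagnostics. It assumes both bounds are inclusive; the package's own binning rules may differ.

```r
# Hypothetical helper (not a MeasurementDiagnostics function): label each value
# with the first band whose [lower, upper] range contains it.
# Assumption: both bounds are treated as inclusive.
assign_band <- function(x, bands) {
  labels <- rep(NA_character_, length(x))
  for (name in names(bands)) {
    lower <- bands[[name]][1]
    upper <- bands[[name]][2]
    # Only label values that are not yet assigned and fall inside this band
    hit <- is.na(labels) & !is.na(x) & x >= lower & x <= upper
    labels[hit] <- name
  }
  labels
}

# Same bands as the value_as_number entry in the call above
value_bands <- list("low" = c(0, 5.999), "mid" = c(6, 10.999), "high" = c(11, Inf))

values <- c(5.5, 5.9995, 6, 10.999, 11, 250)
banded <- assign_band(values, value_bands)
print(data.frame(value = values, band = banded))
```

Under this inclusive-bounds assumption, 5.9995 lies between the "low" and "mid" bands and receives no label. For continuous variables it may be worth checking whether gaps between bands are intended.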

Tables

There is one table function for each diagnostic check: tableMeasurementSummary(), tableMeasurementValueAsNumber(), and tableMeasurementValueAsConcept().

You can customise which columns appear in the header, which are used as grouping columns, and which are hidden.

# 1. Measurement summary table (timings / counts)
tableMeasurementSummary(
  result, 
  header = c("codelist_name", "sex"),
  hide = c("cdm_name", "domain_id")
)

Codelist name: alkaline_phosphatase

| Variable name | Variable level | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|
| Number subjects | | N (%) | 67 (67.00%) | 40 (40.00%) | 27 (27.00%) |
| Days between measurements | | Median [Q25 – Q75] | 249 [67 – 645] | 240 [53 – 1,133] | 267 [81 – 415] |
| | | Range | 8 to 2,886 | 8 to 2,886 | 8 to 2,743 |
| Measurements per subject | | Median [Q25 – Q75] | 1.00 [1.00 – 2.00] | 1.00 [1.00 – 2.00] | 1.00 [1.00 – 2.00] |
| | | Range | 1.00 to 4.00 | 1.00 to 4.00 | 1.00 to 3.00 |

# 2. Numeric-value summary table (values recorded as numbers)
tableMeasurementValueAsNumber(result)

Codelist name: alkaline_phosphatase

| CDM name | Unit concept name | Unit concept ID | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|---|
| mock database | kilogram | 9529 | N | 50 | 33 | 17 |
| | | | Median [Q25 – Q75] | 8.77 [7.07 – 10.48] | 8.12 [6.60 – 10.22] | 9.13 [8.26 – 11.17] |
| | | | Q05 – Q95 | 5.70 – 11.84 | 5.58 – 11.60 | 6.64 – 11.83 |
| | | | Q01 – Q99 | 5.43 – 12.11 | 5.41 – 11.99 | 6.08 – 12.11 |
| | | | Range | 5.36 to 12.18 | 5.36 to 12.04 | 5.94 to 12.18 |
| | | | Missing value, N (%) | 2 (4.00%) | 2 (6.06%) | 0 (0.00%) |
| | NA | - | N | 50 | 27 | 23 |
| | | | Median [Q25 – Q75] | 8.77 [7.10 – 10.44] | 8.55 [6.85 – 10.01] | 8.92 [7.39 – 10.88] |
| | | | Q05 – Q95 | 5.77 – 11.77 | 5.75 – 11.32 | 6.18 – 11.80 |
| | | | Q01 – Q99 | 5.50 – 12.04 | 5.61 – 11.94 | 5.59 – 11.93 |
| | | | Range | 5.44 to 12.11 | 5.58 to 12.11 | 5.44 to 11.96 |
| | | | Missing value, N (%) | 3 (6.00%) | 3 (11.11%) | 0 (0.00%) |

# 3. Concept-value summary table (values recorded as concepts)
tableMeasurementValueAsConcept(result)

Codelist name: alkaline_phosphatase

| CDM name | Value as concept name | Value as concept ID | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|---|
| mock database | Low | 4267416 | N (%) | 34 (34.00%) | 16 (26.67%) | 18 (45.00%) |
| | High | 4328749 | N (%) | 33 (33.00%) | 26 (43.33%) | 7 (17.50%) |
| | NA | NA | N (%) | 33 (33.00%) | 18 (30.00%) | 15 (37.50%) |

Plots

The plotting helpers each produce a fixed set of plot types, while leaving you free to choose which variables are used for colouring and facetting and which go on the horizontal and vertical axes. They return ggplot objects, so the plots can be customised further with standard ggplot2 layers.
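As a minimal sketch of that kind of customisation, the code below adds labels and a theme to a stand-in boxplot. The data frame toy_data and the plot p are hypothetical placeholders built only for illustration. In practice p would be the object returned by one of the plotting helpers, and the same `+` layers would apply.

```r
library(ggplot2)

# Toy data for illustration only; not output from MeasurementDiagnostics
toy_data <- data.frame(
  sex = rep(c("Female", "Male"), each = 4),
  days_between_measurements = c(53, 240, 600, 1133, 81, 267, 350, 415)
)

# Stand-in for a plot returned by a plotting helper
p <- ggplot(toy_data, aes(x = sex, y = days_between_measurements, colour = sex)) +
  geom_boxplot()

# Standard ggplot2 layers can be appended to any ggplot object
p_custom <- p +
  labs(title = "Days between measurements", x = NULL, y = "Days") +
  theme_minimal() +
  theme(legend.position = "none")  # applied after theme_minimal() so it is kept

print(p_custom)
```

The legend is removed after the complete theme is applied because theme_minimal() would otherwise reset legend.position.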

Measurement summary

plotMeasurementSummary() visualises days_between_measurements and measurements_per_subject. The supported plot types are "boxplot", "barplot", and "densityplot".

The variable specified in y must be either "days_between_measurements" or "measurements_per_subject", because it is used to filter which of the summary results are plotted.
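To make the filtering role of y concrete, here is a rough base R sketch on a toy stand-in for a summarised_result. The helper select_variable() is hypothetical and the estimate names are assumed for illustration. The values are taken from the overall column of the measurement summary table earlier in this vignette. This is not the package's internal code.

```r
# Toy stand-in for a summarised_result (estimate names are assumptions;
# estimate_value is stored as character, as in a summarised_result)
toy_result <- data.frame(
  variable_name = c(
    rep("days_between_measurements", 3),
    rep("measurements_per_subject", 3)
  ),
  estimate_name = rep(c("median", "min", "max"), times = 2),
  estimate_value = c("249", "8", "2886", "1", "1", "4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: validate `y`, then keep only the rows for that variable
select_variable <- function(result, y) {
  allowed <- c("days_between_measurements", "measurements_per_subject")
  y <- match.arg(y, allowed)  # errors if `y` is not one of the allowed names
  result[result$variable_name == y, , drop = FALSE]
}

selected <- select_variable(toy_result, "measurements_per_subject")
print(selected)
```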

result |>
  plotMeasurementSummary(
    x = "codelist_name",
    y = "days_between_measurements",
    plotType = "boxplot"
  )

result |>
  plotMeasurementSummary(
    x = "sex",
    y = "measurements_per_subject",
    plotType = "boxplot",
    colour = "sex",
    facet = NULL
  ) +
  theme(legend.position = "none")

If density estimates were computed, these variables can also be shown with plotType = "densityplot". The y argument selects the variable to plot; the x argument is ignored for this plot type.

result |>
  plotMeasurementSummary(
    plotType = "densityplot",
    colour = "sex", 
    facet = NULL
  )

result |>
  plotMeasurementSummary(
    y = "measurements_per_subject",
    plotType = "densityplot",
    colour = "sex", 
    facet = NULL
  )

Because we requested histogram bin counts for these variables, we can also use plotType = "barplot".

result |>
  plotMeasurementSummary(
    x = "variable_level",
    plotType = "barplot",
    colour = "variable_level", 
    facet = "sex"
  )

result |>
  plotMeasurementSummary(
    y = "measurements_per_subject",
    plotType = "barplot",
    colour = "sex", 
    facet = "variable_level"
  )

Numeric-value summary

plotMeasurementValueAsNumber() visualises distributions of numeric measurement values. We demonstrate the three plot types, similar to the measurement summary plots.

boxplot

result |> 
  plotMeasurementValueAsNumber(
    x = "sex",
    plotType = "boxplot",
    facet = "unit_concept_name",
    colour = "sex"
  )

densityplot

result |> 
  plotMeasurementValueAsNumber(
    plotType = "densityplot",
    facet = "unit_concept_name",
    colour = "sex"
  )

barplot

result |> 
  plotMeasurementValueAsNumber(
    x = "unit_concept_name",
    plotType = "barplot",
    facet = c("sex"),
    colour = "variable_level"
  )

Concept-value summary

plotMeasurementValueAsConcept() visualises concept-coded measurement values and their frequencies. First, we plot the count of each concept value in the codelist.

result |>
  plotMeasurementValueAsConcept(
    x = "count",
    y = "variable_level",
    facet = "cdm_name",
    colour = "sex"
  ) +
  ylab("Value as Concept Name")

Instead of counts, we can also plot the percentage for each concept:

result |>
  plotMeasurementValueAsConcept(
    x = "variable_level",
    y = "percentage",
    facet = "cdm_name",
    colour = "sex"
  ) +
  xlab("Value as Concept Name") 
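The percentages can be cross-checked against the counts in the concept-value table shown earlier. The base R sketch below re-enters those counts by hand and divides each by the total within its sex stratum. The data frame counts is a hypothetical stand-in, not an object returned by the package.

```r
# Counts copied from the concept-value summary table above
counts <- data.frame(
  sex = rep(c("overall", "Female", "Male"), each = 3),
  value_as_concept = rep(c("Low", "High", NA), times = 3),
  count = c(34, 33, 33, 16, 26, 18, 18, 7, 15)
)

# Percentage of each concept value within its sex stratum
stratum_total <- ave(counts$count, counts$sex, FUN = sum)
counts$percentage <- round(100 * counts$count / stratum_total, 2)

print(counts)
```

The computed values match the percentages reported in the table, which suggests the denominator is the number of records in each stratum, including records with a missing concept value.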

Visualisation with other packages

Shiny Apps with OmopViewer

The OmopViewer package supports results produced by MeasurementDiagnostics and provides a quick way to generate a Shiny application for exploring diagnostic results interactively.

For example, the following code exports a static Shiny app that allows users to navigate the tables and plots generated in this vignette.

library(OmopViewer)
exportStaticApp(result = result, directory = tempdir())

Customisation of plots and tables with visOmopResults

Tables and plots in MeasurementDiagnostics are generated using the visOmopResults package. Users who wish to create custom tables or visualisations directly from a summarised_result object can do so by leveraging the functions provided by this package.

Application of MeasurementDiagnostics in PhenotypeR

MeasurementDiagnostics is integrated into the PhenotypeR package. When cohorts are defined based on measurement codes, PhenotypeR automatically applies summariseCohortMeasurementUse() to generate measurement diagnostics during cohort construction, using the codelists linked to each cohort.

This integration allows users to assess measurement codelists and cohorts as part of a broader phenotype development workflow.