
Summarises the clinical tables of a cdm object. The result reports the number of records, the number of subjects, whether the records fall within observation, the domains and concepts present, missing data, and inconsistencies between start and end dates.

Usage

summariseClinicalRecords(
  cdm,
  omopTableName,
  recordsPerPerson = c("mean", "sd", "median", "q25", "q75", "min", "max"),
  conceptSummary = TRUE,
  missingData = TRUE,
  quality = TRUE,
  sex = FALSE,
  ageGroup = NULL,
  dateRange = NULL,
  inObservation = lifecycle::deprecated(),
  standardConcept = lifecycle::deprecated(),
  sourceVocabulary = lifecycle::deprecated(),
  domainId = lifecycle::deprecated(),
  typeConcept = lifecycle::deprecated()
)

Arguments

cdm

A cdm_reference object. Use CDMConnector to create a reference to a database or omock to create a reference to synthetic data.

omopTableName

A character vector of the names of the tables to summarise in the cdm object. Run clinicalTables() to check the available options.

recordsPerPerson

A character vector naming the summary statistics to compute for the number of records per person. Set to NULL if no summary statistics are required.
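
To make the statistic names concrete, the base R sketch below computes the same set of summaries on a toy vector of per-person record counts. The `counts` vector is invented for illustration, and using `quantile()` for q25 and q75 is an assumption: the package computes these values against the database and may apply a different quantile definition.

```r
# Hypothetical toy data: number of records for each of six people.
counts <- c(1, 2, 2, 3, 5, 11)

# One entry per statistic name accepted by recordsPerPerson.
stats <- c(
  mean   = mean(counts),
  sd     = sd(counts),
  median = median(counts),
  q25    = unname(quantile(counts, 0.25)),
  q75    = unname(quantile(counts, 0.75)),
  min    = min(counts),
  max    = max(counts)
)
print(round(stats, 2))
```

Passing a subset such as c("mean", "sd") restricts the output to those statistics, as in the example at the end of this page.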

conceptSummary

Logical. If TRUE, includes summaries of concept-level information, including:

  • Domain ID of standard concepts.

  • Type concept ID.

  • Standard vs non-standard concepts.

  • Source vocabulary usage.

missingData

Logical. If TRUE, includes a summary of missing data for relevant fields.

quality

Logical. If TRUE, performs basic data quality checks, including:

  • Percentage of records within the observation period.

  • Number of records with end date before start date.

  • Number of records with start date before the person's birth date.
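
The three checks can be sketched in base R on toy data. Everything below is illustrative: the data frames, the column names, and the rule that a record counts as "in observation" when its start date falls inside the person's observation period are assumptions, not the package's implementation.

```r
# Hypothetical toy records; column names loosely mimic condition_occurrence.
records <- data.frame(
  person_id  = c(1, 1, 2, 3),
  start_date = as.Date(c("2010-03-01", "2012-06-15", "1979-01-01", "2015-09-09")),
  end_date   = as.Date(c("2010-03-10", "2012-06-01", "1979-02-01", NA))
)
# Hypothetical birth dates and one observation period per person.
persons <- data.frame(
  person_id  = c(1, 2, 3),
  birth_date = as.Date(c("1970-05-05", "1980-01-01", "1990-12-31")),
  obs_start  = as.Date(c("2000-01-01", "2000-01-01", "2016-01-01")),
  obs_end    = as.Date(c("2020-12-31", "2020-12-31", "2020-12-31"))
)
x <- merge(records, persons, by = "person_id")

# Assumed rule: the start date must lie inside the observation period.
in_observation     <- x$start_date >= x$obs_start & x$start_date <= x$obs_end
# A missing end date is not counted as an inconsistency here.
end_before_start   <- !is.na(x$end_date) & x$end_date < x$start_date
start_before_birth <- x$start_date < x$birth_date

quality <- c(
  pct_in_observation   = 100 * mean(in_observation),
  n_end_before_start   = sum(end_before_start),
  n_start_before_birth = sum(start_before_birth)
)
print(quality)
```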

sex

Logical; whether to stratify results by sex (TRUE) or not (FALSE).

ageGroup

A list of age groups to stratify the results by. Each element represents a specific age range. You can give them specific names, e.g. ageGroup = list(children = c(0, 17), adult = c(18, Inf)).
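
As a rough illustration of how a named list of ranges maps ages to strata, the sketch below labels a few ages in base R. `label_age()` is a hypothetical helper written for this example, not an OmopSketch function, and it assumes both bounds of each range are inclusive.

```r
ageGroup <- list(children = c(0, 17), adult = c(18, Inf))

# Hypothetical helper (not part of OmopSketch): return the name of the first
# range containing each age, treating both bounds as inclusive.
label_age <- function(age, groups) {
  vapply(age, function(a) {
    inside <- vapply(groups, function(g) a >= g[1] && a <= g[2], logical(1))
    hit <- names(groups)[inside]
    if (length(hit) == 0) NA_character_ else hit[1]
  }, character(1))
}

labels <- label_age(c(3, 17, 18, 64), ageGroup)
print(labels)
```

Ranges that leave gaps (for example c(0, 17) and c(30, Inf)) would leave some ages unlabelled in this sketch, so contiguous ranges are the safer choice.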

dateRange

A vector of two dates defining the desired study period. Only the start_date column of the OMOP table is checked to ensure it falls within this range. If dateRange is NULL, no restriction is applied.
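
The consequence of checking only the start date can be seen in a small base R sketch. The `records` data frame is hypothetical, and treating both ends of the range as inclusive is an assumption.

```r
dateRange <- as.Date(c("2012-01-01", "2018-12-31"))

# Hypothetical toy records.
records <- data.frame(
  record_id  = 1:4,
  start_date = as.Date(c("2011-12-30", "2012-01-01", "2018-12-31", "2019-01-01")),
  end_date   = as.Date(c("2012-02-01", "2012-01-05", "2019-03-01", "2019-01-02"))
)

# Keep a record when its start date lies in the range; end_date is ignored.
in_range <- records$start_date >= dateRange[1] &
  records$start_date <= dateRange[2]
kept <- records[in_range, ]
print(kept$record_id)
```

In this toy data, record 1 is dropped although it ends inside the range, and record 3 is kept although it ends after the range.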

inObservation

Deprecated. Use quality = TRUE instead.

standardConcept

Deprecated. Use conceptSummary = TRUE instead.

sourceVocabulary

Deprecated. Use conceptSummary = TRUE instead.

domainId

Deprecated. Use conceptSummary = TRUE instead.

typeConcept

Deprecated. Use conceptSummary = TRUE instead.

Value

A summarised_result object with the results.

Examples

library(OmopSketch)
library(omock)

cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
#>  Reading GiBleed tables.
#>  Adding drug_strength table.
#>  Creating local <cdm_reference> object.
#>  Inserting <cdm_reference> into duckdb.

result <- summariseClinicalRecords(
  cdm = cdm,
  omopTableName = "condition_occurrence",
  recordsPerPerson = c("mean", "sd"),
  quality = TRUE,
  conceptSummary = TRUE,
  missingData = TRUE
)
#>  Adding variables of interest to condition_occurrence.
#>  Summarising records per person in condition_occurrence.
#>  Summarising subjects not in person table in condition_occurrence.
#>  Summarising records in observation in condition_occurrence.
#>  Summarising records with start before birth date in condition_occurrence.
#>  Summarising records with end date before start date in condition_occurrence.
#>  Summarising domains in condition_occurrence.
#>  Summarising standard concepts in condition_occurrence.
#>  Summarising source vocabularies in condition_occurrence.
#>  Summarising concept types in condition_occurrence.
#>  Summarising missing data in condition_occurrence.

tableClinicalRecords(result = result)
Summary of condition_occurrence table

Variable name                | Variable level                | Estimate name      | Database name: GiBleed
condition_occurrence
Number records               |                               | N                  | 65,332
Number subjects              |                               | N (%)              | 2,694 (100.00%)
Subjects not in person table |                               | N (%)              | 0 (0.00%)
Records per person           |                               | Mean (SD)          | 24.25 (7.41)
In observation               | No                            | N (%)              | 450 (0.69%)
                             | Yes                           | N (%)              | 64,882 (99.31%)
Domain                       | Condition                     | N (%)              | 65,332 (100.00%)
Source vocabulary            | Icd10cm                       | N (%)              | 479 (0.73%)
                             | No matching concept           | N (%)              | 27 (0.04%)
                             | Snomed                        | N (%)              | 64,826 (99.23%)
Standard concept             | S                             | N (%)              | 65,332 (100.00%)
Type concept id              | Ehr encounter diagnosis       | N (%)              | 65,332 (100.00%)
Start date before birth date |                               | N (%)              | 0 (0.00%)
End date before start date   |                               | N (%)              | 0 (0.00%)
Column name                  | Condition concept id          | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Condition end date            | N missing data (%) | 8,652 (13.24%)
                             | Condition end datetime        | N missing data (%) | 8,652 (13.24%)
                             | Condition occurrence id       | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Condition source concept id   | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Condition source value        | N missing data (%) | 0 (0.00%)
                             | Condition start date          | N missing data (%) | 0 (0.00%)
                             | Condition start datetime      | N missing data (%) | 0 (0.00%)
                             | Condition status concept id   | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 65,332 (100.00%)
                             | Condition status source value | N missing data (%) | 65,332 (100.00%)
                             | Condition type concept id     | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Person id                     | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Provider id                   | N missing data (%) | 65,332 (100.00%)
                             |                               | N zeros (%)        | 0 (0.00%)
                             | Stop reason                   | N missing data (%) | 65,332 (100.00%)
                             | Visit detail id               | N missing data (%) | 0 (0.00%)
                             |                               | N zeros (%)        | 65,332 (100.00%)
                             | Visit occurrence id           | N missing data (%) | 64 (0.10%)
                             |                               | N zeros (%)        | 0 (0.00%)

cdmDisconnect(cdm = cdm)
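
The example above does not use the stratification arguments. The sketch below shows one way they might be combined, using only arguments listed under Usage. The date range and age groups are arbitrary illustrative choices, the name `result_strat` is made up for this example, and the `strata_name` column is assumed from the standard summarised_result layout. No output is shown because it has not been run against this dataset.

```r
library(OmopSketch)
library(omock)

# Recreate the mock cdm so this example is self-contained.
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")

# Summarise two tables, stratified by sex and age group, restricted to
# records whose start date falls within an illustrative study period.
result_strat <- summariseClinicalRecords(
  cdm = cdm,
  omopTableName = c("condition_occurrence", "drug_exposure"),
  recordsPerPerson = NULL,   # skip records-per-person statistics
  conceptSummary = FALSE,
  missingData = FALSE,
  quality = TRUE,
  sex = TRUE,
  ageGroup = list(children = c(0, 17), adult = c(18, Inf)),
  dateRange = as.Date(c("2000-01-01", "2019-12-31"))
)

# Inspect which strata are present (column assumed from summarised_result).
unique(result_strat$strata_name)

cdmDisconnect(cdm = cdm)
```

Switching off the summaries that are not needed keeps the number of database queries down, which can matter on large tables.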