The primary objective of the omock package is to generate mock OMOP CDM (Observational Medical Outcomes Partnership Common Data Model) data to facilitating the testing of various packages within the OMOPverse ecosystem. For more information on the package please see our paper in Journal of Open Source Software.
Du et al., (2025). omock: A R package for Mock Data Generation for the Observational Medical Outcomes Partnership Common Data Model. Journal of Open Source Software, 10(113), 8178, https://doi.org/10.21105/joss.08178
Introduction
You can install the development version of omock using:
# install.packages("devtools")
devtools::install_github("OHDSI/omock")
Example
With omock we can quickly make a simple mock of OMOP CDM data.
We first start by making an empty cdm reference. This includes the person and observation tables (as they are required) but they are currently empty.
cdm <- emptyCdmReference(cdmName = "mock")
cdm$person %>%
glimpse()
#> Rows: 0
#> Columns: 18
#> $ person_id <int>
#> $ gender_concept_id <int>
#> $ year_of_birth <int>
#> $ month_of_birth <int>
#> $ day_of_birth <int>
#> $ birth_datetime <date>
#> $ race_concept_id <int>
#> $ ethnicity_concept_id <int>
#> $ location_id <int>
#> $ provider_id <int>
#> $ care_site_id <int>
#> $ person_source_value <chr>
#> $ gender_source_value <chr>
#> $ gender_source_concept_id <int>
#> $ race_source_value <chr>
#> $ race_source_concept_id <int>
#> $ ethnicity_source_value <chr>
#> $ ethnicity_source_concept_id <int>
cdm$observation_period %>%
glimpse()
#> Rows: 0
#> Columns: 5
#> $ observation_period_id <int>
#> $ person_id <int>
#> $ observation_period_start_date <date>
#> $ observation_period_end_date <date>
#> $ period_type_concept_id <int>
Once we have have our empty cdm reference, we can quickly add a person table with a specific number of individuals.
cdm <- cdm %>%
omock::mockPerson(nPerson = 1000)
cdm$person %>%
glimpse()
#> Rows: 1,000
#> Columns: 18
#> $ person_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,…
#> $ gender_concept_id <int> 8532, 8532, 8507, 8532, 8507, 8507, 8532, …
#> $ year_of_birth <int> 1979, 1996, 1967, 1954, 1972, 1981, 1956, …
#> $ month_of_birth <int> 1, 4, 1, 12, 11, 12, 11, 3, 12, 11, 3, 4, …
#> $ day_of_birth <int> 16, 29, 12, 14, 2, 31, 5, 30, 29, 13, 27, …
#> $ race_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ ethnicity_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ birth_datetime <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ location_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ provider_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ care_site_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ person_source_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ gender_source_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ gender_source_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ race_source_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ race_source_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ ethnicity_source_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ ethnicity_source_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
We can then fill in the observation period table for these individuals.
cdm <- cdm %>%
omock::mockObservationPeriod()
cdm$observation_period %>%
glimpse()
#> Rows: 1,000
#> Columns: 5
#> $ observation_period_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1…
#> $ person_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1…
#> $ observation_period_start_date <date> 2002-01-05, 2014-06-15, 2012-10-08, 197…
#> $ observation_period_end_date <date> 2013-04-09, 2016-09-25, 2015-12-20, 198…
#> $ period_type_concept_id <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …