This function simulates condition occurrences for individuals within a specified cohort. It generates condition records for each person, based on the number of records specified per person, to help build a realistic mock dataset. The generated data are aligned with the existing observation periods, so that every condition is recorded within a valid observation window.

Usage

mockConditionOccurrence(cdm, recordPerson = 1, seed = NULL)

Arguments

cdm

A `cdm_reference` object that should already include the 'person', 'observation_period', and 'concept' tables. This object is the base CDM structure to which the condition occurrence data will be added. These tables must not be empty, because they provide the context needed to generate condition data.

recordPerson

An integer specifying the expected number of condition records to generate per person. This parameter allows the simulation of varying frequencies of condition occurrences among individuals in the cohort, reflecting the variability seen in real-world medical data.

seed

An optional integer used to set the seed for random number generation, ensuring reproducibility of the generated data. If provided, the function produces the same results each time it is run with the same inputs. If 'NULL', the seed is not set, and the output differs on each run.
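
As a minimal sketch of the reproducibility described for `seed`, the snippet below calls the function twice on the same base CDM with the same seed and compares two columns of the result. The seed values, `nPerson = 10`, and the variable names (`base`, `run1`, `run2`, `t1`, `t2`) are illustrative assumptions, not part of the function's documentation.

```r
library(omock)

# Build the base CDM once; the seeds here are arbitrary illustrative values
base <- mockCdmReference() |>
  mockPerson(nPerson = 10, seed = 1) |>
  mockObservationPeriod(seed = 1)

# Generate condition occurrences twice with the same seed and inputs
run1 <- mockConditionOccurrence(base, recordPerson = 2, seed = 42)
run2 <- mockConditionOccurrence(base, recordPerson = 2, seed = 42)

t1 <- dplyr::collect(run1$condition_occurrence)
t2 <- dplyr::collect(run2$condition_occurrence)

# Both comparisons should be TRUE if the seed behaves as documented
same_people <- identical(t1$person_id, t2$person_id)
same_dates <- identical(t1$condition_start_date, t2$condition_start_date)
print(c(same_people = same_people, same_dates = same_dates))
```

Fixing the seed in the upstream calls as well keeps the whole mock CDM reproducible, not only the condition table.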

Value

Returns the modified `cdm` object with the new 'condition_occurrence' table added. This table includes the simulated condition data for each person, ensuring that each record is within the valid observation periods and linked to the correct individuals in the 'person' table.
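
One way to check the guarantee described above is to join the generated records to the observation periods and look for any condition that starts or ends outside its person's window. This is a hedged sketch, not a test shipped with the package: it assumes the standard OMOP column names and uses `dplyr`; `outside` is an illustrative variable name.

```r
library(omock)
library(dplyr)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 10) |>
  mockObservationPeriod() |>
  mockConditionOccurrence(recordPerson = 2)

# Keep only the columns needed to compare against the condition dates
periods <- cdm$observation_period |>
  collect() |>
  select(
    person_id,
    observation_period_start_date,
    observation_period_end_date
  )

# Records that fall outside the person's observation period, if any
outside <- cdm$condition_occurrence |>
  collect() |>
  inner_join(periods, by = "person_id") |>
  filter(
    condition_start_date < observation_period_start_date |
      condition_end_date > observation_period_end_date
  )

# Zero rows means every record sits inside an observation window
print(nrow(outside))
```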

Examples

# \donttest{
library(omock)

# Create a mock CDM reference and add condition occurrences
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockConditionOccurrence(recordPerson = 2)

# View the generated condition occurrence data
print(cdm$condition_occurrence)
#> # A tibble: 120 × 16
#>    condition_concept_id person_id condition_start_date condition_end_date
#>  *                <int>     <int> <date>               <date>            
#>  1               194152         7 2007-02-23           2010-05-13        
#>  2               194152         3 1986-11-10           1990-08-10        
#>  3               194152         5 2017-11-30           2017-12-30        
#>  4               194152         1 1989-03-16           1995-12-07        
#>  5               194152        10 1963-04-22           1999-08-14        
#>  6               194152         4 2010-04-21           2012-05-04        
#>  7               194152         2 2001-11-13           2004-07-22        
#>  8               194152         1 1985-03-13           2006-10-20        
#>  9               194152         1 1985-06-20           2001-01-17        
#> 10               194152         1 1988-08-15           1997-06-16        
#> # ℹ 110 more rows
#> # ℹ 12 more variables: condition_occurrence_id <int>,
#> #   condition_type_concept_id <int>, condition_start_datetime <dttm>,
#> #   condition_end_datetime <dttm>, condition_status_concept_id <int>,
#> #   stop_reason <chr>, provider_id <int>, visit_occurrence_id <int>,
#> #   visit_detail_id <int>, condition_source_value <chr>,
#> #   condition_source_concept_id <int>, condition_status_source_value <chr>
# }