CDM Table name: stem_table

The STEM table is a staging area where CPRD source codes like Read codes will first be mapped to concept_ids. The STEM table itself is an amalgamation of the OMOP event tables to facilitate record movement. This means that all fields present across the OMOP event tables are present in the STEM table. After a record is mapped and staged, the domain of the concept_id dictates which OMOP table (Condition_occurrence, Drug_exposure, Procedure_occurrence, Measurement, Observation, Device_exposure) the record will move to. Please see the STEM -> CDM mapping files for a description of which STEM fields move to which STEM tables.

Reading from CPRD.Additional

In the below table, only the relevant STEM fields are shown. Any fields that do not have a mapping from the CPRD Clinical table are not included.

Observation values will also be drawn from the ‘Additional’ file. This file contains categorical and continuous data values, units, dates, medcodes (conditions), and prodcodes (drugs) pertaining to diseases including allergy, asthma, hypertension, epilepsy and diabetes and lifestyle data (smoking, alcohol use, diet and exercise), as well as child health surveillance, death, elder health care, examination, immunization, maternity, pathology and blood group information. There is also score/scale data (e.g. The Patient Health Questionnaire (PHQ-9) which measures severity of depression) in the ‘Additional’ file which will be captured. The query CPRD_Additional_Setup.sql should be used to create an intermediate table and all mapping to the CDM will be done from this table, referred to below as add_int. Each record in the add_int table will become between one and seven records in the measurement table. Please refer to CPRD_Additional_Setup.sql for comments and rationale for how the add_int table is created.

To map the values in the additional table to standard concepts concatenate the fields add_int.enttype + ‘-‘ + add_int.category + ‘-‘ + add_int.description + ‘-‘ + add_int.data. This will retain the information from the entity table about the record and the specific data field being mapped. Please refer to appendix 2 which is a table showing the enttypes and data_field descriptions from the additional table and counts of each.

These concatenated source values will then be mapped to standard concepts using the mapping file created in Usagi. The source_vocabulary_id is ‘JNJ_CPRD_ADD_ENTTYPE’ and the query used to prepare the data for mapping is CPRD_Additional_Descriptions.sql.

Destination Field	Source field	Logic	Comment field
id			Autogenerate
domain_id		This should be the domain_id of the standard concept in the concept_id field. If a read code is mapped to concept_id 0, put the domain_id as Observation.
person_id	patid	Use patid to lookup Person_id
visit_occurrence_id	patid adid	Look up visit_occurrence_id based on the unique combination of patid, consid, and eventdate. To find consid and eventdate use adid to link back to the clinical table.	Use the Visit_occurrence_id assigned in the previous visit definition step
provider_id	staffid	Map staffid to provider_id
start_datetime		Join back to the Clinical table using adid and set the eventdate as the start_datetime and set the time to midnight.
concept_id		Map the source value (add_int.enttype + ‘-‘ + add_int.category + ‘-‘ + add_int.description + ‘-‘ + add_int.data) to a concept using the SOURCE_TO_STANDARD_QUERY with the filters: WHERE source_vocabulary_id = ‘JNJ_CPRD_ADD_ENTTYPE’ AND standard_concept = ‘S’ AND target_invalid_reason is NULL
source_value		Concatenate add_int.enttype + ‘-‘ + add_int.category + ‘-‘ + add_int.description + ‘-‘ + add_int.data.	This will retain the information from the entity table about the record and the specific data field being mapped. Please refer to appendix 2 which is a table showing the enttypes and data_field descriptions from the additional table and counts of each.
source_concept_id		0
type_concept_id		Use 32817 - EHR
unit_concept_id		Look up add_int.unit_source_value in the CONCEPT table where vocabulary_id = ‘UCUM’ and standard_concept = ‘S’ and invalid_reason is NULL.
unit_source_value	add_int.unit_source_value
start_date	add_int.eventdate		For the additional table, the adid is used to link back to the clinical table to get the eventdate.
end_date	NULL
value_as_number	add_int.value_as_number
value_as_string	add_int.value_as_string
value_as_concept_id			If the last part of the source value says ‘Read code for condition’ then map the code in add_int.value_as_string to a standard concept using the SOURCE_TO_STANDARD query with the filters: WHERE source_vocabulary_id = ‘Read’ AND standard_concept = ‘S’ AND invalid_concept is NULL If the last part of the source value says ‘Drug code’ then map the code in add_int.value_as_string to a standard concept using the SOURCE_TO_STANDARD query with the filters: WHERE source_vocabulary_id = ‘Gemscript’ AND standard_concept = ‘S’ AND invalid_concept is NULL Otherwise, if the value in add_int.qualifier_source_value is not null then lookup the values in add_int.qualifier_source_value in the CONCEPT table where domain_id=’ Meas Value’ and vocabulary_id=’ LOINC’ and standard_concept = ‘S’ and invalid_concept is NULL.
value_source_value	If not NULL, put add_int.qualifier_source_value here.