vignettes/checks/fkDomain.Rmd
fkDomain.Rmd
Level: Field check
Context: Verification
Category: Conformance
Subcategory: Value
Severity: CDM convention ⚠
The number and percent of records that have a value in the cdmFieldName field in the cdmTableName table that do not conform to the fkDomain domain.
It is often the case that standard concept fields in the OMOP CDM should belong to a certain domain. All possible domains are listed in the vocabulary table DOMAIN and the expected domain for CDM fields are listed as part of the CDM documentation. For example, all concepts in the field PERSON.gender_concept_id should conform to the gender domain.
domain_id
. This numerator specifically
excludes concept_id 0condition_concept_id
;
gender_concept_id
; etc.)OHDSI tools and analyses assume that standard concepts in event tables and demographic data conform to the relevant domain. If incorrect concepts are allowed to persist, a study package could run on this table but may not produce expected results.
To assess the impacted rows/tables, you may run a query like the one below:
-- @cdmTableName.@cdmFieldName is the standard concept ID field in the table
-- @cdmTableName.@cdmTablePk is the primary key field in the table
SELECT
concept.concept_id,
concept.domain_id,
concept.concept_name,
concept.concept_code,
COUNT(DISTINCT @cdmTableName.@cdmTablePk),
COUNT(DISTINCT @cdmTableName.person_id)
FROM @cdmDatabaseSchema.@cdmTableName cdmTable
LEFT JOIN @vocabSchema.concept on @cdmTableName.@cdmFieldName = concept.concept_id
AND concept.domain_id != {fkDomain} AND concept.concept_id != 0
GROUP BY concept.concept_id, concept.domain_id, concept.concept_name, concept.concept_code
Recommended actions: - Identify the specific concepts in the table
that have an incorrect domain_id
- Investigate the ETL
process that moves records to the tables based on the standard concept
ID domain. Likely there is an error that is letting records through with
the incorrect domain_id
- Ultimately the ETL process should
be improved so that the correct rows are moved to the correct tables
based on their domain
If this check is failing it means that there is likely an error in the ETL process that builds the domain tables. If there are records in a table with standard concepts in the wrong domain then cohorts and analyses will run but they will return unexpected or erroneous results.
You may characterize the potential impact of the erroneous domain sorting on your analysis by running a query like the one above. Use the results to see what concepts in a table were incorrectly sorted, and how many events/patients are impacted.