Summary

Level: CONCEPT
Context: Validation
Category: Plausibility
Subcategory: Atemporal
Severity: Characterization ✔

Description

For descendants of CONCEPT_ID conceptId (conceptName), the number and percent of records associated with patients with an implausible gender (correct gender = plausibleGenderUseDescendants).

Definition

This check will count the number of records for which the person’s gender is implausible given the concept for the record. For a given gender-specific concept (e.g., prostate cancer) and its descendants, the check will identify records for which the associated person has an implausible gender (e.g., a female with prostate cancer). There are currently 4 instances of this check - female condition concepts; male condition concepts; female procedure concepts; and male procedure concepts.

  • Numerator: The number of rows with a gender-specific concept whose associated person has an implausible gender.
  • Denominator: The number of rows with a gender-specific concept.
  • Related CDM Convention(s): https://ohdsi.github.io/Themis/populate_gender_concept_id.html
  • CDM Fields/Tables: CONDITION_OCURRENCE, PROCEDURE_OCCURRENCE
  • Default Threshold Value: 5%

User Guidance

A failure of this check indicates one of the following scenarios:

  • The person’s gender is wrong
  • The gender-specific concept is wrong
  • The person changed genders and the concept was plausible at the time it was recorded

Violated rows query

SELECT
  cdmTable.@cdmFieldName,
  cdmTable.@sourceConceptIdField, -- x_source_concept_id for the table of interest (condition_occurrence or procedure_occurrence)
  cdmTable.@sourceValueField, -- x_source_value for the table of interest
  COUNT(*)
FROM @cdmDatabaseSchema.@cdmTableName cdmTable
  JOIN @cdmDatabaseSchema.person p ON cdmTable.person_id = p.person_id
  JOIN @vocabDatabaseSchema.concept_ancestor ca ON ca.descendant_concept_id = cdmTable.@cdmFieldName
WHERE ca.ancestor_concept_id IN (@conceptId)
  AND p.gender_concept_id <> {@plausibleGenderUseDescendants == 'Male'} ? {8507} : {8532} 
GROUP BY 1,2,3

The above query should help to identify if a mapping issue is the cause of the failure. If the source value and source concept ID are correctly mapped to a standard concept, then the issue may be that the person has the incorrect gender, or that the finding is a true data anomaly. Examples of true anomalies include:

  • Occasional stray code (e.g., due to typo in EHR).
  • Newborn codes recorded in the mother’s record (e.g., circumcision).
  • Gender reassignment procedures (e.g., penectomy and prostatectomy in patients with acquired female gender). NOTE that this scenario is technically a violation of the OMOP CDM specification, since the CDM requires that the gender_concept_id represents a person’s sex at birth. For more information on this convention, see https://ohdsi.github.io/Themis/populate_gender_concept_id.html

ETL Developers

Concept mapping issues must be fixed. Ensure as well that source codes are being correctly extracted from the source data. If the CDM accurately represents the source data, then remaining failures should be documented for users of the CDM.

Data Users

Persons with implausible gender should not be included in analysis unless it can be confirmed with the data provider that these represent cases of gender reassignment, and your analysis does not assume that the gender_concept_id represents sex at birth.