The OMOP GIS (Geographic Information System) Vocabulary Package is designed to elevate data-driven healthcare research by enabling the integration of spatial, environmental, behavioral, socioeconomic, phenotypic, and toxin-related determinants of health into standardized data structures. This comprehensive framework facilitates a multi-dimensional understanding of health outcomes, accounting for both external environmental exposures and intrinsic patient characteristics.
This package is a vital extension of the OMOP CDM, addressing the growing need to contextualize healthcare data with external environmental and societal factors. Developed and maintained by the GIS Working Group, this package provides vocabularies, scripts, and documentation for terminology integration into existing OHDSI vocabularies.
Find the Delta Vocabulary files for the Vocabulary Package here
Examples of Main Nodes of the SDOH Hierarchy
* Element Relevant To Demographics
* Element Relevant To Education
* Element Relevant To Geographic Location
* Element Relevant To Health
* Element Relevant To Physical Environment
* Element Relevant To Population
* Element Relevant To Social And Community Context
The following data sources have been processed to enrich and integrate spatial, environmental, behavioral, socioeconomic, phenotypic, and toxin-related entities into the OMOP GIS Vocabulary Package.
Concept names adhere to source data specifications or are guided by relevant literature.
Concept codes were either adopted directly from the source data or autogenerated for newly developed terms (autogenerated codes start with the ‘GIS’ prefix).
domain_id | Definition | Example | Exists in OMOP |
---|---|---|---|
Behavioral Feature | Refers to actions or behaviors by individuals that influence health outcomes, often related to lifestyle choices. | Element Relevant To Physical Activity | NO |
Demographic Feature | Describes the characteristics of a population, such as age, gender, race, or ethnicity. | Race In Population | NO |
Environmental Feature | Involves natural or man-made environmental factors that affect health and well-being. | Air Quality Index (AQI) | NO |
Geographic Feature | Describes physical locations, spatial relationships, and geographical characteristics relevant to health studies. | State-County FIPS Code (5-Digit) | NO |
Healthcare Feature | Involves elements directly related to healthcare services, access, and utilization. | Element Relevant To Health Care | NO |
Observation | Captures clinical or non-clinical observations relevant to the contextual data. | Total Number Of Households | YES |
Phenotypic Feature | Refers to observable traits or characteristics of an individual, influenced by genetic and environmental factors. | Element Relevant To Depression | NO |
Socioeconomic Feature | Involves social and economic factors that impact health, such as income, education, or employment status. | Employment In Population | NO |
Type Concept | A categorization used to define where the record comes from. | Air Quality Database | YES |
concept_class_id | Definition | Example |
---|---|---|
ADI Construct | Represents a high-level conceptual framework within the Area Deprivation Index (ADI) for analyzing socioeconomic factors. | Area Deprivation Index (ADI) |
ADI Item | Refers to specific elements or data points that make up the ADI framework. | % Families Below Federal Poverty Level |
AHRQ Construct | A conceptual framework from the Agency for Healthcare Research and Quality (AHRQ) related to social determinants of health. | Food Access |
AHRQ Determinant | A measurable factor derived from AHRQ’s data, influencing health outcomes. | Crime And Violence |
AHRQ Item | Specific data points or elements within the AHRQ framework. | Total Number Of Households |
COI Construct | A framework within the Child Opportunity Index (COI) for assessing child well-being and opportunity. | Economic Resource Index |
COI Determinant | A measurable factor in the COI influencing child development and health. | Access To Green Spaces |
COI Item | Specific data points or factors that make up the COI. | Mean Estimated 8-Hour Average Ozone Concentration |
EJI EBM Item | Refers to an Environmental Burden Measure (EBM) within the Environmental Justice Index (EJI). | Ambient Concentrations Of Diesel PM/M3 |
EJI HVM Item | Refers to a Health Vulnerability Measure (HVM) within the EJI. | Percentage Of Individuals With Cancer |
EJI Item | A general item from the EJI, integrating environmental and health vulnerability data. | Census Tract Code |
Exposome Target | Represents specific biological targets within the exposome (a measure of all environmental exposures across a lifetime). | Tissue-type plasminogen activator |
Exposome Transporter | Refers to biological transporters related to the exposome, responsible for moving substances within an organism. | SLCO2B1 (OATP2B1, OATP-B) |
Exposure Type Concept | A category defining types of exposures relevant to toxicology and environmental health studies. | Census Data |
Geometry Relationship | Refers to spatial relationships within geographic data, such as spatial proximity or overlap. | Near/Proximity to |
Geometry Type | Defines the type of geometry used in spatial data. | Polygon |
GIS Measure | A specific metric or quantitative value derived from GIS data. | Estimate |
Location | Refers to the specific geographic location or spatial point in GIS data. | Administrative Boundary |
SDG Goal | Represents one of the United Nations’ Sustainable Development Goals (SDGs) related to health, environment, and equity. | Significantly reduce all forms of violence and related death rates everywhere |
SDG Indicator | A measurable indicator for tracking progress toward SDG goals. | Proportion of bodies of water with good ambient water quality |
SDOH Construct | A high-level framework for understanding social determinants of health (SDOH) and their impact on population health. | Neighborhood Quality |
SDOH Determinant | A specific social or economic factor that directly influences health outcomes. | Air Quality Index (AQI) |
SDOH Item | A specific data element within the SDOH framework. | The Air Quality Index For The Day For PM2.5 |
SDOHO Construct | A conceptual framework from the Social Determinants of Health Ontology (SDOHO) focused on categorizing health determinants. | Smoking |
SDOHO Determinant | A measurable factor in the SDOHO framework affecting health. | Alcohol Use |
SDOHO Item | A specific measurable element in the SDOHO framework. | Occupational Prestige Score |
SDOHO Value | A specific value or outcome within the SDOHO framework that reflects health disparities or social conditions. | Intersex |
SEDH Construct | A conceptual framework for Social and Environmental Determinants of Health (SEDH). | Social Capital Index |
SEDH Item | A data element within the SEDH framework. | Veteran Segments By Census Block Group |
Substance | A chemical or biological substance relevant to environmental exposure or toxicology. | Nicotine |
SVI Construct | A conceptual framework for the Social Vulnerability Index (SVI), representing factors that make communities vulnerable. | Household Characteristics |
SVI Determinant | A specific factor in the SVI that directly impacts a community’s resilience to health risks or environmental hazards. | Housing Type & Transportation |
SVI Item | A specific data point within the SVI framework. | Persons Below 150% Poverty Estimate MOE |
Construct: Represents conceptual or behavioral elements that are often measured through subjective or indirect means. Constructs are used to characterize complex or abstract social, psychological, or environmental phenomena that contribute to understanding health outcomes but are not necessarily directly measurable or causal by themselves.
Examples include:
These examples reflect behaviors, relationships, and environmental factors that influence health outcomes, but they are abstract and typically involve interpretation, surveys, or proxies for measurement.
Determinant: A specific, measurable factor that has a more direct influence on health outcomes. Determinants are often quantifiable and can be linked more concretely to causes or risk factors affecting a person’s health, such as economic status, education, or access to healthcare.
Examples include:
Item: A specific, measurable data point or element that is used to evaluate a larger construct or determinant.
Examples include:
Item Value: A specific measurable value or state that the item can take. The item values might include “employed,” “unemployed,” “self-employed,” “retired,” etc.
If a full semantic match is identified in OMOP, GIS codes are mapped to the corresponding standard concepts and reclassified as non-standard. If no match is found, GIS codes are retained as standard concepts.
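The rule above can be sketched as a small function. This is only an illustration, not the package's actual curation logic: the function name `classify_gis_concept` and the returned dictionary layout are hypothetical, and it assumes the usual OMOP convention that standard concepts carry `standard_concept = 'S'` while non-standard concepts leave the field empty.

```python
from typing import Optional


def classify_gis_concept(gis_code: str, omop_match_id: Optional[int]) -> dict:
    """Apply the mapping rule to one GIS code.

    omop_match_id is the concept_id of a full semantic match found in OMOP,
    or None when no match exists (hypothetical input for illustration).
    """
    if omop_match_id is not None:
        # Full match: the GIS code becomes non-standard and maps to the OMOP concept.
        return {"concept_code": gis_code, "standard_concept": None,
                "maps_to": omop_match_id}
    # No match: the GIS code is retained as a standard concept.
    return {"concept_code": gis_code, "standard_concept": "S", "maps_to": None}
```

Calling the function with a matched concept_id yields a non-standard record with a mapping target; calling it with `None` yields a standard record.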
vocabulary_id | valid_start_date |
---|---|
OMOP Exposome | source field updated_at in ‘MM-DD-YYYY’ format |
OMOP Exposome (cas is null) | 09-14-2024 |
OMOP SDOH (concept_class_id_1 ~ ‘SDOHO’) | 01-01-2022 |
OMOP SDOH (concept_class_id_1 ~ ‘ADI’) | 01-01-2018 |
OMOP SDOH (concept_class_id_1 ~ ‘AHRQ’) | 01-01-2022 |
OMOP SDOH (concept_class_id_1 ~ ‘COI’) | 01-01-2020 |
OMOP SDOH (concept_class_id_1 ~ ‘EJI’) | 01-01-2022 |
OMOP SDOH (concept_class_id_1 ~ ‘SEDH’) | 01-01-2021 |
OMOP SDOH (concept_class_id_1 ~ ‘SVI’) | 01-01-2018 |
OMOP SDOH (concept_class_id_1 ~ ‘SDG’) | 03-01-2017 |
All other cases | 09-14-2024 |
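The date assignment in the table above could be expressed roughly as follows. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, the patterns are searched anywhere in the concept class (mirroring the `~` regular-expression operator shown in the table), and falling back to the default date when `updated_at` is absent is an assumption rather than documented behavior.

```python
import re
from datetime import date

# (pattern searched in the concept class, valid_start_date) for OMOP SDOH,
# copied from the table above.
SDOH_START_DATES = [
    ("SDOHO", date(2022, 1, 1)),
    ("ADI", date(2018, 1, 1)),
    ("AHRQ", date(2022, 1, 1)),
    ("COI", date(2020, 1, 1)),
    ("EJI", date(2022, 1, 1)),
    ("SEDH", date(2021, 1, 1)),
    ("SVI", date(2018, 1, 1)),
    ("SDG", date(2017, 3, 1)),
]
DEFAULT_START = date(2024, 9, 14)  # "All other cases"


def valid_start_date(vocabulary_id, concept_class_id, updated_at=None, cas=None):
    """Pick valid_start_date for one concept according to the table."""
    if vocabulary_id == "OMOP Exposome":
        # The source date applies only when a CAS number is present.
        if cas is not None and updated_at is not None:
            return updated_at
        return DEFAULT_START
    if vocabulary_id == "OMOP SDOH":
        for pattern, start in SDOH_START_DATES:
            if re.search(pattern, concept_class_id):
                return start
    return DEFAULT_START
```

Note that a class such as "SDOH Item" matches none of the listed patterns, so it falls through to the default date.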
relationship_id | reverse_relationship_id | Meaning |
---|---|---|
Locates in cell | Cell contains | Indicates that a certain agent or substance is found within or targets a cellular entity. |
Locates in tissue | Tissue contains | Suggests that a certain agent or substance is present within or targets a specific tissue type. |
Impacts on process | Impacted by | Signifies that an agent or substance exerts an influence on a specific process. |
Affects biostructure | Affected by | Suggests that an agent or substance has an impact on a certain biological structure. |
Maps to | Mapped from | Indicates a relationship where a concept is equated to or represented as a standard OMOP concept. |
Is a | Subsumes | Hierarchical relationship where a concept is a subset or instance of a more general concept. |
Has associated finding | Asso finding of | Indicates a relationship between a concept and an associated finding related to it. |
Has relat context | Relat context of | Describes the contextual relationship between two related concepts. |
Has geometry | Is geometry of | Represents the spatial or geometric relationship between an entity and its geographic or spatial structure. |
Examples:
* Hierarchical relationships (‘Is a’ - ‘Subsumes’): ‘Polygon’ - ‘Is a’ - ‘2D (Two-Dimensional) Geometry’ / ‘2D (Two-Dimensional) Geometry’ - ‘Subsumes’ - ‘Polygon’
* Supplemental GIS-specific relationships, e.g. ‘Is geometry of’ - ‘Has geometry’: ‘LineString’ - ‘Is geometry of’ - ‘International Border’ / ‘International Border’ - ‘Has geometry’ - ‘LineString’
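Because every relationship in the table has a reverse, one direction can be derived from the other. The following sketch shows one way to do that; the helper name `with_reverse` and the use of concept names instead of concept_ids are simplifications made for illustration.

```python
# relationship_id -> reverse_relationship_id, copied from the table above.
REVERSE = {
    "Locates in cell": "Cell contains",
    "Locates in tissue": "Tissue contains",
    "Impacts on process": "Impacted by",
    "Affects biostructure": "Affected by",
    "Maps to": "Mapped from",
    "Is a": "Subsumes",
    "Has associated finding": "Asso finding of",
    "Has relat context": "Relat context of",
    "Has geometry": "Is geometry of",
}
# Make the lookup work in both directions.
REVERSE.update({rev: fwd for fwd, rev in list(REVERSE.items())})


def with_reverse(rows):
    """Return the (concept_1, relationship_id, concept_2) rows plus their mirrors."""
    out = set(rows)
    for c1, rel, c2 in rows:
        out.add((c2, REVERSE[rel], c1))
    return sorted(out)


rows = [("LineString", "Is geometry of", "International Border")]
pairs = with_reverse(rows)  # contains the original row and its 'Has geometry' mirror
```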
target_vocabulary_id | number of associations |
---|---|
OMOP Exposome | 82,150 |
OMOP SDOH | 6,738 |
RxNorm | 4,418 |
RxNorm Extension | 2,221 |
SNOMED | 1,769 |
LOINC | 776 |
OMOP GIS | 423 |
ICD10CM | 122 |
PPI | 56 |
OMOP Genomic | 49 |
OMOP Extension | 32 |
OSM | 25 |
UK Biobank | 24 |
Type Concept | 24 |
CPT4 | 10 |
HCPCS | 9 |
Nebraska Lexicon | 3 |
Race | 2 |
ATC | 2 |
The OMOP GIS Vocabulary Package is built and maintained through a structured Google Spreadsheet that supports collaborative editing, centralized curation, and version control. This spreadsheet functions as the backbone of the vocabulary development process, enabling distributed subject-matter experts, curators, and developers to participate in real-time. It is composed of multiple interrelated tabs that each fulfill a specialized role in the construction of a standardized, computable terminology layer.
The structure adheres to the principles of transparency, auditability, and semantic alignment with the OMOP CDM. The spreadsheet is logically organized into several functional layers:
Captures raw terminology originating from environmental, geographic, exposomic, or socio-behavioral data sources. Each record includes a unique source code, human-readable description, vocabulary ID, domain assignment, concept class identifier, and provenance information such as date of review, expert attribution, ORCID ID, and review status.
Establishes the semantic correspondences between the collected source terms and OMOP standard concepts. Each mapping contains:
* an OMOP relationship (‘Maps to’, ‘Is a’)
* an SSSOM-style predicate (skos:exactMatch, skos:narrowMatch, skos:broadMatch, skos:relatedMatch)
Supports parent-child relationships among concepts and extends the OMOP CDM’s ontology-like capabilities. This is particularly important for representing aggregate social constructs (e.g., Area Deprivation Index) and nested features.
Defines custom Domains, Concept Classes, Vocabularies, and Relationships that expand the CDM’s expressivity in the context of real-world data. These extensions are consistently registered and versioned (e.g., OMOP GIS || 20250424).
Each term progresses through a structured lifecycle: initial entry, expert validation, decision logging, and integration readiness. Fields such as change_required, author_comment, and status support prioritization and triage workflows.
The mapping layer leverages SSSOM-style predicates, enabling:
* Reverse relationship pairs (‘Is geometry of’ ↔ ‘Has geometry’) support symmetric reasoning
Google Apps Script automates the parsing, change detection, and transformation of the spreadsheet into vocabulary delta tables consumable by OMOP ETL workflows. This enables continuous deployment of vocabulary updates without manual intervention. Extensions are serialized into OMOP-compatible formats and managed according to OHDSI governance protocols.
The spreadsheet functions as both a collaborative workspace and a vocabulary staging environment. Contributors may propose new terms or mappings by adding rows to designated tabs. Each entry is subject to transparent peer review, with review states tracked via controlled values. Reviewers are encouraged to document decisions with ORCID and institutional affiliation.
For contributions, questions, or access requests, please contact the GIS Vocabulary Coordination Team:
This section outlines the implementation framework for the OMOP GIS Vocabulary Package, detailing both the underlying ontology architecture and the practical processes for vocabulary ingestion and deployment. By combining semantic formalism with operational scalability, the implementation ensures that spatial and contextual vocabularies are conceptually aligned with the OMOP CDM and readily usable in real-world analytics environments.
The OMOP GIS Ontology utilizes the GIS Vocabulary Package as its foundational layer, which is collaboratively maintained through a Google Spreadsheet system integrated with Google Apps Scripts and GitHub-based automation pipelines. This ontology serves as a semantic scaffold for spatial and contextual reasoning in health data, enabling structured representation and analysis of geographically-linked, environmental, social, and behavioral determinants of health. It also functions as a machine-interpretable layer that supports standardized analytics, ontology-informed feature generation, and federated ETL workflows across distributed data networks.
In general, an ontology is a structured framework for representing knowledge as a set of concepts within a domain and the relationships between those concepts. Ontologies enable formal semantics, reasoning, and integration across diverse datasets by providing consistent definitions and hierarchical structure.
Within the OMOP GIS framework, the ontology performs a similar function - defining and categorizing geographic features, environmental exposures, social determinants of health, and their interrelationships - using the language and constraints of the OMOP CDM. It extends the existing OMOP vocabulary model to support location-aware analyses and geospatial semantics without breaking conformance with OMOP’s relational architecture.
The GIS Ontology is constructed through the following components and processes:
Source Definition Layer: Concepts and relationships are entered and curated in a structured Google Spreadsheet format. This spreadsheet includes validated fields for source code, concept class, domain, mappings, predicates, and metadata. Collaborative access and semantic protections are enforced using Google Apps Scripts, which regulate row-level editability.
Version-Controlled Vocabulary Pipeline: Approved concepts and mappings are automatically synchronized from Google Sheets to GitHub using scheduled Apps Script tasks. This process creates a persistent and auditable version history while simultaneously preparing mapping data for downstream transformation.
Ontology Transformation Pipeline: A GitHub Action orchestrates a multi-step workflow that converts the spreadsheet-based mappings into relational OMOP-compatible vocabulary tables. This includes:
* concept, concept_relationship, and concept_synonym records with assigned concept_ids in the reserved space (>2,000,000,000)
Two Azure-hosted components support this automation: a Container App acting as a virtual GitHub runner and a Flexible Postgres Server that stores the ontology’s relational tables. These components ensure that updates can be executed securely and reproducibly.
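The text states only that new concept_ids fall in the reserved space above 2,000,000,000; it does not describe how the pipeline allocates them. As a purely hypothetical sketch, a sequential allocator that keeps earlier assignments stable between releases could look like this (the function name and the input shapes are assumptions).

```python
RESERVED_FLOOR = 2_000_000_000  # new concept_ids must be greater than this value


def assign_concept_ids(new_codes, existing_ids):
    """Give each new concept code the next free concept_id above the floor.

    existing_ids maps concept_code -> concept_id from earlier releases;
    those assignments are preserved so ids stay stable over time.
    """
    assigned = dict(existing_ids)
    next_id = max([RESERVED_FLOOR, *assigned.values()]) + 1
    for code in new_codes:
        if code not in assigned:
            assigned[code] = next_id
            next_id += 1
    return assigned
```

Running the allocator a second time with the previous result as `existing_ids` only adds ids for codes that were not seen before.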
The ontology is materialized through a suite of relational “delta” tables. Each table mirrors a specific component of the OMOP vocabulary schema, while systematically extending it to accommodate geospatial logic, including location-referenced features, environmental indices, and spatially-resolved determinants. For example:
* concept_delta.csv: Defines both standard and non-standard GIS concepts, including representatives of new domains like Geographic Feature, Environmental Feature, and Socioeconomic Feature.
* concept_relationship_delta.csv: Encodes semantic links using relationships such as Has geometry, Affects biostructure, and Locates in cell, facilitating ontology-driven inferences.
* concept_ancestor_delta.csv: Reconstructs hierarchical ancestry for reasoning across spatial or categorical groupings.
* concept_synonym_delta.csv: Includes synonyms to support flexible querying across GIS, public health, and environmental terminology variants.
This table set collectively reproduces an ontological graph within a relational schema, enabling semantic linkage between OMOP-standard concepts and domain-specific enhancements required for contextualized health research.
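To illustrate how hierarchical ancestry can be reconstructed from direct ‘Is a’ links, here is a minimal sketch. It is not the package's actual build script: the function name is hypothetical, concept names stand in for concept_ids, the hierarchy is assumed to be acyclic, and when a concept reaches an ancestor by paths of different lengths the sketch simply emits one row per length.

```python
from collections import defaultdict


def build_ancestors(is_a_pairs):
    """Expand direct (child, parent) 'Is a' pairs into ancestor rows.

    Returns a set of (ancestor, descendant, levels_of_separation) tuples,
    including a self-row with 0 levels for every concept.
    """
    parents = defaultdict(set)
    concepts = set()
    for child, parent in is_a_pairs:
        parents[child].add(parent)
        concepts.update((child, parent))

    rows = set()
    for concept in concepts:
        rows.add((concept, concept, 0))
        frontier, level = {concept}, 0
        while frontier:
            # Climb one level: collect the parents of everything in the frontier.
            level += 1
            frontier = {p for c in frontier for p in parents[c]}
            rows.update((p, concept, level) for p in frontier)
    return rows
```

For example, passing the single pair `("Polygon", "2D (Two-Dimensional) Geometry")` produces two self-rows plus one row linking the geometry class to ‘Polygon’ at one level of separation.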
The OMOP GIS Ontology integrates community-based term curation, semantic standardization via SSSOM predicates, and automated deployment pipelines to construct a modular, versioned vocabulary system. This infrastructure supports not only geospatial analysis but also cross-domain reasoning on determinants of health, exposures, and environment. It positions the OMOP CDM for expanded utility in real-world evidence generation that incorporates place, population context, and environmental burden.
The OMOP GIS Ontology can be integrated into a local OMOP CDM instance by combining standard vocabulary files obtained from Athena OHDSI with curated delta tables provided via GitHub. This integration enables structured support for spatial, environmental, and contextual reasoning through GIS-aligned concepts and relationships, while preserving OMOP CDM conformance. The process leverages relational structures familiar to OMOP implementers and is compatible with federated ETL workflows and AI-driven pipelines.
To begin the installation, ensure you have:
* A development schema (e.g., dev_gis) separate from your production environment, to safely test integration and validate results before promotion.
Your OMOP schema must contain all required core tables:
* concept, concept_ancestor, concept_class, concept_relationship, concept_synonym
* domain, relationship, vocabulary, drug_strength
If missing, create them using the OMOP CDM DDL.
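A quick way to confirm the prerequisite is to compare the tables present in the schema with the required list. The sketch below assumes you have already fetched the table names yourself (for instance from `information_schema.tables` in PostgreSQL); the function name is hypothetical.

```python
REQUIRED_TABLES = {
    "concept", "concept_ancestor", "concept_class", "concept_relationship",
    "concept_synonym", "domain", "relationship", "vocabulary", "drug_strength",
}


def missing_tables(existing_tables):
    """Return the required vocabulary tables absent from the schema, sorted."""
    present = {name.lower() for name in existing_tables}
    return sorted(REQUIRED_TABLES - present)
```

An empty result means every required core table exists; any names returned still need to be created.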
To prepare for GIS enrichment, create delta tables via:
All vocabularies listed below are mandatory. Do not skip any.
These vocabularies are referenced in the delta tables and are essential for resolving mappings and relationships. Partial ingestion will result in structural or referential integrity errors.
Select the following vocabularies in the Download section of OHDSI Athena. Select license-restricted vocabularies (e.g., CPT4, marked with * below) only if your organization holds a valid license:
Required Vocabulary |
---|
ATC |
CPT4* |
HCPCS |
ICD10CM |
LOINC |
Nebraska Lexicon |
OMOP Extension |
OSM |
PPI |
RxNorm |
RxNorm Extension |
SNOMED |
UK Biobank |
After selecting the vocabularies, click Download Vocabularies, name the bundle, and download the resulting ZIP file directly from the Athena website once it is ready. Unzip the archive and confirm that the following files are present:
Expected File |
---|
CONCEPT.csv |
CONCEPT_ANCESTOR.csv |
CONCEPT_CLASS.csv |
CONCEPT_RELATIONSHIP.csv |
CONCEPT_SYNONYM.csv |
DOMAIN.csv |
DRUG_STRENGTH.csv |
RELATIONSHIP.csv |
VOCABULARY.csv |
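The presence check can be automated with a few lines of Python. This is a convenience sketch rather than part of the package; the function name is hypothetical and file names are compared case-insensitively as a precaution.

```python
from pathlib import Path

EXPECTED_FILES = [
    "CONCEPT.csv", "CONCEPT_ANCESTOR.csv", "CONCEPT_CLASS.csv",
    "CONCEPT_RELATIONSHIP.csv", "CONCEPT_SYNONYM.csv", "DOMAIN.csv",
    "DRUG_STRENGTH.csv", "RELATIONSHIP.csv", "VOCABULARY.csv",
]


def missing_files(folder):
    """List expected Athena files that are not found in the given folder."""
    found = {p.name.upper() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in EXPECTED_FILES if name.upper() not in found]
```

Pass the path of the unzipped archive; an empty list means the bundle is complete.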
Download the delta tables from the GIS Vocabulary GitHub repository. These include:
Delta Table |
---|
CONCEPT_DELTA.CSV |
CONCEPT_ANCESTOR_DELTA.CSV |
CONCEPT_CLASS_DELTA.CSV |
CONCEPT_RELATIONSHIP_DELTA.CSV |
CONCEPT_SYNONYM_DELTA.CSV |
DOMAIN_DELTA.CSV |
RELATIONSHIP_DELTA.CSV |
VOCABULARY_DELTA.CSV |
MAPPING_METADATA.CSV |
SOURCE_TO_CONCEPT_MAP.CSV |
Note: Files such as restore.sql and update_log.csv are not required for ingestion.
Import all downloaded Athena .csv files into the corresponding OMOP vocabulary tables using your preferred SQL client.
Recommended tools: Use the PostgreSQL COPY command via psql, or GUI tools such as DBeaver or pgAdmin, for loading the files.
Important formatting requirements:
* Files must use UTF-8 character encoding.
* Comma should be used as the delimiter.
* Text fields should be enclosed in double quotes.
Match the CSV files to OMOP tables as follows:
CSV File | → OMOP Table |
---|---|
CONCEPT.csv | → CONCEPT |
CONCEPT_ANCESTOR.csv | → CONCEPT_ANCESTOR |
CONCEPT_CLASS.csv | → CONCEPT_CLASS |
CONCEPT_RELATIONSHIP.csv | → CONCEPT_RELATIONSHIP |
CONCEPT_SYNONYM.csv | → CONCEPT_SYNONYM |
DOMAIN.csv | → DOMAIN |
DRUG_STRENGTH.csv | → DRUG_STRENGTH |
RELATIONSHIP.csv | → RELATIONSHIP |
VOCABULARY.csv | → VOCABULARY |
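If you load the files with psql, the file-to-table mapping above can be turned into `\copy` commands. The sketch below only builds the command strings; it follows the formatting requirements stated earlier (UTF-8, comma delimiter, double-quoted text) and assumes the files have a header row and that the target schema is named dev_gis. Adjust the options if your files differ.

```python
CSV_TO_TABLE = {
    "CONCEPT.csv": "concept",
    "CONCEPT_ANCESTOR.csv": "concept_ancestor",
    "CONCEPT_CLASS.csv": "concept_class",
    "CONCEPT_RELATIONSHIP.csv": "concept_relationship",
    "CONCEPT_SYNONYM.csv": "concept_synonym",
    "DOMAIN.csv": "domain",
    "DRUG_STRENGTH.csv": "drug_strength",
    "RELATIONSHIP.csv": "relationship",
    "VOCABULARY.csv": "vocabulary",
}


def copy_commands(schema="dev_gis"):
    """Build one psql copy meta-command per Athena file."""
    # HEADER true is an assumption about the file layout.
    options = "WITH (FORMAT csv, HEADER true, DELIMITER ',', QUOTE '\"', ENCODING 'UTF8')"
    return [
        f"\\copy {schema}.{table} FROM '{csv_file}' {options}"
        for csv_file, table in CSV_TO_TABLE.items()
    ]


for command in copy_commands():
    print(command)
```

The printed lines can be pasted into a psql session started from the folder that holds the unzipped files.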
After upload, run QA checks.
Insert delta rows into the already existing tables using:
* insert_delta_tables_into_omop.sql
This step inserts data from the GIS delta files into the corresponding OMOP vocabulary tables. The mapping between each delta file and its target table is shown below:
Delta File | → Target Table |
---|---|
concept_delta.csv | → CONCEPT |
concept_ancestor_delta.csv | → CONCEPT_ANCESTOR |
concept_class_delta.csv | → CONCEPT_CLASS |
concept_relationship_delta.csv | → CONCEPT_RELATIONSHIP |
concept_synonym_delta.csv | → CONCEPT_SYNONYM |
domain_delta.csv | → DOMAIN |
relationship_delta.csv | → RELATIONSHIP |
vocabulary_delta.csv | → VOCABULARY |
mapping_metadata.csv | → MAPPING_METADATA |
source_to_concept_map.csv | → SOURCE_TO_CONCEPT_MAP |
Important: Always validate your integration in a development schema before applying changes to a production vocabulary schema. Ensure referential integrity and uniqueness constraints are preserved.
Use check_delta_tables_inserts.sql to verify the successful application of the delta content. This includes validation of record counts, relationship integrity, and domain coverage.
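The insert-then-verify pattern can be tried out on a toy database before touching a real schema. The sketch below uses an in-memory SQLite database with heavily simplified tables; it does not reproduce the contents of insert_delta_tables_into_omop.sql or check_delta_tables_inserts.sql, and the concept_ids shown are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT, vocabulary_id TEXT);
    CREATE TABLE concept_delta (concept_id INTEGER PRIMARY KEY, concept_name TEXT, vocabulary_id TEXT);
    INSERT INTO concept VALUES (1, 'Existing concept', 'SNOMED');
    INSERT INTO concept_delta VALUES (2000000001, 'Polygon', 'OMOP GIS');
    INSERT INTO concept_delta VALUES (2000000002, 'LineString', 'OMOP GIS');
""")


def insert_concept_delta(conn):
    """Append delta rows to concept, skipping ids that are already present."""
    before = conn.execute("SELECT COUNT(*) FROM concept").fetchone()[0]
    conn.execute("""
        INSERT INTO concept
        SELECT d.* FROM concept_delta d
        WHERE NOT EXISTS (SELECT 1 FROM concept c WHERE c.concept_id = d.concept_id)
    """)
    after = conn.execute("SELECT COUNT(*) FROM concept").fetchone()[0]
    return after - before


def delta_rows_missing(conn):
    """Count delta rows that have not reached concept (0 means success)."""
    return conn.execute("""
        SELECT COUNT(*) FROM concept_delta d
        LEFT JOIN concept c ON c.concept_id = d.concept_id
        WHERE c.concept_id IS NULL
    """).fetchone()[0]
```

Because the insert skips ids that already exist, running it twice does not create duplicates, which keeps uniqueness constraints intact.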
After completing this workflow, your OMOP CDM instance will:
* EXTERNAL_EXPOSURE and SOURCE_TO_CONCEPT_MAP tables.
Use the information in the Vocabulary QA section to confirm completeness and correctness of the loaded data.
For feedback or bug reports, please open an issue on GitHub.
This checklist is designed to validate new GIS Vocabulary Package releases in OMOP CDM format.
Ensure the following delta tables are populated unless explicitly expected to be empty:
* concept_delta
* concept_relationship_delta
* concept_synonym_delta (if synonyms are defined)
* concept_ancestor_delta (if hierarchical terms are used)
* source_to_concept_map (if supplemental mappings are included)

* vocabulary_delta, domain_delta, relationship_delta, and concept_class_delta only contain new or modified entries.
* (vocabulary_id, vocabulary_name, vocabulary_reference, etc.).

* Concepts with standard_concept IS NULL should have at least one outbound ‘Maps to’ or ‘Mapped from’ relationship.
* Concepts with standard_concept = 'S' should not map to other standard concepts unless it’s a self-map.
* (Has geometry, Locates in cell) but must be flagged for review.
* (concept_id_1, relationship_id, concept_id_2) combinations.
* Multiple ‘Maps to’ targets for the same source must be clinically or hierarchically justified.
Ensure all target_concept_id values from the source mapping table:
* exist in the concept or concept_delta table
* have invalid_reason IS NULL

* (Procedure → target Measurement) unless justified.
* concept_code format. Ensure they match declared source_code values.
* concept_name entries unless codes differ.
Ensure these fields are always populated:
* concept_id
* concept_name
* domain_id
* vocabulary_id
* concept_code
* valid_start_date
* valid_end_date

* source_code in the source mapping table should exist in concept_delta.
* source_description should match a concept_name.
* source_description_synonym should match a concept_synonym_name.
* source_to_concept_map must contain complete mappings:
  * source_code
  * source_concept_id
  * target_concept_id
* Each concept_synonym_delta entry must:
  * concept_id from concept_delta
  * concept_synonym_name
  * concept_name
* concept_ancestor_delta.
* concept_name between source and derived vocabularies.
* (% source codes mapped)
* .csv or .md QA report to accompany the vocabulary release.
These checks complement automated scripts and should be validated by vocabulary experts and domain specialists, especially for novel environmental or spatial concepts.
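Some of these checks translate directly into SQL. The sketch below shows two of them as queries that can be run through Python's sqlite3 module against simplified copies of the delta tables: non-standard concepts without an outbound ‘Maps to’ or ‘Mapped from’ relationship, and repeated (concept_id_1, relationship_id, concept_id_2) combinations. Treating repeated combinations as an error is an assumption about the checklist's intent, and the function and key names are hypothetical.

```python
import sqlite3

NO_MAPPING_SQL = """
    SELECT c.concept_id FROM concept_delta c
    WHERE c.standard_concept IS NULL
      AND NOT EXISTS (
          SELECT 1 FROM concept_relationship_delta r
          WHERE r.concept_id_1 = c.concept_id
            AND r.relationship_id IN ('Maps to', 'Mapped from'))
    ORDER BY c.concept_id
"""
DUPLICATE_SQL = """
    SELECT concept_id_1, relationship_id, concept_id_2, COUNT(*) AS n
    FROM concept_relationship_delta
    GROUP BY concept_id_1, relationship_id, concept_id_2
    HAVING COUNT(*) > 1
"""


def run_checks(conn):
    """Return offending rows per check; an empty list means the check passed."""
    return {
        "nonstandard_without_mapping": conn.execute(NO_MAPPING_SQL).fetchall(),
        "duplicate_relationships": conn.execute(DUPLICATE_SQL).fetchall(),
    }
```

The same queries should work largely unchanged on PostgreSQL, since they use only standard SQL constructs.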
This section describes a structured approach for validating the usability of the GIS Vocabulary Package in real-world OMOP CDM integration scenarios. The process focuses on semantic coverage, geospatial linkage, and practical implementation using environmental exposure data such as EJI, EPA air quality, or other standardized GIS sources.
5.1.1. Define Validation Objectives
* EXTERNAL_EXPOSURE analytics.
5.1.2. Acquire and Preprocess GIS Data
* GEOID (Census Tract), ZIP, or latitude/longitude.
5.1.3. Link GIS Data to OMOP Locations
* location_id, state, county, zip, lat, lon, location_source_value.
* GEOID → direct match on location_source_value.
* (state, county, zip) → match against parsed GEOID.
* lat/lon → reverse geocode to determine tract or area.
* If location_history is implemented (CDM v6.0+), consider temporal changes.
5.1.4. Map GIS Variables to OMOP Concepts
* exposure_concept_id in the GIS Vocabulary.
* unit_concept_id via OHDSI Athena.
* source_variable → concept_id, unit_concept_id, value_as_concept_id/value_as_number.
5.1.5. Populate the external_exposure Table
Field Name | Description | Data Example |
---|---|---|
exposure_occurrence_id | Unique identifier for each exposure record | 123456 |
location_id | Foreign key linking to the location table, indicating where exposure occurred | 789 |
person_id | Foreign key linking to the person table, identifying the individual exposed | 100234 |
cohort_definition_id | (Optional) Links to a defined cohort in research studies | 25 |
exposure_concept_id | Standard OMOP concept_id representing the type of exposure | 2052498173 (Percentile Rank Of Annual Mean Days Above PM2.5 Regulatory Standard - 3-Year Average) |
exposure_start_date | Date when the exposure event started | 2024-01-15 |
exposure_end_date | Date when the exposure event ended (NULL if ongoing exposure) | NULL (ongoing) |
exposure_type_concept_id | Concept ID defining the origin of the exposure record | 2052499258 (Government Data) |
exposure_relationship_concept_id | Concept ID describing how exposure relates to the person | NULL |
exposure_source_concept_id | Source-specific concept ID before standardization to OMOP | 90000001 |
exposure_source_value | Raw exposure value from source data | "EPL_PM" |
exposure_relationship_source_value | Raw value describing the exposure-person relationship | NULL |
dose_unit_source_value | Source unit before standardization | NULL |
quantity | Number of exposure occurrences (if applicable) | 1 |
modifier_source_value | (Optional) Modifier describing the exposure type or intensity | NULL |
operator_concept_id | Concept ID defining operator logic (e.g., <, >, =) | NULL |
value_as_number | Numerical value of the exposure (e.g., concentration level) | 0.8503 |
value_as_concept_id | Concept ID for categorical exposure values | NULL |
unit_concept_id | Concept ID representing the measurement unit | NULL |
Ensure complete and consistent population of required fields. Non-null exposure values and units are critical for downstream analytics.
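The linkage and population steps can be sketched end to end in a few lines. This is an illustration only, covering the simplest case of a direct GEOID match on location_source_value and a subset of the table's fields. The function names, the dictionary-based row layout, and the GEOID value are hypothetical; the other example values are taken from the table above.

```python
from datetime import date


def link_by_geoid(gis_rows, locations):
    """Attach location_id to GIS rows whose GEOID equals location_source_value."""
    by_geoid = {loc["location_source_value"]: loc["location_id"] for loc in locations}
    return [dict(row, location_id=by_geoid[row["GEOID"]])
            for row in gis_rows if row["GEOID"] in by_geoid]


def to_external_exposure(row, person_id, variable, concept_map, start):
    """Build one external_exposure record for a single GIS variable."""
    return {
        "location_id": row["location_id"],
        "person_id": person_id,
        "exposure_concept_id": concept_map[variable],
        "exposure_start_date": start,
        "exposure_end_date": None,  # NULL while the exposure is ongoing
        "exposure_source_value": variable,
        "value_as_number": row[variable],
    }


locations = [{"location_id": 789, "location_source_value": "01001020100"}]  # made-up GEOID
gis_rows = [{"GEOID": "01001020100", "EPL_PM": 0.8503}]
concept_map = {"EPL_PM": 2052498173}  # source variable -> exposure_concept_id
linked = link_by_geoid(gis_rows, locations)
record = to_external_exposure(linked[0], 100234, "EPL_PM", concept_map, date(2024, 1, 15))
```

GIS rows whose GEOID has no counterpart in the location table are dropped by `link_by_geoid`; in practice you would probably want to log them so that coverage can be reported.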
5.2.1. Coverage
5.2.2. Interoperability
* Does the external_exposure table schema support all relevant metadata?
5.2.3. Practical Usability
* external_exposure using real-world datasets without data loss or transformation ambiguity?

* (source_variable → concept_id)
* (external_exposure)
* (% mapped, unmapped terms)
This usability validation should be iteratively improved and coordinated across GIS WG stakeholders. Contributions welcome!