# Gaia Data Models
This is the specification document for the gaiaDB Data Model.
This data model is still under development and likely to
change. Each table is presented with a high-level description
and the ETL conventions to follow. Next comes a
discussion of each field in the table, any conventions related to the
field, and the constraints that apply (such as primary key and
foreign key). If you have questions, please visit
the forums or the GitHub issue page.

Table Description
This table contains records that catalog external (or local) web-hosted entities. All source data in gaiaDB must be referenced in this table.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
data_source_uuid | | | int4 | Yes | Yes | No | | |
org_id | | | varchar(100) | Yes | No | No | | |
org_set_id | | | varchar(100) | Yes | No | No | | |
dataset_name | | | varchar(100) | Yes | No | No | | |
dataset_version | | | varchar(100) | Yes | No | No | | |
geom_type | | | varchar(100) | No | No | No | | |
geom_spec | | | text | No | No | No | | |
boundary_type | | | varchar(100) | No | No | No | | |
has_attributes | | | int4 | No | No | No | | |
download_method | | | varchar(100) | Yes | No | No | | |
download_subtype | | | varchar(100) | Yes | No | No | | |
download_data_standard | | | varchar(100) | Yes | No | No | | |
download_filename | | | varchar(100) | Yes | No | No | | |
download_url | | | varchar(100) | Yes | No | No | | |
download_auth | | | varchar(100) | No | No | No | | |
documentation_url | | | varchar(100) | No | No | No | | |

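
The Required column in the table above can be checked mechanically before a record is loaded. The following is a minimal sketch and not part of gaiaDB itself: the helper name `missing_required` and every value in the sample record are hypothetical; only the field names and their Required flags come from the table.

```python
# Fields marked "Yes" in the Required column of the table above.
REQUIRED_FIELDS = (
    "data_source_uuid",
    "org_id",
    "org_set_id",
    "dataset_name",
    "dataset_version",
    "download_method",
    "download_subtype",
    "download_data_standard",
    "download_filename",
    "download_url",
)


def missing_required(record: dict) -> list:
    """Return the required fields that are absent, None, or empty in record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Hypothetical record: all values are made up for illustration.
# dataset_version is empty and download_url is omitted on purpose.
example = {
    "data_source_uuid": 1,
    "org_id": "example_org",
    "org_set_id": "example_set",
    "dataset_name": "example_dataset",
    "dataset_version": "",
    "download_method": "file",
    "download_subtype": "zip",
    "download_data_standard": "shapefile",
    "download_filename": "example.zip",
}

missing = missing_required(example)
print(missing)  # ['dataset_version', 'download_url']
```

A loader could reject or log any record for which this list is non-empty before inserting it.
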
Table Description
This table contains records that describe the distinct variables in a data source, enabling downstream data integrations. All variables from attribute source data must be catalogued in this table.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
variable_source_id | | | serial4 | Yes | Yes | No | | |
geom_dependency_uuid | | | int4 | No | No | Yes | data_source | |
variable_name | | | varchar | Yes | No | No | | |
variable_desc | | | text | Yes | No | No | | |
data_source_uuid | | | int4 | Yes | No | Yes | data_source | |
attr_spec | | | text | Yes | No | No | | |

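
Both foreign keys in the table above point at the `data_source` table. The sketch below shows how such a constraint behaves, under several assumptions: SQLite (bundled with Python) stands in for the actual database, `data_source` is reduced to two columns, the table name `variable_source` is inferred from the FK Table entries elsewhere in this document, and all inserted values (including the `attr_spec` placeholder) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Simplified stand-ins for the two tables; types are SQLite approximations.
conn.executescript(
    """
    CREATE TABLE data_source (
        data_source_uuid INTEGER PRIMARY KEY,
        dataset_name     TEXT NOT NULL
    );
    CREATE TABLE variable_source (
        variable_source_id   INTEGER PRIMARY KEY,
        geom_dependency_uuid INTEGER REFERENCES data_source (data_source_uuid),
        variable_name        TEXT NOT NULL,
        variable_desc        TEXT NOT NULL,
        data_source_uuid     INTEGER NOT NULL
                             REFERENCES data_source (data_source_uuid),
        attr_spec            TEXT NOT NULL
    );
    """
)

conn.execute(
    "INSERT INTO data_source (data_source_uuid, dataset_name) "
    "VALUES (1, 'example_dataset')"
)

INSERT_VARIABLE = (
    "INSERT INTO variable_source "
    "(variable_name, variable_desc, data_source_uuid, attr_spec) "
    "VALUES (?, ?, ?, ?)"
)

# This row references an existing data_source record, so it is accepted.
conn.execute(INSERT_VARIABLE, ("example_var", "Hypothetical variable", 1, "{}"))

# This row references data_source_uuid 999, which does not exist.
try:
    conn.execute(INSERT_VARIABLE, ("orphan_var", "No such data source", 999, "{}"))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM variable_source").fetchone()[0]
print(fk_rejected, row_count)  # True 1
```

The nullable `geom_dependency_uuid` column mirrors its "No" in the Required column: a variable may be catalogued without a geometry dependency, but never without a `data_source_uuid`.
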
Table Description
A programmatically derived index table of all the attribute source datasets included in the data_source table.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
attr_index_id | | | numeric | Yes | Yes | No | | |
variable_source_id | | | numeric | Yes | No | Yes | variable_source | |
attr_of_geom_index_id | | | numeric | Yes | No | Yes | geom_index | |
database_schema | | | varchar(255) | Yes | No | No | | |
table_name | | | varchar(255) | Yes | No | No | | |
data_source_id | | | numeric | Yes | No | Yes | data_source | |

Table Description
A programmatically derived index table of all the geometry source datasets included in the data_source table.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
geom_index_id | | | numeric | Yes | Yes | No | | |
data_type_id | | | numeric | No | No | No | | |
data_type_name | | | varchar(255) | Yes | No | No | | |
geom_type_concept_id | | | numeric | No | No | Yes | concept | |
geom_type_source_value | | | varchar(255) | No | No | No | | |
database_schema | | | varchar(255) | Yes | No | No | | |
table_name | | | varchar(255) | Yes | No | No | | |
table_desc | | | varchar(255) | Yes | No | No | | |
data_source_id | | | numeric | Yes | No | Yes | data_source | |

Table Description
This table is a template for the standardized attribute tables that get created.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
attr_record_id | | | serial4 | Yes | Yes | No | | |
geom_record_id | | | int4 | Yes | No | Yes | geom_template | |
variable_source_record_id | | | int4 | Yes | No | Yes | variable_source | |
attr_concept_id | | | int4 | No | No | Yes | concept | |
attr_start_date | | | date | Yes | No | No | | |
attr_end_date | | | date | Yes | No | No | | |
value_as_number | | | float8 | No | No | No | | |
value_as_string | | | varchar | No | No | No | | |
value_as_concept_id | | | int4 | No | No | Yes | concept | |
unit_concept_id | | | int4 | No | No | Yes | concept | |
unit_source_value | | | varchar | No | No | No | | |
qualifier_concept_id | | | int4 | No | No | Yes | concept | |
qualifier_source_value | | | varchar | No | No | No | | |
attr_source_concept_id | | | int4 | No | No | Yes | concept | |
attr_source_value | | | varchar | Yes | No | No | | |
value_source_value | | | varchar | Yes | No | No | | |

Table Description
This table is a template for the standardized geometry tables that get created.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
geom_record_id | | | serial4 | Yes | Yes | No | | |
geom_name | | | varchar | Yes | No | No | | |
geom_source_coding | | | varchar | Yes | No | No | | |
geom_source_value | | | varchar | Yes | No | No | | |
geom_wgs84 | | | geometry | No | No | No | | |
geom_local_epsg | | | int4 | Yes | No | No | | |
geom_local_value | | | geometry | Yes | No | No | | |

Table Description
This table contains the identifier and text address from OMOP Location table records, along with their associated geocoded point geometry.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
location_id | | | integer | Yes | Yes | No | | |
address | | | varchar(255) | Yes | No | No | | |
geometry | | | geometry | Yes | No | No | | |

Table Description
This table is a copy of the OMOP Location_History table.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
location_id | | | integer | Yes | No | Yes | location | |
relationship_type_concept_id | | | integer | Yes | No | Yes | concept | |
domain_id | | | integer | Yes | No | No | | |
entity_id | | | integer | Yes | No | No | | |
start_date | | | date | Yes | No | No | | |
end_date | | | date | No | No | No | | |

Table Description
This CDM Extension table is used to represent person-related transformations of data from Gaia and to interface with ATLAS and the OHDSI tool stack from an OMOP CDM database.
Gaia Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table | FK Domain |
---|---|---|---|---|---|---|---|---|
exposure_occurrence_id | The unique key given to a social or environmental exposure for a Person. | Each derived instance of an exposure should be assigned this unique key. | integer | Yes | Yes | No | | |
location_id | The LOCATION_ID of the Person with whom the exposure is associated. | | integer | Yes | No | Yes | location | |
person_id | The PERSON_ID of the Person with whom the exposure is associated. | | integer | Yes | No | Yes | person | |
exposure_concept_id | The EXPOSURE_CONCEPT_ID field is recommended for primary use in analyses, and must be used for network studies. This is the standard concept mapped from the source value which represents an exposure. | The CONCEPT_ID to which the source exposure is mapped. This mapping should be integrated into the variable_source record and automatically populated in this record. | integer | Yes | No | Yes | concept | |
exposure_start_date | Use this date to determine the start date of the exposure. | The date range of the exposure should represent the temporal overlap between the place-based exposure data point and the LOCATION_ID’s location_history record. | date | Yes | No | No | | |
exposure_start_datetime | | | datetime | No | No | No | | |
exposure_end_date | Use this date to determine the end date of the exposure. | The date range of the exposure should represent the temporal overlap between the place-based exposure data point and the LOCATION_ID’s location_history record. | date | Yes | No | No | | |
exposure_end_datetime | | | datetime | No | No | No | | |
exposure_type_concept_id | This field identifies the origin of the exposure record (e.g. Census, EHR, Environmental data, Geospatial data, Satellite imagery, GIS mapping, Sensor network, Mobile device geolocation, LiDAR). | The CONCEPT_ID to which the exposure’s data source type is mapped. This mapping should be integrated into the data_source record and automatically populated in this record. | integer | Yes | No | Yes | concept | Type Concept |
exposure_relationship_concept_id | This field can be used to determine the spatiotemporal relationship between the source Exposure and the Person. | The CONCEPT_ID to which the relationship between the Exposure and the Person is mapped. This mapping should be automatically populated in this record. | integer | Yes | No | Yes | concept | |
exposure_source_concept_id | This field can be used to determine the original source of place-based exposure data. | The CONCEPT_ID to which the exposure’s data source is mapped. This mapping should be integrated into the data_source record and automatically populated in this record. | integer | No | No | Yes | concept | |
exposure_source_value | This field houses the verbatim name of the original source of place-based exposure data. | This name is mapped to a Standard Exposure Source Concept and the original name is stored here for reference. | varchar(50) | No | No | No | | |
exposure_relationship_source_value | | | varchar(50) | No | No | No | | |
dose_unit_source_value | | | varchar(50) | No | No | No | | |
quantity | | | integer | No | No | No | | |
modifier_source_value | | | varchar(50) | No | No | No | | |
operator_concept_id | The meaning of Concept 4172703 for ‘=’ is identical to omission of an OPERATOR_CONCEPT_ID value. Since the use of this field is rare, it’s important when devising analyses not to forget to test the content of this field for values different from =. | | integer | No | No | Yes | concept | |
value_as_number | This is the numerical value of the Exposure, if available. | This value should be integrated into the variable_source record and automatically populated in this record. | float | No | No | No | | |
value_as_concept_id | If the raw data gives a categorical result for exposures, those values are captured and mapped to standard concepts in the ‘Exposure Value’ domain. | This mapping should be integrated into the variable_source record and automatically populated in this record. | integer | No | No | Yes | concept | |
unit_concept_id | UNIT_SOURCE_VALUES should be mapped to a Standard Concept in the Unit domain that best represents the unit as given in the source data. | This mapping should be integrated into the variable_source record and automatically populated in this record. | integer | No | No | Yes | concept | Unit |
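
The ETL convention for `exposure_start_date` and `exposure_end_date` asks for the temporal overlap between the place-based data point and the location_history record. One way to compute that overlap is sketched below. This is an illustration under stated assumptions, not the gaiaDB implementation: the function name `exposure_date_range` is hypothetical, both ranges are treated as inclusive, and a missing location_history `end_date` (the field is not required) is interpreted as "still at this location".

```python
from datetime import date
from typing import Optional, Tuple


def exposure_date_range(
    attr_start: date,
    attr_end: date,
    loc_start: date,
    loc_end: Optional[date],
) -> Optional[Tuple[date, date]]:
    """Return the inclusive overlap of the two date ranges, or None if disjoint.

    attr_start / attr_end: date range of the place-based data point.
    loc_start / loc_end:   date range of the location_history record.
    """
    # Assumption: an open-ended location history runs at least as long as
    # the data point, so the data point's own end date bounds the overlap.
    effective_loc_end = attr_end if loc_end is None else loc_end

    start = max(attr_start, loc_start)
    end = min(attr_end, effective_loc_end)
    return (start, end) if start <= end else None


# Hypothetical dates: a data point covering calendar year 2020 and a person
# who moved to the location in mid-June 2020 and has not moved since.
overlap = exposure_date_range(
    date(2020, 1, 1), date(2020, 12, 31), date(2020, 6, 15), None
)
print(overlap)  # (datetime.date(2020, 6, 15), datetime.date(2020, 12, 31))

# A person who arrived after the data point ended has no overlap.
no_overlap = exposure_date_range(
    date(2020, 1, 1), date(2020, 12, 31), date(2021, 3, 1), None
)
print(no_overlap)  # None
```

When the function returns `None`, no exposure record would be derived for that pairing, since both exposure dates are required fields.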