OHDSI Waveform WG
The Waveform Extension has been discussed extensively in CHoRUS Bridge2AI Standards Office Hours. These sessions provide valuable implementation context, live demonstrations, and Q&A with the development team.
Focus: Introduction to waveform and EHR data linkage methodology
Topics Covered:
- Overview of challenges in linking multimodal data
- Introduction to the waveform_registry concept
- Temporal alignment strategies
- Patient privacy considerations
Resources:
- Video Recording
- Transcript
Focus: Continued discussion on waveform-EHR data integration
Topics Covered:
- Deep dive into linkage table schemas
- Handling multi-file waveform sessions
- File format considerations
- Metadata extraction best practices
Resources:
- Video Recording
- Transcript
Focus: Updates on waveform linkage implementation across sites
Topics Covered:
- Implementation progress at CHoRUS sites
- Lessons learned from early implementations
- Common pitfalls and solutions
- Site-specific challenges and adaptations
Resources:
- Video Recording
- Transcript
Focus: Comprehensive session on implementing the 4-table waveform extension
Topics Covered:
- Complete walkthrough of the 4-table schema
- Table population order and rationale
- Field-by-field implementation guidance
- ETL pipeline architecture
- Quality assurance and validation
- Common edge cases and solutions
Resources:
- Video Recording
- Transcript
Highly Recommended: This session provides the most comprehensive overview of the current Waveform Extension specification.
Focus: Continued discussions on waveform standards and implementation
Topics Covered:
- Updates on vocabulary development
- Feature extraction standardization
- Multi-site data quality comparison
- Integration with AI/ML pipelines
Resources:
- Transcript
Focus: Recent discussions on waveform extension updates and implementation challenges
Topics Covered:
- Latest schema refinements
- Community feedback integration
- Real-world implementation examples
- Future roadmap discussion
Resources:
- Transcript
A recurring theme across sessions is the challenge of temporal alignment between waveform data and EHR data:
- Clock synchronization between acquisition devices and EHR systems
- Timezone handling and UTC standardization
- Handling daylight saving time transitions
- Temporal precision requirements (second vs. millisecond)
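To make the timezone and daylight saving points concrete, here is a minimal sketch of UTC standardization using Python's standard `zoneinfo` module. The function names `to_utc` and `needs_dst_review`, and the idea that a site stores naive local timestamps, are assumptions for illustration and not part of the Waveform Extension specification.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local_naive: datetime, site_tz: str, fold: int = 0) -> datetime:
    """Interpret a naive local timestamp in site_tz and convert it to UTC.

    fold only matters when the wall-clock time is repeated at the end of
    daylight saving time: 0 picks the first occurrence, 1 the second.
    """
    aware = local_naive.replace(tzinfo=ZoneInfo(site_tz), fold=fold)
    return aware.astimezone(timezone.utc)


def needs_dst_review(local_naive: datetime, site_tz: str) -> bool:
    """True when the wall-clock time is repeated or skipped by a DST change.

    In both cases the two fold values yield different UTC offsets, so the
    timestamp cannot be converted without an explicit decision.
    """
    tz = ZoneInfo(site_tz)
    first = local_naive.replace(tzinfo=tz, fold=0).utcoffset()
    second = local_naive.replace(tzinfo=tz, fold=1).utcoffset()
    return first != second


# Example: 01:30 on 2023-11-05 occurs twice in New York.
stamp = datetime(2023, 11, 5, 1, 30)
if needs_dst_review(stamp, "America/New_York"):
    print(to_utc(stamp, "America/New_York", fold=0))
    print(to_utc(stamp, "America/New_York", fold=1))
```

Flagging such timestamps rather than silently picking one interpretation keeps the ambiguity visible to whoever reviews the ETL output.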
Multiple sessions discuss practical file management considerations:
- Storage location strategies (filesystem, object storage, database BLOBs)
- Naming conventions for traceability
- Handling large file volumes
- De-identification of file metadata
Signal metadata extraction is critical and discussed extensively:
- Parsing diverse file formats (EDF, WFDB, vendor-specific)
- Handling missing or inconsistent metadata
- Standardizing channel names across devices
- Sampling rate and calibration factor preservation
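Channel-name standardization can be sketched as a lookup against an alias table. The table below is hypothetical and deliberately tiny; a real one would be derived from each site's device inventory and the working group's vocabulary mappings. The function name `normalize_channel` is also an assumption.

```python
import re
from typing import Optional

# Hypothetical alias table mapping device-specific labels to one standard name.
CHANNEL_ALIASES = {
    "II": "ECG Lead II",
    "ECG II": "ECG Lead II",
    "LEAD II": "ECG Lead II",
    "ART": "ABP",
    "ABP": "ABP",
}


def normalize_channel(raw_label: str) -> Optional[str]:
    """Return the standard channel name, or None when the label is unmapped.

    Labels are compared case-insensitively, with runs of whitespace and
    underscores collapsed to a single space.
    """
    key = re.sub(r"[\s_]+", " ", raw_label).strip().upper()
    return CHANNEL_ALIASES.get(key)


for label in ["ecg_ii", "  Lead   II ", "Art", "SpO2"]:
    print(repr(label), "->", normalize_channel(label))
```

Returning `None` for unmapped labels, instead of passing the raw label through, makes gaps in the mapping easy to count during validation.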
Developing and maintaining vocabulary mappings:
- Waveform type concepts (ECG, EEG, ABP, etc.)
- Channel/lead concepts (Lead II, FP1-FP2, etc.)
- Feature algorithm concepts (Bazett’s formula, SDNN, etc.)
- Device concepts (Philips IntelliVue, GE CARESCAPE)
Best practices for extracting features from waveforms:
- Selecting appropriate time windows
- Handling artifacts and low-quality segments
- Standardizing algorithm implementations
- Linking features to MEASUREMENT/OBSERVATION tables
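As an illustration of why algorithm implementations need standardizing, here is a minimal sketch of two features named in the vocabulary discussion: the Bazett-corrected QT interval (QTc = QT / sqrt(RR), with RR in seconds) and SDNN (the standard deviation of NN intervals). The function names and the choice of the sample standard deviation are assumptions; sites that use the population form will get slightly different SDNN values from the same data.

```python
import math
import statistics
from typing import Sequence


def qtc_bazett(qt_ms: float, rr_s: float) -> float:
    """Bazett-corrected QT in milliseconds: QT divided by sqrt(RR in seconds)."""
    if rr_s <= 0:
        raise ValueError("RR interval must be positive")
    return qt_ms / math.sqrt(rr_s)


def sdnn_ms(nn_intervals_ms: Sequence[float]) -> float:
    """SDNN in milliseconds, computed here as the sample standard deviation."""
    if len(nn_intervals_ms) < 2:
        raise ValueError("need at least two NN intervals")
    return statistics.stdev(nn_intervals_ms)


print(qtc_bazett(qt_ms=400, rr_s=1.0))
print(sdnn_ms([800, 810, 790, 800]))
```

Recording which variant was used alongside the feature value is one way to keep multi-site results comparable.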
Several sessions include live demonstrations and code walkthroughs:
A: This is a known edge case. The recommended approach is to:
1. Split the file into visit-specific segments if possible
2. If splitting is not feasible, link to the primary visit where most of the acquisition occurred
3. Use visit_detail_id to capture sub-visit granularity if available
4. Document the decision in your ETL documentation
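Step 2 above, choosing the visit where most of the acquisition occurred, can be sketched as an interval-overlap comparison. The `Visit` tuple and the function names are hypothetical; only `visit_occurrence_id` is a standard OMOP field name.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Optional


class Visit(NamedTuple):
    visit_occurrence_id: int
    start: datetime
    end: datetime


def overlap(a_start: datetime, a_end: datetime,
            b_start: datetime, b_end: datetime) -> timedelta:
    """Length of the intersection of two intervals (zero if they are disjoint)."""
    return max(min(a_end, b_end) - max(a_start, b_start), timedelta(0))


def primary_visit(file_start: datetime, file_end: datetime,
                  visits: Iterable[Visit]) -> Optional[int]:
    """Return the id of the visit covering the largest share of the file.

    Returns None when the file overlaps no visit; ties keep the first
    visit encountered.
    """
    best_id, best_overlap = None, timedelta(0)
    for visit in visits:
        shared = overlap(file_start, file_end, visit.start, visit.end)
        if shared > best_overlap:
            best_id, best_overlap = visit.visit_occurrence_id, shared
    return best_id


visits = [
    Visit(1, datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 12)),
    Visit(2, datetime(2024, 1, 1, 12), datetime(2024, 1, 1, 20)),
]
# The file covers one hour of visit 1 and three hours of visit 2.
print(primary_visit(datetime(2024, 1, 1, 11), datetime(2024, 1, 1, 15), visits))
```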
A: You have several options:
1. Use the manufacturer’s specification if the device model is known
2. Infer from the signal (count samples over a known time period)
3. Flag as “unknown” with a quality indicator
4. Reach out to the vendor for technical specifications
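Option 2 (counting samples over a known time period) combined with option 3 (flagging unknowns) might look like the sketch below. The list of common rates and the 1% tolerance are assumptions chosen for illustration, not values from the specification.

```python
from typing import Optional

# Assumed set of commonly seen sampling rates in Hz; adjust per device fleet.
COMMON_RATES_HZ = (62.5, 125.0, 250.0, 256.0, 500.0, 512.0, 1000.0)


def infer_sampling_rate(n_samples: int, duration_s: float) -> float:
    """Estimate the sampling rate as samples divided by elapsed seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_samples / duration_s


def snap_to_common_rate(estimate_hz: float, tolerance: float = 0.01) -> Optional[float]:
    """Return the nearest common rate if within the relative tolerance.

    Returns None otherwise, so the caller can flag the rate as unknown.
    """
    nearest = min(COMMON_RATES_HZ, key=lambda rate: abs(rate - estimate_hz))
    if abs(nearest - estimate_hz) / nearest <= tolerance:
        return nearest
    return None


estimate = infer_sampling_rate(n_samples=74_820, duration_s=600.0)
print(estimate, snap_to_common_rate(estimate))
```

The estimate is only as good as the known duration, so clock drift or dropped samples will show up as estimates that fail to snap to any common rate.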
A: This depends on your use case and storage constraints:
- Raw waveforms: Required for exploratory analyses, novel feature development, reprocessing
- Features only: Sufficient for many observational studies if features are well-defined
- Hybrid: Store raw for a subset (e.g., recent data, interesting cases) and features for all
The Waveform Extension supports both approaches.
A: Use the waveform_occurrence_start_datetime and waveform_occurrence_end_datetime fields:
- Continuous monitoring: One occurrence spanning the entire monitoring period
- Intermittent: Multiple occurrences with gaps in between
- Use preceding_waveform_occurrence_id to link sequential monitoring sessions
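One way to picture the continuous versus intermittent distinction is a small routine that merges recorded segments separated by short gaps into one occurrence and chains the resulting occurrences together. This is a sketch only: the 5-minute gap threshold, the `waveform_occurrence_id` key, and the function name `build_occurrences` are assumptions, while the start, end, and preceding field names come from the answer above.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def build_occurrences(segments: Iterable[Tuple[datetime, datetime]],
                      max_gap: timedelta = timedelta(minutes=5),
                      first_id: int = 1) -> List[dict]:
    """Group (start, end) segments into occurrence rows.

    Segments whose gap to the previous occurrence is at most max_gap extend
    that occurrence; larger gaps start a new occurrence that points back to
    the previous one through preceding_waveform_occurrence_id.
    """
    rows: List[dict] = []
    for start, end in sorted(segments):
        if rows and start - rows[-1]["waveform_occurrence_end_datetime"] <= max_gap:
            rows[-1]["waveform_occurrence_end_datetime"] = max(
                rows[-1]["waveform_occurrence_end_datetime"], end)
            continue
        rows.append({
            "waveform_occurrence_id": first_id + len(rows),
            "waveform_occurrence_start_datetime": start,
            "waveform_occurrence_end_datetime": end,
            "preceding_waveform_occurrence_id":
                rows[-1]["waveform_occurrence_id"] if rows else None,
        })
    return rows


day = datetime(2024, 1, 1)
segments = [
    (day.replace(hour=8), day.replace(hour=9)),
    (day.replace(hour=9, minute=2), day.replace(hour=10)),  # 2-minute gap: merged
    (day.replace(hour=14), day.replace(hour=15)),           # long gap: new occurrence
]
for row in build_occurrences(segments):
    print(row)
```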
The working group continues to hold regular meetings and office hours. To join:
We welcome presentations and demos from the community:
Contact houghtaling@ohdsi.org to schedule a presentation.