Estimation.Rmd
Observational healthcare data, including administrative claims and electronic health records, are a rich source of real-world evidence about treatment effects that directly affect patients. Population-level effect estimation uses these data to estimate the average causal effect of an exposure (a medical intervention such as a drug exposure or a procedure) on a specific health outcome of interest. It covers two tasks: direct effect estimation and comparative effect estimation. Direct effect estimation asks how an exposure changes the risk of an outcome compared to no exposure, while comparative effect estimation asks how a target exposure changes that risk compared to a comparator exposure. Both tasks contrast a factual outcome (what happened) with a counterfactual one (what would have happened under a different exposure), and the resulting estimates inform treatment selection, safety surveillance, and comparative effectiveness. Whether a study tests a single hypothesis or many hypotheses at once, the goal is the same: to produce high-quality estimates of causal effects from observational healthcare data.
The CohortMethod R package, a cornerstone of population-level estimation within the OHDSI framework, provides a methodology for conducting comparative effectiveness research and pharmacoepidemiology studies. Some of the features offered by the CohortMethod module for population-level effect estimation are:
Comparative Effectiveness Research: CohortMethod empowers researchers to conduct comparative effectiveness studies by estimating treatment effects while accounting for potential confounding factors and bias inherent in observational data.
Pharmacoepidemiology and Drug Safety Studies: In pharmacoepidemiology research, CohortMethod facilitates the evaluation of drug safety and effectiveness by quantifying the association between drug exposures and clinical outcomes in real-world populations.
The Self-Controlled Case Series (SCCS) method investigates the relationship between exposures and outcomes within individual patients over time. An SCCS design compares the rate of outcomes during times of exposure to the rate during periods of non-exposure, including before, between, and after exposure episodes. By fitting a Poisson regression that is conditioned on the individual, the SCCS design addresses the question: “Given that a patient has the outcome, is the outcome more likely to occur during exposed time compared to non-exposed time?” The following design choices define an SCCS question:
Target Cohort: This represents the treatment under investigation.
Outcome Cohort: This cohort signifies the outcome of interest.
Time-at-Risk: Identifies the specific times when the risk of the outcome is considered, often relative to the start and end dates of the target cohort.
Model: Defines the statistical model used to estimate the effect, including adjustments for time-varying confounders if necessary.
One of the SCCS design’s strengths is its robustness to confounding by factors that differ between individuals, as each participant serves as their own control. However, it remains sensitive to time-varying confounding factors. To mitigate this, adjustments can be made for factors such as age, seasonality, and calendar time, enhancing the model’s accuracy.
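The conditional Poisson idea described above can be illustrated with a minimal sketch. It assumes the simplest possible setting: one exposure, no adjustment for age, seasonality, or calendar time, and a constant rate within exposed and within unexposed time. Under those assumptions, conditioning on each case's total number of events turns the Poisson likelihood into a binomial one, and the log incidence rate ratio can be found by solving the score equation. The function name `sccs_log_irr` and the tuple layout are hypothetical and are not part of the SelfControlledCaseSeries package, which fits a more general model.

```python
import math


def sccs_log_irr(cases, lo=-10.0, hi=10.0, tol=1e-10):
    """Estimate the log incidence rate ratio (exposed vs. unexposed time).

    cases: list of (events_exposed, events_unexposed, days_exposed, days_unexposed),
           one tuple per person who had at least one outcome (hypothetical layout).
    """
    total_exposed = sum(c[0] for c in cases)
    total_unexposed = sum(c[1] for c in cases)
    if total_exposed == 0 or total_unexposed == 0:
        raise ValueError("no finite estimate: events occur only in one type of time")

    def score(beta):
        # Derivative of the conditional log-likelihood with respect to beta.
        # Given n events for a person, each falls in exposed time with
        # probability p = exp(beta) * t_exp / (exp(beta) * t_exp + t_unexp).
        total = 0.0
        for x_exp, x_unexp, t_exp, t_unexp in cases:
            n = x_exp + x_unexp
            p = math.exp(beta) * t_exp / (math.exp(beta) * t_exp + t_unexp)
            total += x_exp - n * p
        return total

    # The score decreases as beta increases, so bisection finds its root.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Three cases, each observed for 30 exposed and 300 unexposed days.
cases = [(1, 0, 30, 300), (0, 1, 30, 300), (1, 1, 30, 300)]
print(math.exp(sccs_log_irr(cases)))  # incidence rate ratio, approximately 10
```

Only people who experienced the outcome enter the calculation, and each person's baseline rate cancels out of the conditional likelihood. That cancellation is why factors that stay constant within a person cannot confound the estimate.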
An advanced variant of the SCCS also considers all other drug exposures recorded in the database, which adds a large number of variables to the model. In this approach the coefficients of all exposures except the one of interest are L1-regularized, with the regularization hyperparameter selected by cross-validation.
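One way to write the penalized objective for this variant is sketched below. The notation is an assumption made for illustration and is not taken from the package: $\beta_1$ is the coefficient of the exposure of interest, $\beta_2, \dots, \beta_p$ are the coefficients of the other exposures, $\ell(\beta)$ is the conditional Poisson log-likelihood, and $\lambda \ge 0$ is the regularization hyperparameter chosen by cross-validation.

$$
\hat{\beta} = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \sum_{j=2}^{p} \lvert \beta_j \rvert \right\}
$$

Because the sum starts at $j = 2$, the estimate for the exposure of interest is not shrunk toward zero, while the penalty can set many of the other coefficients exactly to zero.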
An important assumption of the SCCS is that the observation period’s end is independent of the outcome date. This may not hold true for outcomes that can be fatal, such as stroke. To address this, extensions to the SCCS model have been developed that correct for any dependency between the observation period end and the outcome.
The SelfControlledCaseSeries R package allows the user to perform SCCS analyses in an observational database in the OMOP Common Data Model. Some of the features offered by the SCCS module include:
The SCCS method is particularly applicable in several key areas of epidemiological research and pharmacovigilance:
Drug Safety Surveillance: The SCCS method is widely used in drug safety surveillance to identify adverse effects of medications post-marketing. It is well-suited to detect short-term risks associated with drug exposures, especially where the onset of the adverse event is expected to be temporally close to the exposure.
Vaccine Safety Evaluation: The SCCS design is ideal for assessing the safety of vaccines, especially in evaluating the risk of adverse events following immunization. Its self-controlled nature helps to address concerns about confounding by indication and other biases that can affect observational studies in vaccine safety.
Comparative Effectiveness Research: While primarily designed for evaluating the safety of medical interventions, the SCCS method can also be adapted to compare the effectiveness of different treatments or interventions within the same individual over time, particularly for acute conditions.
Epidemiological Research: More broadly, the SCCS method is used in epidemiological research to study the temporal relationships between exposures and outcomes, offering insights into the causality and mechanisms underlying health conditions and diseases.
Meta-analysis plays a pivotal role in healthcare research by enabling the synthesis of findings from multiple studies to draw more generalizable conclusions. In the context of distributed health data networks, where data are spread across various sites with diverse populations and practices, synthesizing evidence becomes both a challenge and a necessity. The EvidenceSynthesis R package addresses these challenges head-on. It offers a suite of tools designed for combining causal effect estimates and study diagnostics from multiple data sites, all while adhering to stringent patient privacy requirements and navigating the complexities inherent to observational data. This approach enhances the robustness of meta-analytical conclusions and extends the utility of distributed health data for research purposes.
The Meta module, which uses the EvidenceSynthesis R package, makes use of the following features to summarize the results of a study:
The syntheses are generated for both CohortMethod and Self-Controlled Case Series estimation results from the study. They provide the diagnostic results within each database as well as visualized and tabular results of the meta-analysis.
The EvidenceSynthesis package synthesizes evidence from observational studies across multiple healthcare databases. It is particularly useful for:
Comparative Effectiveness Research: Synthesizing evidence from disparate sources allows for stronger, more reliable comparisons of treatment outcomes, enriching the foundation for clinical decision-making.
Safety Surveillance: Aggregated safety data across databases enhance the detection and understanding of adverse drug reactions, contributing to safer patient care.
Policy and Clinical Guidelines Development: Meta-analytical findings informed by comprehensive, real-world data can guide policy formulation and the updating of clinical guidelines, ensuring they are grounded in broad-based evidence.
Addressing Challenges of Small Sample Sizes: The EvidenceSynthesis package notably advances the field by tackling the issue of small sample sizes and zero event counts, which traditional meta-analytical methods often handle poorly. Its innovative use of non-normal likelihood approximations enables more precise effect size estimation under such conditions, ensuring that the insights derived from meta-analyses are both accurate and meaningful. This attribute is particularly beneficial in distributed health data networks, where individual site/database data may be limited but collectively hold significant informational value.
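To see why zero event counts are a problem for traditional methods, consider a minimal sketch of inverse-variance fixed-effect pooling, which approximates each site's likelihood with a normal distribution around its log rate ratio. This is the traditional approach that the text contrasts with, not the method implemented in EvidenceSynthesis. The function names `log_rate_ratio` and `pool_fixed_effect` and the example counts are hypothetical.

```python
import math


def log_rate_ratio(events_t, time_t, events_c, time_c):
    """Return (log rate ratio, standard error) under the normal approximation."""
    if events_t == 0 or events_c == 0:
        # log(0) and 1/0 are undefined, so the site has no usable estimate.
        raise ValueError("zero event count: estimate or standard error is not finite")
    estimate = math.log((events_t / time_t) / (events_c / time_c))
    se = math.sqrt(1 / events_t + 1 / events_c)
    return estimate, se


def pool_fixed_effect(estimates):
    """Inverse-variance weighted pooling of (log_estimate, se) pairs."""
    weights = [1 / se**2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Two sites with enough events: (target events, target time, comparator events, comparator time)
sites = [(20, 1000.0, 10, 1000.0), (8, 500.0, 4, 500.0)]
pooled, pooled_se = pool_fixed_effect([log_rate_ratio(*s) for s in sites])
print(math.exp(pooled), pooled_se)

# A small site with zero target events cannot contribute under this approach.
try:
    log_rate_ratio(0, 200.0, 3, 200.0)
except ValueError as err:
    print("site dropped:", err)
```

In this sketch the small site has to be dropped or patched with a continuity correction, even though observing zero events in 200 units of person-time is informative. Approximating each site's likelihood with a non-normal function, as the text describes, lets such sites contribute to the pooled estimate.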
In the Estimation tab of the OHDSI Analysis Viewer, the user can find CohortMethod, SCCS, and/or meta-analysis results (depending on what was included in the analysis specifications).
First, the user selects a target and an outcome of interest. A table then appears below the input selection. It has two parent tabs, “Diagnostics” and “Results”, each with two child tabs, “CohortMethod” and “SCCS”.
In the Diagnostics tab, the user can find information on whether diagnostics passed for each database, analysis, target, and comparator combination included in their study. These results are shown for CohortMethod or SCCS, depending on which child tab the user selects.
In the Results tab, the user can find more detailed results for each database, analysis, target, and comparator combination included in their study. Underneath the parent Results tab there is both a tabular and a graphical summary of the results for CohortMethod and for SCCS. The user may click on the “View results” link in the “Actions” column to view more detailed results for an individual database, analysis, target, and comparator combination. More information on what kinds of results are shown can be found in the 1. CohortMethod and 2. Self-Controlled Case Series sections above. The graphical summary renders a forest plot of the meta-analysis results across the databases for the given selection.