Function reference

CDM & connection

Core CDM object and connection helpers.

Cdm OMOP CDM reference: holds a mapping of table names to Ibis table expressions.
cdm_from_con Create a CDM reference from an Ibis connection.
cdm_from_tables Create a CDM reference from a dict of (table_name -> Ibis table or DataFrame).
cdm_tables Return the logical tables currently attached to the CDM object.
collect Materialize an Ibis expression to a pandas DataFrame.
compute Materialize an Ibis expression into a table in the CDM’s write schema.
insert_table Insert a table (Ibis expression or pyarrow/pandas object) into the write schema and return an Ibis table reference.
drop_table Drop one or more tables from the write schema.

Cohorts

Cohort set generation and cohort-table helpers.

cohort_collapse Collapse overlapping cohort periods per (cohort_definition_id, subject_id) into contiguous intervals.
generate_cohort_set Generate a cohort set from a cohort definition set (CIRCE JSON or equivalent).
generate_concept_cohort_set Generate a cohort set from one or more concept sets (named list of concept IDs).
new_cohort_table Create an empty cohort table in the CDM’s write schema and register it.
table_refs Return a mapping from domain_id to the table name and its column names (concept_id, start_date, end_date).
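
The interval collapsing described for cohort_collapse can be illustrated with a small, library-independent sketch. It is not the package's implementation (which operates on Ibis expressions); the function name collapse_periods and the tuple layout are hypothetical, and it assumes that a period starting on or before the current end date counts as overlapping.

```python
from datetime import date

def collapse_periods(rows):
    """Merge overlapping periods per (cohort_definition_id, subject_id).

    rows: iterable of (cohort_definition_id, subject_id, start, end) tuples.
    Hypothetical helper for illustration only.
    """
    merged = []
    # Sorting groups rows by cohort and subject, then orders them by start date.
    for cid, sid, start, end in sorted(rows):
        last = merged[-1] if merged else None
        if last and last[0] == cid and last[1] == sid and start <= last[3]:
            # Overlaps the current interval for this subject: extend its end date.
            last[3] = max(last[3], end)
        else:
            # Different subject/cohort, or a gap: start a new interval.
            merged.append([cid, sid, start, end])
    return [tuple(r) for r in merged]

periods = [
    (1, 10, date(2020, 1, 1), date(2020, 1, 31)),
    (1, 10, date(2020, 1, 15), date(2020, 2, 15)),  # overlaps the first period
    (1, 10, date(2020, 6, 1), date(2020, 6, 30)),   # separate period
    (1, 11, date(2020, 1, 10), date(2020, 1, 20)),  # different subject
]
collapsed = collapse_periods(periods)
```

Here the two overlapping periods for subject 10 become one interval from 2020-01-01 to 2020-02-15, while the June period and the period for subject 11 are left unchanged.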

Eunomia

Example datasets and download helpers.

download_eunomia_data Download Eunomia data from CDMConnector blob storage (or GitHub for Synthea27NjParquet).
eunomia_dir Return path to a DuckDB file containing the Eunomia dataset.
eunomia_is_available Return True if the Eunomia dataset ZIP is present in the data folder.
example_datasets Return the list of available Eunomia example dataset names.
require_eunomia Ensure the Eunomia dataset is available; download if needed.

Vocabulary

Vocabulary and Hecate-backed search helpers.

search_vocab Search Hecate concepts and return results as a pandas DataFrame.

Patient profiles

Demographics, observation windows, and patient-level intersections.

add_age Add age at index_date (and optional age groups).
add_categories Categorise a numeric (or date) variable into named groups.
add_cdm_name Add a column with the CDM name.
add_cohort_name Left join cohort_set to add cohort_name by cohort_definition_id.
add_concept_name Add concept_name for concept_id column(s). If column is None, all columns ending with _concept_id are used.
add_date_of_birth Add date of birth from person (optionally imposing day and month).
add_death_date Add date of death within window (only within same observation period as index_date).
add_death_days Add days to death within window.
add_death_flag Add flag for death within window (1/0).
add_demographics Add demographic characteristics to a table: age, sex, prior/future observation, optional date_of_birth.
add_future_observation Add days (or date) of future observation from index_date to end of observation period.
add_in_observation Add column(s) indicating whether index_date is within observation and (optionally) within window.
add_observation_period_id Add the observation_period_id (ordinal within person) for the observation period containing index_date.
add_prior_observation Add days (or date) of prior observation in the current observation period at index_date.
add_sex Add sex from person (gender_concept_id: 8507 Male, 8532 Female).
add_table_intersect_count Add count of records in table_name within each window.
add_table_intersect_date Add date of first/last record in table_name within each window.
add_table_intersect_days Add days from index_date to first/last record in table_name within each window.
add_table_intersect_field Add a value column from the first/last record in table_name within each window.
add_table_intersect_flag Add flag (1/0) for whether the person has a record in table_name within each window.
add_cohort_intersect_count Add count of cohort entries within each window.
add_cohort_intersect_date Add first/last cohort date within each window.
add_cohort_intersect_days Add days to first/last cohort entry within each window.
add_cohort_intersect_field Add a field value from the first/last cohort record within each window.
add_cohort_intersect_flag Add flag (1/0) for overlap with cohort(s) in target_cohort_table within each window.
add_concept_intersect_count Add count of concept occurrences within each window.
add_concept_intersect_date Add first/last concept date within each window.
add_concept_intersect_days Add days to first/last concept record within each window.
add_concept_intersect_field Add a field value from the first/last concept record within each window.
add_concept_intersect_flag Add flag for presence of concepts within each window.
available_estimates Return a DataFrame of estimate_name, estimate_description, and estimate_type per variable_type.
end_date_column Return the end date column name for an OMOP table, or None.
filter_cohort_id Filter cohort to rows with cohort_definition_id in cohort_id. If cohort_id is None, return cohort unchanged.
filter_in_observation Keep only rows where index_date falls within an observation period.
mock_patient_profiles Create a minimal mock CDM for testing PatientProfiles (person, observation_period, cohort-like table).
source_concept_id_column Return the source concept_id column for an OMOP table, or None.
standard_concept_id_column Return the standard concept_id column for an OMOP table, or None.
start_date_column Return the start date column name for an OMOP table (e.g. ‘condition_occurrence’ -> ‘condition_start_date’).
summarise_result Summarise variables into a summarised_result-like structure.
variable_types Return a DataFrame with variable_name and variable_type (integer, numeric, date, categorical, logical).
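
Many of the intersect helpers above work "within each window". As a rough illustration of that idea, the sketch below counts and flags events relative to an index date. It is a plain-Python approximation, not the package's code: the function names are hypothetical, and it assumes a window is an inclusive pair of day offsets (lower, upper) relative to index_date, with float("inf") standing for an open end.

```python
from datetime import date

def intersect_count(index_date, event_dates, window):
    """Count events whose day offset from index_date falls inside the window.

    window: (lower, upper) day offsets, both inclusive (assumed convention).
    """
    lower, upper = window
    return sum(lower <= (d - index_date).days <= upper for d in event_dates)

def intersect_flag(index_date, event_dates, window):
    """Return 1 if at least one event falls inside the window, else 0."""
    return int(intersect_count(index_date, event_dates, window) > 0)

index_date = date(2020, 1, 10)
events = [date(2019, 12, 31), date(2020, 1, 10), date(2020, 2, 1)]

on_or_after = intersect_count(index_date, events, (0, float("inf")))
prior_month = intersect_count(index_date, events, (-30, -1))
long_before = intersect_flag(index_date, events, (-365, -100))
```

With these inputs, two events fall on or after the index date, one falls in the 30 days before it, and none fall between 365 and 100 days before it.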

Cohort characteristics

Summaries, formatted tables, and analytic plots for cohort results.

SummarisedResult Container for a summarised result (results table + settings table).
bind_summarised_results Combine multiple SummarisedResult objects.
empty_summarised_result Create an empty SummarisedResult.
estimate_type_choices Return valid estimate_type values.
new_summarised_result Construct a SummarisedResult from a DataFrame and optional settings.
result_columns Return the standard summarised_result column names.
result_package_version Analyse package versions used in a SummarisedResult.
transform_to_summarised_result Convert an arbitrary DataFrame to a SummarisedResult.
summarise_characteristics Summarise characteristics of cohorts in a cohort table.
summarise_cohort_attrition Summarise attrition for cohorts.
summarise_cohort_count Summarise counts for cohorts in a cohort table.
summarise_cohort_overlap Summarise overlap between cohorts.
summarise_cohort_timing Summarise timing between cohort entries for individuals in multiple cohorts.
summarise_large_scale_characteristics Summarise large-scale characteristics for cohorts.
table_characteristics Format a summarise_characteristics (or summarise_cohort_count) result into a table.
table_cohort_attrition Format a summarise_cohort_attrition result into a table.
table_cohort_count Format a summarise_cohort_count result into a table.
table_cohort_overlap Format a summarise_cohort_overlap result into a table.
table_cohort_timing Format a summarise_cohort_timing result into a table.
table_large_scale_characteristics Format a summarise_large_scale_characteristics result into a table.
plot_characteristics Plot characteristics from a summarise_characteristics result.
plot_cohort_attrition Plot cohort attrition as a flow diagram.
plot_cohort_count Plot cohort counts as a bar chart.
plot_cohort_overlap Plot cohort overlap as a stacked bar chart.
plot_cohort_timing Plot timing between cohort entries.
plot_compared_large_scale_characteristics Compare large-scale characteristics across groups as a scatter plot.
plot_large_scale_characteristics Plot large-scale characteristics as a scatter plot of concept frequencies.

Visualisation

visOmopResults-style table and chart helpers.

default_table_options Default table formatting options (mirrors visOmopResults defaultTableOptions).
empty_plot Return an empty Altair chart with a title message.
empty_table Return an empty formatted table (DataFrame or GT object).
format_estimate_name Combine estimate_name and estimate_value into new labels.
format_estimate_value Format the estimate_value column by decimal places and number formatting.
format_header Pivot result so header columns become column headers; estimate_value becomes cells.
format_min_cell_count Replace suppressed count placeholders with min_cell_count from settings.
format_table Format a DataFrame as a table (dataframe, HTML string, or great_tables GT object).
mock_summarised_result Create a mock summarised_result DataFrame for examples (mirrors visOmopResults mockSummarisedResult).
plot_columns Column names available for use in plot aesthetics for a summarised result.
plot_type Supported plot output types.
table_columns Column names that can be used in table header/group for a summarised result.
table_options Return default table options for vis_omop_table / vis_table.
table_style Pre-defined table style names.
table_type Supported table output types.
tidy_summarised_result Tidy a summarised result: keep long form; optionally pivot estimate_value by estimate_name.
vis_omop_table Format a summarised_result DataFrame into a display table (visOmopTable port).
vis_table Format a table (summarised_result-like DataFrame) into a display table.
bar_plot Create a bar plot from a summarised_result DataFrame.
box_plot Create a box plot from pre-computed summary statistics.
customise_text Style text: replace underscores with spaces and convert to sentence case.
scatter_plot Create a scatter/line plot from a summarised_result DataFrame.
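
The text styling described for customise_text (underscores to spaces, then sentence case) can be sketched in a couple of lines. The function to_sentence_label below is a hypothetical stand-in, not the package's function, and it assumes "sentence case" means an upper-case first letter with the rest lower-cased.

```python
def to_sentence_label(text: str) -> str:
    """Replace underscores with spaces and convert to sentence case."""
    # str.capitalize upper-cases the first character and lower-cases the rest.
    return text.replace("_", " ").capitalize()

labels = [to_sentence_label(name) for name in ["cohort_name", "estimate_value"]]
```

This turns column names such as cohort_name into display labels such as "Cohort name".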

Results

Result container and export.

Result Lazy result: holds an Ibis table expression and optional metadata.

Exceptions

Errors raised by the package.

CDMConnectorError Base exception for all CDMConnector errors.
CDMValidationError Raised when CDM validation fails (e.g. missing tables, bad schema).
CohortError Raised for cohort operation failures.
EunomiaError Raised for Eunomia download/path failures.
SourceError Raised for source/connection failures.
TableNotFoundError Raised when a table is not found in the CDM.
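
Because CDMConnectorError is described as the base exception, a caller can catch it to handle any package error in one place. The sketch below uses locally defined stand-in classes with the same names so that it runs without the package; the direct subclassing, the get_table helper, and the error message are assumptions made for illustration.

```python
# Stand-in classes mirroring the names above; the real hierarchy may differ.
class CDMConnectorError(Exception):
    """Base exception (stand-in)."""

class TableNotFoundError(CDMConnectorError):
    """Raised when a table is not found (stand-in)."""

def get_table(tables, name):
    # Hypothetical lookup helper that raises the specific error type.
    if name not in tables:
        raise TableNotFoundError(f"table {name!r} not found")
    return tables[name]

def safe_lookup(tables, name):
    try:
        return get_table(tables, name)
    except CDMConnectorError as err:
        # Catching the base class also catches the more specific subclasses.
        return f"{type(err).__name__}: {err}"

found = safe_lookup({"person": "person_table"}, "person")
missing = safe_lookup({}, "person")
```

Catching the specific subclass first and the base class last is the usual pattern when only some failures need special handling.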