Function reference

CDM & connection

Core CDM object and connection helpers.

Cdm	OMOP CDM reference: holds a mapping of table names to Ibis table expressions.
cdm_from_con	Create a CDM reference from an Ibis connection.
cdm_from_tables	Create a CDM reference from a dict of (table_name -> Ibis table or DataFrame).
cdm_tables	Return the logical tables currently attached to the CDM object.
collect	Materialize an Ibis expression to a pandas DataFrame.
compute	Materialize an Ibis expression into a table in the CDM’s write schema.
insert_table	Insert a table (Ibis expression or pyarrow/pandas data) into the write schema and return an Ibis table reference.
drop_table	Drop one or more tables from the write schema.

Cohorts

Cohort set generation and cohort-table helpers.

cohort_collapse	Collapse overlapping cohort periods per (cohort_definition_id, subject_id) into contiguous intervals.
generate_cohort_set	Generate a cohort set from a cohort definition set (CIRCE JSON or equivalent).
generate_concept_cohort_set	Generate a cohort set from one or more concept sets (named list of concept IDs).
new_cohort_table	Create an empty cohort table in the CDM’s write schema and register it.
table_refs	Return a mapping from domain_id to the table name and its column names (concept_id, start_date, end_date).
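
The interval merging that cohort_collapse describes can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the helper name collapse_periods and the tuple layout are hypothetical, and it assumes that only overlapping periods are merged (periods that merely abut are left separate), which the description above does not specify.

```python
from datetime import date
from itertools import groupby


def collapse_periods(rows):
    """Merge overlapping periods per (cohort_definition_id, subject_id).

    Hypothetical helper for illustration only.
    rows: iterable of (cohort_definition_id, subject_id, start, end) tuples.
    """
    collapsed = []
    # Tuple ordering sorts by cohort id, then subject id, then start date.
    ordered = sorted(rows)
    for key, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        current_start, current_end = None, None
        for _, _, start, end in group:
            if current_start is None:
                # First period for this (cohort, subject) pair.
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlap: extend the running interval if needed.
                current_end = max(current_end, end)
            else:
                # Gap: emit the finished interval and start a new one.
                collapsed.append((*key, current_start, current_end))
                current_start, current_end = start, end
        collapsed.append((*key, current_start, current_end))
    return collapsed


rows = [
    (1, 10, date(2020, 1, 1), date(2020, 1, 31)),
    (1, 10, date(2020, 1, 15), date(2020, 2, 15)),
    (1, 10, date(2020, 6, 1), date(2020, 6, 30)),
    (1, 11, date(2021, 3, 1), date(2021, 3, 10)),
]
result = collapse_periods(rows)
```

Here the first two periods for subject 10 overlap and become a single interval from 2020-01-01 to 2020-02-15, while the June period and subject 11's period are kept as they are.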

Eunomia

Example datasets and download helpers.

download_eunomia_data	Download Eunomia data from CDMConnector blob storage (or GitHub for Synthea27NjParquet).
eunomia_dir	Return path to a DuckDB file containing the Eunomia dataset.
eunomia_is_available	Return True if the Eunomia dataset ZIP is present in the data folder.
example_datasets	Return the list of available Eunomia example dataset names.
require_eunomia	Ensure the Eunomia dataset is available; download if needed.

Vocabulary

Vocabulary and Hecate-backed search helpers.

search_vocab	Search Hecate concepts and return results as a pandas DataFrame.

Patient profiles

Demographics, observation windows, and patient-level intersections.

add_age	Add age at index_date (and optional age groups).
add_categories	Categorise a numeric (or date) variable into named groups.
add_cdm_name	Add a column with the CDM name.
add_cohort_name	Left join cohort_set to add cohort_name by cohort_definition_id.
add_concept_name	Add concept_name for concept_id column(s). If column is None, all columns ending with _concept_id are used.
add_date_of_birth	Add date of birth from person (optionally imposing a fixed day/month).
add_death_date	Add date of death within window (only within same observation period as index_date).
add_death_days	Add days to death within window.
add_death_flag	Add flag for death within window (1/0).
add_demographics	Add demographic characteristics to a table: age, sex, prior/future observation, optional date_of_birth.
add_future_observation	Add days (or date) of future observation from index_date to end of observation period.
add_in_observation	Add column(s) indicating whether index_date is within observation and (optionally) within window.
add_observation_period_id	Add the observation_period_id (ordinal within person) for the observation period containing index_date.
add_prior_observation	Add days (or date) of prior observation in the current observation period at index_date.
add_sex	Add sex from person (gender_concept_id: 8507 Male, 8532 Female).
add_table_intersect_count	Add count of records in table_name within each window.
add_table_intersect_date	Add date of first/last record in table_name within each window.
add_table_intersect_days	Add days from index_date to first/last record in table_name within each window.
add_table_intersect_field	Add a value column from the first/last record in table_name within each window.
add_table_intersect_flag	Add flag (1/0) for whether the person has a record in table_name within each window.
add_cohort_intersect_count	Add count of cohort entries within each window.
add_cohort_intersect_date	Add first/last cohort date within each window.
add_cohort_intersect_days	Add days to first/last cohort entry within each window.
add_cohort_intersect_field	Add a field value from the first/last cohort record within each window.
add_cohort_intersect_flag	Add flag (1/0) for overlap with cohort(s) in target_cohort_table within each window.
add_concept_intersect_count	Add count of concept occurrences within each window.
add_concept_intersect_date	Add first/last concept date within each window.
add_concept_intersect_days	Add days to first/last concept record within each window.
add_concept_intersect_field	Add a field value from the first/last concept record within each window.
add_concept_intersect_flag	Add flag for presence of concepts within each window.
available_estimates	Return a DataFrame of estimate_name, estimate_description, and estimate_type per variable_type.
end_date_column	Return the end date column name for an OMOP table, or None.
filter_cohort_id	Filter cohort to rows with cohort_definition_id in cohort_id. If cohort_id is None, return cohort unchanged.
filter_in_observation	Keep only rows where index_date falls within an observation period.
mock_patient_profiles	Create a minimal mock CDM for testing PatientProfiles (person, observation_period, cohort-like table).
source_concept_id_column	Return the source concept_id column for an OMOP table, or None.
standard_concept_id_column	Return the standard concept_id column for an OMOP table, or None.
start_date_column	Return the start date column name for an OMOP table (e.g. 'condition_occurrence' -> 'condition_start_date').
summarise_result	Summarise variables into a summarised_result-like structure.
variable_types	Return a DataFrame with variable_name and variable_type (integer, numeric, date, categorical, logical).
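
The "within each window" wording used by the intersect helpers above can be illustrated with a small standalone sketch. It assumes a window is a (lower, upper) pair of day offsets relative to index_date with both bounds inclusive; that convention, the function name intersect_in_window, and the returned keys are assumptions made for illustration, not the package's API.

```python
from datetime import date


def intersect_in_window(index_date, event_dates, window):
    """Summarise events that fall inside a window around index_date.

    Hypothetical helper: window is (lower, upper) in days relative to
    index_date, with both bounds treated as inclusive.
    """
    lower, upper = window
    # Signed day offsets of each event from the index date, earliest first.
    offsets = sorted((d - index_date).days for d in event_dates)
    in_window = [o for o in offsets if lower <= o <= upper]
    return {
        "count": len(in_window),                 # cf. the *_intersect_count helpers
        "flag": 1 if in_window else 0,           # cf. the *_intersect_flag helpers
        "days_first": in_window[0] if in_window else None,  # earliest event in window
    }


index_date = date(2020, 6, 1)
events = [date(2020, 1, 15), date(2020, 5, 20), date(2020, 6, 10)]

prior_year = intersect_in_window(index_date, events, (-365, -1))
next_month = intersect_in_window(index_date, events, (0, 30))
```

With these inputs, two events fall in the year before the index date and one in the 30 days after it. The same event list yields a different count, flag, and days value per window, which is why each window produces its own column(s).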

Cohort characteristics

Summaries, formatted tables, and analytic plots for cohort results.

SummarisedResult	Container for a summarised result (results table + settings table).
bind_summarised_results	Combine multiple SummarisedResult objects.
empty_summarised_result	Create an empty SummarisedResult.
estimate_type_choices	Return valid estimate_type values.
new_summarised_result	Construct a SummarisedResult from a DataFrame and optional settings.
result_columns	Return the standard summarised_result column names.
result_package_version	Analyze package versions used in a SummarisedResult.
transform_to_summarised_result	Convert an arbitrary DataFrame to a SummarisedResult.
summarise_characteristics	Summarise characteristics of cohorts in a cohort table.
summarise_cohort_attrition	Summarise attrition for cohorts.
summarise_cohort_count	Summarise counts for cohorts in a cohort table.
summarise_cohort_overlap	Summarise overlap between cohorts.
summarise_cohort_timing	Summarise timing between cohort entries for individuals in multiple cohorts.
summarise_large_scale_characteristics	Summarise large-scale characteristics for cohorts.
table_characteristics	Format a summarise_characteristics (or summarise_cohort_count) result into a table.
table_cohort_attrition	Format a summarise_cohort_attrition result into a table.
table_cohort_count	Format a summarise_cohort_count result into a table.
table_cohort_overlap	Format a summarise_cohort_overlap result into a table.
table_cohort_timing	Format a summarise_cohort_timing result into a table.
table_large_scale_characteristics	Format a summarise_large_scale_characteristics result into a table.
plot_characteristics	Plot characteristics from a summarise_characteristics result.
plot_cohort_attrition	Plot cohort attrition as a flow diagram.
plot_cohort_count	Plot cohort counts as a bar chart.
plot_cohort_overlap	Plot cohort overlap as a stacked bar chart.
plot_cohort_timing	Plot timing between cohort entries.
plot_compared_large_scale_characteristics	Compare large-scale characteristics across groups as a scatter plot.
plot_large_scale_characteristics	Plot large-scale characteristics as a scatter plot of concept frequencies.

Visualisation

visOmopResults-style table and chart helpers.

default_table_options	Default table formatting options (mirrors visOmopResults defaultTableOptions).
empty_plot	Return an empty Altair chart with a title message.
empty_table	Return an empty formatted table (DataFrame or GT object).
format_estimate_name	Combine estimate_name and estimate_value into new labels.
format_estimate_value	Format the estimate_value column by decimal places and number formatting.
format_header	Pivot result so header columns become column headers; estimate_value becomes cells.
format_min_cell_count	Replace suppressed count placeholders with min_cell_count from settings.
format_table	Format a DataFrame as a table (dataframe, HTML string, or great_tables GT object).
mock_summarised_result	Create a mock summarised_result DataFrame for examples (mirrors visOmopResults mockSummarisedResult).
plot_columns	Column names available for use in plot aesthetics for a summarised result.
plot_type	Supported plot output types.
table_columns	Column names that can be used in table header/group for a summarised result.
table_options	Return default table options for vis_omop_table / vis_table.
table_style	Pre-defined table style names.
table_type	Supported table output types.
tidy_summarised_result	Tidy a summarised result: keep long form; optionally pivot estimate_value by estimate_name.
vis_omop_table	Format a summarised_result DataFrame into a display table (visOmopTable port).
vis_table	Format a table (summarised_result-like DataFrame) into a display table.
bar_plot	Create a bar plot from a summarised_result DataFrame.
box_plot	Create a box plot from pre-computed summary statistics.
customise_text	Style text: replace underscores with spaces and convert to sentence case.
scatter_plot	Create a scatter/line plot from a summarised_result DataFrame.
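
The text styling that customise_text describes (underscores to spaces, then sentence case) can be approximated as follows. The function name to_sentence_case is hypothetical, and the exact casing rules of the real helper may differ; in particular, this sketch lowercases everything after the first character.

```python
def to_sentence_case(text):
    """Replace underscores with spaces and capitalise the first letter.

    Hypothetical stand-in for illustration; str.capitalize() lowercases
    the rest of the string, so acronyms are not preserved.
    """
    # split()/join collapses repeated or leading/trailing separators.
    words = text.replace("_", " ").split()
    return " ".join(words).capitalize()


labels = [to_sentence_case(name) for name in ["cohort_name", "age_group", "number_records"]]
```

A column name such as cohort_name becomes "Cohort name", which is the kind of label that typically appears in table headers and plot axes.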

Results

Result container and export.

Result	Lazy result: holds an Ibis table expression and optional metadata.

Exceptions

Errors raised by the package.

CDMConnectorError	Base exception for all CDMConnector errors.
CDMValidationError	Raised when CDM validation fails (e.g. missing tables, bad schema).
CohortError	Raised for cohort operation failures.
EunomiaError	Raised for Eunomia download/path failures.
SourceError	Raised for source/connection failures.
TableNotFoundError	Raised when a table is not found in the CDM.
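
Because CDMConnectorError is described as the base exception for all package errors, a caller can handle every failure with a single except clause. The sketch below uses locally defined stand-in classes that only mirror the names listed above; the direct-subclass relationship, the lookup_table and safe_lookup helpers, and the error message are assumptions for illustration, not the package's code.

```python
# Stand-in classes mirroring the listed names (assumed hierarchy).
class CDMConnectorError(Exception):
    """Base class for the illustrative hierarchy."""


class TableNotFoundError(CDMConnectorError):
    """Raised here when a requested table name is missing."""


def lookup_table(tables, name):
    # Hypothetical helper: raise the specific error for a missing table.
    if name not in tables:
        raise TableNotFoundError(f"table {name!r} not found")
    return tables[name]


def safe_lookup(tables, name):
    # Catching the base class also catches every subclass.
    try:
        return lookup_table(tables, name)
    except CDMConnectorError as exc:
        return f"{type(exc).__name__}: {exc}"


found = safe_lookup({"person": "person_table"}, "person")
missing = safe_lookup({}, "person")
```

Catching the base class keeps calling code short, while the specific subclasses remain available when a caller needs to treat, say, a missing table differently from a connection failure.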