Generate a cohort set from one or more named concept sets (each a list of concept IDs or concept specs).
Each concept set becomes one cohort, and each row represents a period during which one of its concepts was observed for a subject. Concepts are looked up in the CDM vocabulary and in the matching domain tables (condition_occurrence, drug_exposure, etc.). Concepts that are not in the vocabulary, or whose domain table is missing from the CDM, are silently skipped. If a domain has no end date (e.g. procedure, observation), the start date is used as the end date.
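The end-date fallback described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the tuple layout and the helper name fill_end_date are hypothetical, and the real function operates on CDM tables rather than in-memory lists.

```python
from datetime import date

# Hypothetical event rows: (subject_id, start_date, end_date or None).
# A row with end_date None stands in for a domain without an end date
# (e.g. procedure, observation).
events = [
    (1, date(2020, 3, 1), date(2020, 3, 10)),
    (1, date(2020, 1, 5), None),
    (2, date(2021, 6, 1), None),
]


def fill_end_date(rows):
    """Use the start date as the end date when a row has no end date."""
    return [
        (subject_id, start, end if end is not None else start)
        for subject_id, start, end in rows
    ]


filled = fill_end_date(events)
for row in filled:
    print(row)
```

Rows that already carry an end date pass through unchanged; only the missing ones collapse to a single-day interval.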
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cdm | | Cdm reference (from cdm_from_con with observation_period and concept table). | required |
| concept_set | dict[str, list[int] \| list[dict]] | Named concept sets: name -> list of concept_id (int) or list of concept specs (dict). Each name becomes one cohort. Concept specs are dicts with: "concept_id" (int, required); "include_descendants" (bool, optional): if True, expand via concept_ancestor (requires concept_ancestor table), default False; "is_excluded" (bool, optional): if True, exclude this concept from the set, default False. Simple form: {"cohort_a": [192671, 123]} uses no descendants and no exclusions. | required |
| name | str | Name of the cohort table (lowercase, letters/numbers/underscores). Default "cohort". | 'cohort' |
| limit | str | "first" (default) or "all": include only first occurrence per subject per cohort, or all. | 'first' |
| required_observation | tuple[int, int] | (prior_days, future_days) required observation around the event. Default (0, 0). | (0, 0) |
| end | str or int | How to set cohort_end_date: "observation_period_end_date" (default), "event_end_date", or a fixed number of days from cohort_start_date. | 'observation_period_end_date' |
| subset_cohort | str | If set, only persons in this cohort table are included. | None |
| subset_cohort_id | int or list[int] | If set with subset_cohort, only these cohort_definition_id(s) from the subset cohort. | None |
| overwrite | bool | If True, overwrite existing cohort tables. Default True. | True |
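The two accepted shapes of concept_set (plain concept IDs, or concept spec dicts) can be illustrated with a small normalizer that fills in the documented defaults. This is a sketch only: normalize_concept_set is a hypothetical helper, not part of the library, and the second cohort reuses the example concept IDs purely for illustration.

```python
def normalize_concept_set(concept_set):
    """Expand the simple int form into concept specs with explicit defaults."""
    normalized = {}
    for cohort_name, concepts in concept_set.items():
        specs = []
        for concept in concepts:
            # Simple form: a bare concept_id becomes a spec with defaults.
            if isinstance(concept, int):
                concept = {"concept_id": concept}
            specs.append(
                {
                    "concept_id": concept["concept_id"],
                    "include_descendants": concept.get("include_descendants", False),
                    "is_excluded": concept.get("is_excluded", False),
                }
            )
        normalized[cohort_name] = specs
    return normalized


concept_set = {
    # Simple form: list of concept IDs.
    "cohort_a": [192671, 123],
    # Spec form: one concept expanded to descendants, one excluded.
    "cohort_b": [
        {"concept_id": 192671, "include_descendants": True},
        {"concept_id": 123, "is_excluded": True},
    ],
}

normalized = normalize_concept_set(concept_set)
for cohort_name, specs in normalized.items():
    print(cohort_name, specs)
```

Each key of the dict corresponds to one cohort, so this example would describe two cohorts.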
Returns
| Name | Type | Description |
| --- | --- | --- |
| | Cdm | CDM with the new cohort table and cohort_set / cohort_attrition populated. |
Raises
| Name | Type | Description |
| --- | --- | --- |
| | CohortError | If CDM has no database source, name is invalid, or required tables are missing. |
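To avoid an invalid-name error, a caller could pre-check the cohort table name against the stated rule (lowercase letters, numbers, underscores). The sketch below is an assumption about that rule, not the library's own check: the exact pattern (for example, whether a leading digit is accepted) is not documented here, and is_valid_cohort_table_name is a hypothetical helper.

```python
import re

# Assumed pattern: one or more lowercase letters, digits, or underscores.
# The library's actual validation may be stricter.
_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_cohort_table_name(name):
    """Return True if name matches the assumed cohort table naming rule."""
    return isinstance(name, str) and _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["cohort", "gi_bleed_2024", "Cohort", "my cohort", ""]:
    print(repr(candidate), is_valid_cohort_table_name(candidate))
```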