Tidy covariate data

tidyCovariateData(covariateData, covariates, covariateRef, populationSize,
minFraction = 0.001, normalize = TRUE, removeRedundancy = TRUE)

Arguments

covariateData An object as generated using the getDbCovariateData function. If provided, the covariates, covariateRef, and populationSize arguments will be ignored.

covariates An ffdf object with the covariate values in sparse format. Will be ignored if covariateData is provided.

covariateRef An ffdf object with the covariate definitions. Will be ignored if covariateData is provided. Only needed when removeRedundancy = TRUE.

populationSize An integer specifying the total number of unique cohort entries (rowIds). Will be ignored if covariateData is provided. Only needed when removeRedundancy = TRUE.

minFraction Minimum fraction of the population that should have a non-zero value for a covariate for that covariate to be kept. Set to 0 to disable filtering on frequency.

normalize Normalize the covariates (by dividing by the max)?

removeRedundancy Should redundant covariates be removed?

Details

Depending on the options selected, this function normalizes covariate values by dividing by the max, removes redundant covariates, and removes infrequent covariates. For temporal covariates, redundancy is evaluated per time ID.
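
The frequency filter and the normalization step can be illustrated with a minimal base R sketch. This is not the package's implementation: it uses a plain data.frame instead of an ffdf object, the toy values are invented, and it assumes that the max is taken per covariate and that a covariate is kept when its non-zero fraction is at least minFraction. Redundancy removal is left out because its criteria are not described here.

```r
# Toy sparse covariate table: one row per non-zero (rowId, covariateId) pair.
# Column names and values are illustrative assumptions, not package output.
covariates <- data.frame(
  rowId          = c(1, 2, 3, 4, 1, 2, 1),
  covariateId    = c(101, 101, 101, 101, 102, 102, 103),
  covariateValue = c(2, 4, 8, 6, 1, 1, 5)
)
populationSize <- 4    # total number of unique rowIds in the cohort
minFraction    <- 0.5  # deliberately high so the toy example drops a covariate

# Step 1: drop infrequent covariates.
# Count the rows with a non-zero value per covariate and compare the
# fraction of the population against minFraction (assumed: >= keeps it).
nonZero <- covariates[covariates$covariateValue != 0, ]
counts  <- table(nonZero$covariateId)
keepIds <- as.numeric(names(counts)[counts / populationSize >= minFraction])
tidy    <- covariates[covariates$covariateId %in% keepIds, ]

# Step 2: normalize by dividing each value by the max of its covariate
# (assumed: the max is computed per covariateId).
maxPerCovariate <- tapply(tidy$covariateValue, tidy$covariateId, max)
divisor <- as.numeric(maxPerCovariate[as.character(tidy$covariateId)])
tidy$covariateValue <- tidy$covariateValue / divisor

print(tidy)
```

In this toy data, covariate 103 is non-zero for only one of four rowIds (0.25 < 0.5) and is dropped; the remaining values are scaled so that the largest value of each covariate becomes 1.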