This function connects to the database, generates the SQL scripts, and runs the data quality checks against the database. By default, results are written to a JSON file as well as a database table.
executeDqChecks(
  connectionDetails,
  cdmDatabaseSchema,
  resultsDatabaseSchema,
  vocabDatabaseSchema = cdmDatabaseSchema,
  cdmSourceName,
  numThreads = 1,
  sqlOnly = FALSE,
  sqlOnlyUnionCount = 1,
  sqlOnlyIncrementalInsert = FALSE,
  outputFolder,
  outputFile = "",
  verboseMode = FALSE,
  writeToTable = TRUE,
  writeTableName = "dqdashboard_results",
  writeToCsv = FALSE,
  csvFile = "",
  checkLevels = c("TABLE", "FIELD", "CONCEPT"),
  checkNames = c(),
  checkSeverity = c("fatal", "convention", "characterization"),
  cohortDefinitionId = c(),
  cohortDatabaseSchema = resultsDatabaseSchema,
  cohortTableName = "cohort",
  tablesToExclude = c("CONCEPT", "VOCABULARY", "CONCEPT_ANCESTOR",
    "CONCEPT_RELATIONSHIP", "CONCEPT_CLASS", "CONCEPT_SYNONYM", "RELATIONSHIP", "DOMAIN"),
  cdmVersion = "5.3",
  tableCheckThresholdLoc = "default",
  fieldCheckThresholdLoc = "default",
  conceptCheckThresholdLoc = "default"
)
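In the signature above, only connectionDetails, cdmDatabaseSchema, resultsDatabaseSchema, cdmSourceName, and outputFolder have no default, so a call must supply at least those five. The following is a minimal sketch of such a call, assuming the function is exported by the DataQualityDashboard package and that connection details are built with DatabaseConnector::createConnectionDetails. The wrapper name runDefaultDqChecks, the schema names, the source name, and the connection settings are hypothetical placeholders.

```r
# Hypothetical wrapper (not part of the package): runs every default check,
# supplying only the arguments that have no default in the signature.
runDefaultDqChecks <- function(connectionDetails, outputFolder = "output") {
  DataQualityDashboard::executeDqChecks(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "my_db.cdm",          # placeholder schema name
    resultsDatabaseSchema = "my_db.results",  # placeholder schema name
    cdmSourceName = "My CDM Source",          # placeholder source name
    outputFolder = outputFolder
  )
}

# Usage (needs a reachable database; all connection values are placeholders):
# connectionDetails <- DatabaseConnector::createConnectionDetails(
#   dbms = "postgresql",
#   user = "user",
#   password = "secret",
#   server = "localhost/my_db",
#   port = 5432,
#   pathToDriver = "/path/to/jdbc"
# )
# results <- runDefaultDqChecks(connectionDetails)
```

Every other argument falls back to the defaults shown in the signature, so this run targets CDM version "5.3", writes to the dqdashboard_results table, and executes all check levels and severities.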
connectionDetails: A connectionDetails object for connecting to the CDM database.
cdmDatabaseSchema: The fully qualified database name of the CDM schema.
resultsDatabaseSchema: The fully qualified database name of the results schema.
vocabDatabaseSchema: The fully qualified database name of the vocabulary schema. Defaults to cdmDatabaseSchema.
cdmSourceName: The name of the CDM data source.
numThreads: The number of concurrent threads to use to execute the queries. Default is 1.
sqlOnly: Should the SQL queries be executed (FALSE) or just returned (TRUE)? Default is FALSE.
sqlOnlyUnionCount: (OPTIONAL) In sqlOnlyIncrementalInsert mode, the number of SQL commands to union in each query that inserts check results into the results table (can speed up processing when queries are run in parallel). Default is 1.
sqlOnlyIncrementalInsert: (OPTIONAL) In sqlOnly mode, boolean to determine whether to generate SQL queries that insert check results and associated metadata into the results table. Default is FALSE (for backwards compatibility with versions <= v2.2.0).
outputFolder: The folder to which logs, SQL files, and the JSON results file are written.
outputFile: (OPTIONAL) File to which the results JSON object is written.
verboseMode: Boolean to determine if the console will show all execution steps. Default is FALSE.
writeToTable: Boolean to indicate if the check results will be written to the dqdashboard_results table in the resultsDatabaseSchema. Default is TRUE.
writeTableName: The name of the results table. Defaults to `dqdashboard_results`. Used when sqlOnly or writeToTable is TRUE.
writeToCsv: Boolean to indicate if the check results will be written to a CSV file. Default is FALSE.
csvFile: (OPTIONAL) CSV file to which the results are written.
checkLevels: Choose which DQ check levels to execute. Default is all 3 (TABLE, FIELD, CONCEPT).
checkNames: (OPTIONAL) Choose which check names to execute. Names can be found in inst/csv/OMOP_CDM_v[cdmVersion]_Check_Descriptions.csv. Note that "cdmTable", "cdmField" and "measureValueCompleteness" are always executed.
checkSeverity: Choose which DQ check severity levels to execute. Default is all 3 (fatal, convention, characterization).
cohortDefinitionId: The cohort definition id for the cohort you wish to run the DQD on. The package assumes a standard OHDSI cohort table with the fields cohort_definition_id and subject_id.
cohortDatabaseSchema: The schema where the cohort table is located. Defaults to resultsDatabaseSchema.
cohortTableName: The name of the cohort table. Defaults to `cohort`.
tablesToExclude: (OPTIONAL) Choose which CDM tables to exclude from the execution.
cdmVersion: The CDM version to target for the data source. Options are "5.2", "5.3", or "5.4". By default, "5.3" is used.
tableCheckThresholdLoc: The location of the threshold file for evaluating the table checks. If not specified, the default thresholds will be applied.
fieldCheckThresholdLoc: The location of the threshold file for evaluating the field checks. If not specified, the default thresholds will be applied.
conceptCheckThresholdLoc: The location of the threshold file for evaluating the concept checks. If not specified, the default thresholds will be applied.
Returns a list object of results if sqlOnly = FALSE.
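The sqlOnly, sqlOnlyIncrementalInsert, and sqlOnlyUnionCount arguments work together, and the check selection arguments can narrow a run. The sketch below shows one way they might be combined: it asks for SQL generation only, restricted to fatal checks at the TABLE and FIELD levels. It assumes the function is exported by the DataQualityDashboard package. The wrapper name generateFatalCheckSql, the schema and source names, and the union count of 100 are hypothetical values chosen for illustration, not recommendations.

```r
# Hypothetical wrapper (not part of the package): generates SQL for fatal
# TABLE- and FIELD-level checks instead of executing the checks.
generateFatalCheckSql <- function(connectionDetails, outputFolder = "output") {
  DataQualityDashboard::executeDqChecks(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "my_db.cdm",          # placeholder schema name
    resultsDatabaseSchema = "my_db.results",  # placeholder schema name
    cdmSourceName = "My CDM Source",          # placeholder source name
    outputFolder = outputFolder,              # SQL files are written here
    sqlOnly = TRUE,                           # generate SQL, do not execute
    sqlOnlyIncrementalInsert = TRUE,          # SQL inserts results into the table
    sqlOnlyUnionCount = 100,                  # illustrative value; default is 1
    checkLevels = c("TABLE", "FIELD"),        # skip CONCEPT-level checks
    checkSeverity = c("fatal")                # skip convention/characterization
  )
}

# Usage (connectionDetails is assumed to have been created beforehand):
# generateFatalCheckSql(connectionDetails, outputFolder = "dqd_sql")
```

Because sqlOnly is TRUE here, the call should not be expected to produce the results list described above.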