Creates a learning curve in parallel, which can be plotted using the
plotLearningCurve() function. Currently this functionality is only
supported by Lasso Logistic Regression.
createLearningCurvePar(
  population,
  plpData,
  modelSettings,
  testSplit = "stratified",
  testFraction = 0.25,
  trainFractions = c(0.25, 0.5, 0.75),
  trainEvents = NULL,
  splitSeed = NULL,
  nfold = 3,
  indexes = NULL,
  verbosity = "TRACE",
  minCovariateFraction = 0.001,
  normalizeData = T,
  saveDirectory = getwd(),
  savePlpData = F,
  savePlpResult = F,
  savePlpPlots = F,
  saveEvaluation = F,
  timeStamp = FALSE,
  analysisId = "lc-",
  cores = NULL
)
Argument | Description |
---|---|
population | The population created using |
plpData | An object of type |
modelSettings | An object of class |
testSplit | Specifies the type of evaluation used. Can be either |
testFraction | The fraction of the data that will be used as the testing set in the patient split evaluation. |
trainFractions | A list of training fractions to create models for. Note, providing |
trainEvents | Events have been shown to be a determinant of model performance. Therefore, it is recommended to provide |
splitSeed | The seed used to split the testing and training set when using a 'person' type split |
nfold | The number of folds used in the cross validation (default = |
indexes | A data frame containing a rowId and index column, where an index value of -1 means the row is in the test set and a positive integer represents the cross validation fold (default is |
verbosity | Sets the level of the verbosity. If the log level is at or higher in priority than the logger threshold, a message will print. The levels are: |
minCovariateFraction | Minimum covariate prevalence in population to avoid removal during preprocessing. |
normalizeData | Whether to normalise the data |
saveDirectory | Location to save log and results |
savePlpData | Whether to save the plpData |
savePlpResult | Whether to save the plpResult |
savePlpPlots | Whether to save the plp plots |
saveEvaluation | Whether to save the plp performance csv files |
timeStamp | Include a timestamp in the log |
analysisId | The analysis unique identifier |
cores | The number of cores to use |
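
The layout described for the indexes argument (a rowId column plus an index column holding -1 for test rows and a positive fold number otherwise) can be illustrated with a small hand-built data frame. This is only a sketch in base R: the row IDs, the random test selection, and the round-robin fold assignment are assumptions made for illustration, not the package's own splitting logic.

```r
# Hypothetical example: 12 made-up row IDs, not taken from a real population.
set.seed(42)
rowIds <- 1:12
nfold <- 3
testFraction <- 0.25

# Pick the test rows at random; these receive index -1.
nTest <- round(length(rowIds) * testFraction)
testRows <- sample(rowIds, nTest)

# Assign the remaining rows to folds 1..nfold in round-robin order
# (an assumption for this sketch; a real split may be stratified by outcome).
trainRows <- setdiff(rowIds, testRows)
indexes <- data.frame(
  rowId = c(testRows, trainRows),
  index = c(rep(-1, nTest), rep_len(seq_len(nfold), length(trainRows)))
)
indexes <- indexes[order(indexes$rowId), ]
print(indexes)
```

A data frame shaped like this could then be passed as indexes in place of the default, provided its rowId values match those of the population.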
A learning curve object containing the various performance measures
obtained by the model for each training set fraction. It can be plotted
using plotLearningCurve.
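
To make the idea of "performance for each training set fraction" concrete, the following base R sketch builds a toy learning curve on simulated data. It is not the package's implementation: it uses unpenalised logistic regression via glm() rather than Lasso Logistic Regression, runs sequentially rather than in parallel, treats each fraction as a share of the training pool, and all names (simData, trainPool, auc, curve) are hypothetical.

```r
# Simulated data with two covariates and a binary outcome (illustrative only).
set.seed(1)
n <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 1.0 * x1 - 0.5 * x2))
simData <- data.frame(y, x1, x2)

# Hold out a fixed test set; the rest forms the pool to subsample from.
testIdx <- sample(n, size = round(0.25 * n))
testData <- simData[testIdx, ]
trainPool <- simData[-testIdx, ]

# AUC computed with the rank-based (Mann-Whitney) formula.
auc <- function(score, label) {
  r <- rank(score)
  nPos <- sum(label == 1)
  nNeg <- sum(label == 0)
  (sum(r[label == 1]) - nPos * (nPos + 1) / 2) / (nPos * nNeg)
}

# Fit one model per training fraction and evaluate it on the same test set.
trainFractions <- c(0.25, 0.5, 0.75)
curve <- do.call(rbind, lapply(trainFractions, function(f) {
  idx <- sample(nrow(trainPool), size = round(f * nrow(trainPool)))
  fit <- glm(y ~ x1 + x2, data = trainPool[idx, ], family = binomial())
  pred <- predict(fit, newdata = testData, type = "response")
  data.frame(trainFraction = f,
             trainSize = length(idx),
             testAuc = auc(pred, testData$y))
}))
print(curve)
```

Each row of curve pairs a training set size with a test performance value, which is the kind of information a learning curve plot displays.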
if (FALSE) {
  # define model
  modelSettings <- setLassoLogisticRegression()

  # register parallel backend
  registerParallelBackend()

  # create learning curve
  learningCurve <- createLearningCurvePar(population, plpData, modelSettings)

  # plot learning curve
  plotLearningCurve(learningCurve)
}