Generate Alignments — generateRawAlignments • ARTEMIS

Generate processed alignments of treatment regimens using the temporal Needleman–Wunsch or Smith–Waterman algorithm. The input regimens are aligned against the patient drug records in the stringDF data.frame.

Usage

generateRawAlignments(
  stringDF,
  regimens,
  g = 0.4,
  Tfac = 0.5,
  s = NULL,
  verbose = 0,
  mem = -1,
  method = "PropDiff"
)

Arguments

stringDF: A data.frame containing a patient ID column and a seq column. Each seq should be a valid encoded drug record; see the example below.
regimens: A regimen dataframe, containing required regimen shortStrings for testing
g: The gap penalty supplied to the temporal Needleman–Wunsch/Smith–Waterman algorithm
Tfac: The time penalty factor. All time penalties are calculated as a percentage of Tfac
s: A substitution matrix, either user-defined or derived from defaultSmatrix. Auto-generated if left as NULL.
verbose: How verbose the Python script should be when reporting results.
  verbose = 0 : Print nothing
  verbose = 1 : Print seqs and scores
  verbose = 2 : Report seqs, scores, H and traceMat
mem: A number defining how many sequences to hold in memory during local alignment.
  mem = -1 : Script will define memory length according to floor(len(regimen)/len(drugRec))
  mem = 0 : Script will return exactly 1 alignment
  mem = 1 : Script will return 1 alignment and all alignments with the same score
  mem = X : Script will return X alignments and all alignments with a score equal to that of the Xth alignment
method: A character string indicating which loss function to use. Pick one of:
  PropDiff : Proportional difference of Tx and Ty
  AbsDiff : Absolute difference of Tx and Ty
  Quadratic : Absolute difference of Tx and Ty to the power 2
  PropQuadratic : Absolute difference of Tx and Ty to the power 2, divided by the max of Tx and Ty
  LogCosh : The natural logarithm of the cosh of the absolute difference of Tx and Ty
writeOut: A variable indicating whether to save the set of drug records
outputName: The name for a given written output
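
The loss functions listed under method can be compared with a small sketch. This is not the package's implementation: timeLoss is a hypothetical helper, Tx and Ty are taken to be the two time gaps being compared, and PropDiff is assumed to mean the absolute difference divided by the larger of the two gaps (the description above does not spell this out). How Tfac scales the result is not reproduced here.

```r
# Hypothetical helper, not part of ARTEMIS: illustrates the five loss options.
# Tx and Ty are the two time gaps being compared.
timeLoss <- function(Tx, Ty, method = "PropDiff") {
  d <- abs(Tx - Ty)   # absolute difference of the two gaps
  m <- max(Tx, Ty)    # larger gap, used by the proportional variants
  switch(method,
    # Assumption: "proportional" means relative to the larger gap
    PropDiff      = if (m == 0) 0 else d / m,
    AbsDiff       = d,
    Quadratic     = d^2,
    PropQuadratic = if (m == 0) 0 else d^2 / m,
    LogCosh       = log(cosh(d)),
    stop("Unknown method: ", method)
  )
}

# Compare all five losses for a 7-day gap aligned against a 14-day gap
methods <- c("PropDiff", "AbsDiff", "Quadratic", "PropQuadratic", "LogCosh")
sapply(methods, function(m) timeLoss(7, 14, m))
```

Under these assumptions the proportional variants stay small for long gaps, while Quadratic grows fastest and LogCosh behaves roughly like AbsDiff for large differences.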

Value

A data.frame containing regimen alignment results mapped onto patient records.
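
How many alignments are returned for a given regimen and drug record depends on mem. The rule described under Arguments for mem = 0, 1 and X can be sketched as follows. keepAlignments is a hypothetical helper, not an ARTEMIS function, and it assumes that a higher score means a better alignment. The mem = -1 default, which the script resolves from the sequence lengths, is not reproduced.

```r
# Hypothetical helper, not part of ARTEMIS: applies the mem rule to a
# plain vector of alignment scores (assumption: higher score = better).
keepAlignments <- function(scores, mem) {
  stopifnot(length(scores) > 0, mem >= 0)
  ranked <- sort(scores, decreasing = TRUE)
  if (mem == 0) {
    return(ranked[1])               # exactly one alignment, ties dropped
  }
  x <- min(mem, length(ranked))     # cannot keep more than are available
  cutoff <- ranked[x]               # score of the Xth best alignment
  ranked[ranked >= cutoff]          # top X plus anything tied with the Xth
}

scores <- c(0.7, 0.9, 0.4, 0.9, 0.7)
keepAlignments(scores, 0)
keepAlignments(scores, 1)
keepAlignments(scores, 3)
```

With these scores, mem = 0 keeps a single 0.9, mem = 1 keeps both alignments scoring 0.9, and mem = 3 keeps four alignments because the third and fourth are tied at 0.7.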

Examples

stringDF <- data.frame(
  person_id = c("P1", "P2"),
  seq = c("7.cisplatin;0.etoposide;1.etoposide;1.etoposide;",
          "0.paclitaxel;1.carboplatin;")
)
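
To inspect what an encoded seq contains before aligning, the string can be split into one row per entry. This is a sketch under assumptions: parseSeq is a hypothetical helper, not part of ARTEMIS, and it assumes each entry has the form number.drug terminated by a semicolon, as in the example above. The number appears to be a time gap, but its exact meaning and unit are not stated here.

```r
# Hypothetical helper, not part of ARTEMIS: splits an encoded drug record
# of the assumed form "<number>.<drug>;<number>.<drug>;..." into rows.
parseSeq <- function(seq) {
  tokens <- strsplit(seq, ";", fixed = TRUE)[[1]]
  tokens <- tokens[nzchar(tokens)]                  # drop any empty pieces
  data.frame(
    gap  = as.numeric(sub("\\..*$", "", tokens)),   # text before the first dot
    drug = sub("^[^.]*\\.", "", tokens),            # text after the first dot
    stringsAsFactors = FALSE
  )
}

parseSeq("7.cisplatin;0.etoposide;1.etoposide;1.etoposide;")
```

For the first record in the example this gives four rows: cisplatin with a gap of 7, followed by three etoposide entries with gaps of 0, 1 and 1.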