A transformer model
Usage
setTransformer(
numBlocks = 3,
dimToken = 192,
dimOut = 1,
numHeads = 8,
attDropout = 0.2,
ffnDropout = 0.1,
dimHidden = 256,
dimHiddenRatio = NULL,
temporal = FALSE,
temporalSettings = list(
  positionalEncoding = list(name = "SinusoidalPE", dropout = 0.1),
  maxSequenceLength = 256, truncation = "tail", timeTokens = TRUE
),
estimatorSettings = setEstimator(
  weightDecay = 1e-06, batchSize = 1024, epochs = 10, seed = NULL
),
hyperParamSearch = "random",
randomSample = 1,
randomSampleSeed = NULL
)
Arguments
- numBlocks
number of transformer blocks
- dimToken
dimension of each token (embedding size)
- dimOut
dimension of output, usually 1 for binary problems
- numHeads
number of attention heads
- attDropout
dropout to use on the attention weights
- ffnDropout
dropout to use in the feedforward block
- dimHidden
dimension of the feedforward block
- dimHiddenRatio
dimension of the feedforward block as a ratio of dimToken (embedding size)
- temporal
Whether to use a transformer with temporal data
- temporalSettings
settings for the temporal transformer (see the temporal example under Examples below), including:
- `positionalEncoding`: positional encoding to use, either a character or a list with a name and settings; default 'SinusoidalPE' with dropout 0.1
- `maxSequenceLength`: maximum sequence length, either a number or 'max' for the longest sequence in the data; longer sequences are truncated and shorter ones padded to this length
- `truncation`: truncation method; only 'tail' is supported
- `timeTokens`: whether to use time tokens, default TRUE
- estimatorSettings
created with `setEstimator`
- hyperParamSearch
what kind of hyperparameter search to do, default 'random'
- randomSample
how many hyperparameter combinations to sample when `hyperParamSearch` is 'random'
- randomSampleSeed
Random seed to sample hyperparameter combinations
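Examples

A minimal sketch of configuring the model; the argument values below are illustrative, not tuned recommendations, and it is assumed that the package providing setTransformer and setEstimator is loaded:

modelSettings <- setTransformer(
  numBlocks = 3,
  dimToken = 192,
  dimOut = 1,
  numHeads = 8,
  attDropout = 0.2,
  ffnDropout = 0.1,
  dimHidden = 256,  # absolute feedforward dimension; dimHiddenRatio expresses it as a ratio of dimToken instead
  estimatorSettings = setEstimator(
    weightDecay = 1e-06,
    batchSize = 1024,
    epochs = 10,
    seed = 42  # fixed seed for reproducibility; default is NULL
  )
)

For temporal data, the same call is made with temporal = TRUE and the behaviour controlled through temporalSettings:

temporalModelSettings <- setTransformer(
  numBlocks = 3,
  dimToken = 192,
  dimOut = 1,
  numHeads = 8,
  temporal = TRUE,
  temporalSettings = list(
    positionalEncoding = list(name = "SinusoidalPE", dropout = 0.1),
    maxSequenceLength = 256,  # or "max" for the longest sequence in the data
    truncation = "tail",      # only 'tail' truncation is supported
    timeTokens = TRUE
  )
)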