scalation.modeling.forecasting.neuralforecasting.TrEncoderLayer
class TrEncoderLayer(n_var: Int, n_mod: Int, heads: Int, n_v: Int, n_z: Int, f: AFF, p_drop: Double, norm_eps: Double, norm_first: Boolean) extends Attention
The TrEncoderLayer class consists of Multi-Head Self-Attention and Feed-Forward Neural Network (FFNN) sub-layers.
Value parameters
f
the activation function family (used by alinear1)
heads
the number of attention heads
n_mod
the size of the output (dimensionality of the model, d_model)
n_v
the size of the value vectors
n_var
the size of the input vector x_t (number of variables)
n_z
the size of the hidden layer in the Feed-Forward Neural Network
norm_eps
a small value used in normalization to avoid division by zero
norm_first
whether layer normalization should be done first (see apply method)
p_drop
the probability of setting an element to zero in a dropout layer
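A minimal construction sketch follows. Only the parameter list is taken from the signature above; the import of an f_reLU activation family and the argument values are assumptions for illustration.

// Hypothetical usage -- the activation family name f_reLU and its import path
// are assumptions; only the parameter list comes from the signature above.
import scalation.modeling.ActivationFun.f_reLU
import scalation.modeling.forecasting.neuralforecasting.TrEncoderLayer

val layer = new TrEncoderLayer (
    n_var      = 7,                     // number of variables in the input vector x_t
    n_mod      = 64,                    // model dimensionality d_model
    heads      = 8,                     // number of attention heads
    n_v        = 8,                     // size of the value vectors
    n_z        = 256,                   // hidden-layer size of the FFNN
    f          = f_reLU,                // activation function family (used by alinear1)
    p_drop     = 0.1,                   // dropout probability
    norm_eps   = 1e-5,                  // epsilon used in layer normalization
    norm_first = false)                 // post-norm ordering (see the apply method)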
Attributes
See also
pytorch.org/docs/stable/generated/torch.nn.TransformerEncoderLayer.html#torch.nn.TransformerEncoderLayer
Supertypes
class Object
trait Matchable
class Any
Members list
Forward pass: Compute this encoder layer's result z by using Multi-Head Self-Attention followed by a Feed-Forward Neural Network.
Value parameters
x
the input matrix
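As a rough outline of the two orderings selected by norm_first (mirroring torch.nn.TransformerEncoderLayer), the plain-Scala sketch below uses identity placeholders for the sub-layers; selfAttn, ffnn and layerNorm are illustrative stand-ins, not ScalaTion methods.

// Outline of the norm_first switch.  The sub-layer functions are identity
// placeholders standing in for Multi-Head Self-Attention, the FFNN and
// layer normalization; only the ordering of operations is the point here.
type Mat = Array [Array [Double]]

def selfAttn  (x: Mat): Mat = x                      // placeholder sub-layer
def ffnn      (x: Mat): Mat = x                      // placeholder sub-layer
def layerNorm (x: Mat): Mat = x                      // placeholder normalization

def add (a: Mat, b: Mat): Mat =                      // element-wise residual addition
    Array.tabulate (a.length, a(0).length) ((i, j) => a(i)(j) + b(i)(j))

def forward (x: Mat, normFirst: Boolean): Mat =
    if normFirst then                                // pre-norm: normalize, sub-layer, residual
        val z = add (x, selfAttn (layerNorm (x)))
        add (z, ffnn (layerNorm (z)))
    else                                             // post-norm: sub-layer, residual, normalize
        val z = layerNorm (add (x, selfAttn (x)))
        layerNorm (add (z, ffnn (z)))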
Compute the Feed-Forward Neural Network result.
Value parameters
x
the input matrix
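In a standard transformer encoder the position-wise FFNN applies two linear maps with an activation in between. The plain-Scala sketch below hardcodes ReLU in place of the activation family f and omits dropout; it does not use ScalaTion's matrix classes.

// Position-wise FFNN sketch: z = relu(x w1 + b1) w2 + b2
// (dropout omitted, ReLU standing in for the activation family f).
type Mat = Array [Array [Double]]

def matmul (a: Mat, b: Mat): Mat =
    Array.tabulate (a.length, b(0).length) ((i, j) =>
        (0 until b.length).map (p => a(i)(p) * b(p)(j)).sum)

def ffnn (x: Mat, w1: Mat, b1: Array [Double], w2: Mat, b2: Array [Double]): Mat =
    val h = matmul (x, w1)
    val a = Array.tabulate (h.length, b1.length) ((i, j) => math.max (0.0, h(i)(j) + b1(j)))
    val o = matmul (a, w2)
    Array.tabulate (o.length, b2.length) ((i, j) => o(i)(j) + b2(j))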
Compute the Multi-Head Self-Attention result.
Value parameters
x
the input matrix
Compute a Self-Attention Weight Matrix from the given query (Q), key (K) and value (V).
Value parameters
k
the key matrix K
q
the query matrix Q (q_t over all time)
v
the value matrix V
Attributes
Inherited from: Attention
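The standard scaled dot-product form is attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V. A self-contained plain-Scala sketch of that formula (plain arrays rather than ScalaTion's MatrixD):

// Scaled dot-product attention sketch: softmax(Q K^T / sqrt(d_k)) V.
type Mat = Array [Array [Double]]

def matmul (a: Mat, b: Mat): Mat =
    Array.tabulate (a.length, b(0).length) ((i, j) =>
        (0 until b.length).map (p => a(i)(p) * b(p)(j)).sum)

def softmaxRows (a: Mat): Mat =
    a.map { row =>
        val m  = row.max                             // subtract the max for numerical stability
        val ex = row.map (v => math.exp (v - m))
        val s  = ex.sum
        ex.map (_ / s) }

def attention (q: Mat, k: Mat, v: Mat): Mat =
    val d_k    = k(0).length.toDouble
    val scores = matmul (q, k.transpose).map (_.map (_ / math.sqrt (d_k)))
    matmul (softmaxRows (scores), v)                 // attention weights times V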
Compute a Multi-Head, Self-Attention Weight Matrix by taking attention for each head and concatenating them; finally multiplying by the overall weight matrix w_o. The operator ++^ concatenates matrices column-wise.
Value parameters
k
the key matrix K
q
the query matrix Q (q_t over all time)
v
the value matrix V
w_o
the overall weight matrix to be applied to concatenated attention
w_q
the weight tensor for query Q (w_q(i) matrix for i-th head)
w_v
the weight tensor for value V (w_v(i) matrix for i-th head)
Attributes
Inherited from: Attention
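A rough sketch of the multi-head combination described above: each head projects Q and V with its own weight matrix, the per-head results are concatenated column-wise (the ++^ operator), and the concatenation is multiplied by w_o. Leaving K unprojected here is an assumption, made only because no key weight tensor appears in the parameter list; attention and matmul are passed in as functions (see the scaled dot-product sketch above).

// Multi-head sketch: per-head attention, column-wise concatenation, then w_o.
type Mat = Array [Array [Double]]

def concatCols (a: Mat, b: Mat): Mat =               // column-wise concatenation (++^)
    a.zip (b).map ((ra, rb) => ra ++ rb)

def attentionMH (q: Mat, k: Mat, v: Mat,
                 w_q: Array [Mat], w_v: Array [Mat], w_o: Mat,
                 attention: (Mat, Mat, Mat) => Mat,
                 matmul:    (Mat, Mat) => Mat): Mat =
    val heads = w_q.indices.map (i =>                // head i projects Q and V with w_q(i), w_v(i)
        attention (matmul (q, w_q(i)), k, matmul (v, w_v(i))))
    matmul (heads.reduce (concatCols), w_o)          // concatenate heads, apply w_o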
Compute a Context Vector from the given query at time t (q_t), key (K) and value (V).
Value parameters
k
the key matrix K
q_t
the query vector at time t (based on input vector x_t)
v
the value matrix V
Attributes
Inherited from: Attention
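For a single time step this amounts to a softmax-weighted sum of the value rows, with weights proportional to exp(q_t · k_j / √d_k). A small plain-Scala sketch:

// Context vector for one query q_t: softmax over q_t . k_j / sqrt(d_k),
// then a weighted sum of the value rows v_j.
def contextVector (q_t: Array [Double], k: Array [Array [Double]],
                   v: Array [Array [Double]]): Array [Double] =
    val d_k    = q_t.length.toDouble
    val scores = k.map (k_j => q_t.indices.map (i => q_t(i) * k_j(i)).sum / math.sqrt (d_k))
    val m      = scores.max                          // stabilized softmax
    val ex     = scores.map (s => math.exp (s - m))
    val sum    = ex.sum
    val w      = ex.map (_ / sum)
    v(0).indices.toArray.map (j => w.indices.map (i => w(i) * v(i)(j)).sum)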
Compute the Query, Key, Value matrices from the given input and weight matrices.
Value parameters
w_q
the weight matrix for query Q
w_v
the weight matrix for value V
x
the input matrix
Attributes
Inherited from: Attention
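In the standard formulation each of Q, K and V is a linear projection of the input: Q = X W_q, K = X W_k, V = X W_v. Note that the signature above lists only w_q and w_v, so the separate key weight w_k in this sketch is a hypothetical stand-in, not ScalaTion's actual design.

// Standard Q/K/V projections as a sketch; w_k is hypothetical (the trait's
// signature above lists only w_q and w_v).
type Mat = Array [Array [Double]]

def matmul (a: Mat, b: Mat): Mat =
    Array.tabulate (a.length, b(0).length) ((i, j) =>
        (0 until b.length).map (p => a(i)(p) * b(p)(j)).sum)

def projectQKV (x: Mat, w_q: Mat, w_k: Mat, w_v: Mat): (Mat, Mat, Mat) =
    (matmul (x, w_q), matmul (x, w_k), matmul (x, w_v))    // Q, K, V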