TrEncoderLayer

scalation.modeling.forecasting.neuralforecasting.TrEncoderLayer
class TrEncoderLayer(n_var: Int, n_mod: Int, heads: Int, n_v: Int, n_z: Int, f: AFF, p_drop: Double, norm_eps: Double, norm_first: Boolean) extends Attention

The TrEncoderLayer class consists of Multi-Head Self-Attention and Feed-Forward Neural Network (FFNN) sub-layers.

Value parameters

f

the activation function family (used by alinear1)

heads

the number of attention heads

n_mod

the size of the output (dimensionality of the model, d_model)

n_v

the size of the value vectors

n_var

the size of the input vector x_t (number of variables)

n_z

the size of the hidden layer in the Feed-Forward Neural Network

norm_eps

a small value used in normalization to avoid division by zero

norm_first

whether layer normalization should be done first (see apply method)

p_drop

the probability of setting an element to zero in a dropout layer

Attributes

See also

pytorch.org/docs/stable/generated/torch.nn.TransformerEncoderLayer.html#torch.nn.TransformerEncoderLayer

Supertypes
trait Attention
class Object
trait Matchable
class Any
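
A minimal usage sketch follows (an illustration only: it assumes MatrixD from scalation.mathstat and the activation family f_reLU from scalation.modeling.ActivationFun; the sizes are arbitrary).

import scalation.mathstat.MatrixD
import scalation.modeling.ActivationFun.f_reLU               // assumed location of the AFF f_reLU
import scalation.modeling.forecasting.neuralforecasting.TrEncoderLayer

val x = new MatrixD (8, 4)                                   // 8 time steps, 4 variables per step (fill with data)

val layer = new TrEncoderLayer (n_var = 4, n_mod = 16, heads = 2, n_v = 8, n_z = 32,
                                f = f_reLU, p_drop = 0.1, norm_eps = 1e-5, norm_first = true)

val z = layer (x)                                            // forward pass: MHSA sub-layer, then FFNN sub-layer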

Members list

Value members

Concrete methods

def apply(x: MatrixD): MatrixD

Forward pass: Compute this encoder layer's result z by using Multi-Head Self-Attention followed by a Feed-Forward Neural Network.

Value parameters

x

the input matrix

Attributes
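
The ordering of the two sub-layers presumably matches the referenced PyTorch TransformerEncoderLayer (an assumption; norm1, norm2, mhsa and ffnn below are just names for the layer's internal computations):

norm_first = true  (pre-norm):   u = x + mhsa (norm1 (x));   z = u + ffnn (norm2 (u))
norm_first = false (post-norm):  u = norm1 (x + mhsa (x));   z = norm2 (u + ffnn (u))

where norm1 and norm2 are layer normalizations using norm_eps, and dropout with probability p_drop is applied within the sub-layers.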

Compute the Feed-Forward Neural Network result.

Value parameters

x

the input matrix

Attributes
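
Presumably the standard position-wise feed-forward computation of a transformer (an assumption based on the roles of n_z, f and p_drop; w_1, b_1, w_2, b_2 are hypothetical names for the internal parameters):

ffnn (x) = dropout (f (x * w_1 + b_1)) * w_2 + b_2           // w_1: n_mod-by-n_z, w_2: n_z-by-n_mod

i.e., an expansion to the hidden size n_z with activation f, followed by a projection back to n_mod.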

Compute the Multi-Head Self-Attention result.

Value parameters

x

the input matrix

Attributes
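
Under the assumption that this sub-layer self-attends over x using the layer's internal per-head weight tensors (a sketch, not the confirmed implementation):

mhsa (x) = attentionMH (x, x, x, w_q, w_k, w_v, w_o)         // self-attention: Q, K and V all derived from x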

Inherited methods

Compute a Self-Attention Weight Matrix from the given query (Q), key (K) and value (V).

Value parameters

k

the key matrix K

q

the query matrix Q (q_t over all time)

v

the value matrix V

Attributes

Inherited from:
Attention
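
Presumably this computes the usual scaled dot-product attention, softmax (Q Kᵀ / sqrt (n_k)) V, with n_k the key/query dimension (see the inherited field n_k below). A plain-Scala sketch of that formula, independent of ScalaTion's actual implementation:

object AttentionSketch {
    type Mat = Array [Array [Double]]

    def matmul (a: Mat, b: Mat): Mat =                       // naive matrix multiply
        Array.tabulate (a.length, b(0).length) ((i, j) =>
            (0 until b.length).map (k => a(i)(k) * b(k)(j)).sum)

    def transpose (a: Mat): Mat =
        Array.tabulate (a(0).length, a.length) ((i, j) => a(j)(i))

    def softmaxRows (a: Mat): Mat =                           // row-wise softmax
        a.map { row =>
            val m = row.max
            val e = row.map (x => math.exp (x - m))
            val s = e.sum
            e.map (_ / s) }

    def attention (q: Mat, k: Mat, v: Mat): Mat = {
        val n_k    = k(0).length                              // key/query dimension
        val scores = matmul (q, transpose (k)).map (_.map (_ / math.sqrt (n_k)))
        matmul (softmaxRows (scores), v)                      // attention weights applied to V
    }
}
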
def attentionMH(q: MatrixD, k: MatrixD, v: MatrixD, w_q: TensorD, w_k: TensorD, w_v: TensorD, w_o: MatrixD): MatrixD

Compute a Multi-Head, Self-Attention Weight Matrix by taking attention for each head and concatenating them; finally multiplying by the overall weight matrix w_o. The operator ++^ concatenates matrices column-wise.

Value parameters

k

the key matrix K

q

the query matrix Q (q_t over all time)

v

the value matrix V

w_k

the weight tensor for key K (w_k(i) matrix for i-th head)

w_o

the overall weight matrix to be applied to concatenated attention

w_q

the weight tensor for query Q (w_q(i) matrix for i-th head)

w_v

the weight tensor for value V (w_v(i) matrix for i-th head)

Attributes

Inherited from:
Attention
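
In formula form (following the description above; the per-head projections are implied by the parameter descriptions):

head_i      = attention (q * w_q(i), k * w_k(i), v * w_v(i))     for i = 0 until heads
attentionMH = (head_0 ++^ head_1 ++^ ... ++^ head_{heads-1}) * w_o

Each head applies its own projection weights, the per-head results are concatenated column-wise with ++^, and w_o mixes the concatenation back to the model dimension.
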
def context(q_t: VectorD, k: MatrixD, v: MatrixD): VectorD

Compute a Context Vector from the given query at time t (q_t), key (K) and value (V).

Value parameters

k

the key matrix K

q_t

the query vector at time t (based on input vector x_t)

v

the value matrix V

Attributes

Inherited from:
Attention
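
In formula form (assuming scaled dot-product scoring, consistent with the attention method above):

c_t = a * V,   where  a = softmax (q_t * Kᵀ / sqrt (n_k))

i.e., the context vector is a weighted combination of the rows of V, with weights a given by the softmaxed, scaled similarities between q_t and the rows of K.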

Compute the Query, Key, Value matrices from the given input and weight matrices.

Value parameters

w_k

the weight matrix for key K

w_q

the weight matrix for query Q

w_v

the weight matrix for value V

x

the input matrix

Attributes

Inherited from:
Attention
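
A sketch of the computation (an assumption consistent with the parameter names):

Q = x * w_q,   K = x * w_k,   V = x * w_v

so each of the query, key and value matrices is a linear projection of the input matrix x.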

Concrete fields

Inherited fields

val n_k: Int

Attributes

Inherited from:
Attention
val n_val: Int

Attributes

Inherited from:
Attention
