Optimizer_SGDM
The Optimizer_SGDM class provides functions to optimize the parameters (weights and biases) of Neural Networks with various numbers of layers. This optimizer implements the Stochastic Gradient Descent with Momentum (SGDM) algorithm.
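The momentum idea can be illustrated with a minimal sketch. It assumes the classical (heavy-ball) update, in which a velocity vector accumulates a decayed sum of past gradients; the class's exact update rule and momentum constant are not given on this page and may differ. All names here (MomentumSketch, step, w, v, g, beta) are hypothetical and are not part of the library.

```scala
object MomentumSketch {

  /** One heavy-ball step, in place: v <- beta * v - eta * g, then w <- w + v.
   *  w = parameters, v = velocity, g = gradient at w (all hypothetical names).
   */
  def step(w: Array[Double], v: Array[Double], g: Array[Double],
           eta: Double, beta: Double): Unit = {
    var j = 0
    while (j < w.length) {
      v(j) = beta * v(j) - eta * g(j)   // decay the old velocity, add the scaled negative gradient
      w(j) += v(j)                      // move the parameter along the velocity
      j += 1
    }
  }

  /** Toy use: minimize f(w) = sum of w_j^2, whose gradient is 2w. */
  def minimizeQuadratic(w0: Array[Double], eta: Double, beta: Double,
                        steps: Int): Array[Double] = {
    val w = w0.clone()
    val v = new Array[Double](w.length)          // velocity starts at zero
    for (_ <- 0 until steps) step(w, v, w.map(2.0 * _), eta, beta)
    w
  }
}
```

With beta = 0 the step reduces to plain gradient descent; larger values of beta carry more of the previous direction into each update.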
Attributes
- Supertypes: trait Optimizer, trait StoppingRule, trait MonitorLoss, class Object, trait Matchable, class Any
Members list
Value members
Concrete methods
Given training data x and y, fit the parameter/weight matrices bw and bias vectors bi. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the weights.
Value parameters
- b: the array of parameters (weights & biases) between every two adjacent layers
- eta: the initial learning/convergence rate
- f: the array of activation function families for every two adjacent layers
- x: the m-by-n input matrix (training data consisting of m input vectors)
- y: the m-by-ny output matrix (training data consisting of m output vectors)
Given training data x and y for a 2-layer, multi-output Neural Network, fit the parameter/weight matrix b. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the parameter weights.
Value parameters
- bb: the array of parameters (weights & biases) between every two adjacent layers
- eta: the initial learning/convergence rate
- ff: the array of activation function families for every two adjacent layers
- x: the m-by-n input matrix (training data consisting of m input vectors)
- y: the m-by-ny output matrix (training data consisting of m output vectors)
Given training data x and y for a 3-layer Neural Network, fit the parameters (weights and biases) a & b. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the weights.
Value parameters
- bb: the array of parameters (weights & biases) between every two adjacent layers
- eta: the initial learning/convergence rate
- ff: the array of activation function families for every two adjacent layers
- x: the m-by-n input matrix (training data consisting of m input vectors)
- y: the m-by-ny output matrix (training data consisting of m output vectors)
Inherited methods
Given training data x and y for a Neural Network, fit the parameters b, returning the value of the loss function and the number of epochs. Find the best learning rate within the interval etaI.
Value parameters
- b: the array of parameters (weights & biases) between every two adjacent layers
- etaI: the lower and upper bounds of the learning/convergence rate
- f: the array of activation function families for every two adjacent layers
- opti: the array of activation function families for every two adjacent layers
- x: the m-by-n input matrix (training data consisting of m input vectors)
- y: the m-by-ny output matrix (training data consisting of m output vectors)
Attributes
- Inherited from: Optimizer
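One simple way to look for a good learning rate inside an interval is to train at several evenly spaced rates and keep the one with the lowest loss. The sketch below assumes exactly that strategy; this page does not say how the inherited method searches etaI, so this is an illustration rather than the library's procedure. The names EtaSearchSketch, bestEta, steps, and train are hypothetical.

```scala
object EtaSearchSketch {

  /** Try steps + 1 evenly spaced learning rates in [lo, hi] and return the pair
   *  (best rate, its loss). The function train is a hypothetical stand-in that
   *  runs one full optimization at the given rate and returns the final loss.
   */
  def bestEta(etaI: (Double, Double), steps: Int)(train: Double => Double): (Double, Double) = {
    require(steps >= 1, "need at least one subdivision of the interval")
    val (lo, hi) = etaI
    val candidates = (0 to steps).map(i => lo + i * (hi - lo) / steps)
    candidates.map(eta => (eta, train(eta))).minBy(_._2)   // keep the lowest loss
  }
}
```

A grid is the easiest option to reason about; each candidate costs a full training run, so the number of subdivisions directly trades search quality for time.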
Collect the next value for the loss function.
Value parameters
- loss: the value of the loss function
Attributes
- Inherited from: MonitorLoss
Freeze layer flayer during back-propagation (should only impact the optimize method in the classes extending this trait). FIX: make abstract (remove ???) and implement in extending classes.
Value parameters
- flayer: the layer to freeze, e.g., 1 => first hidden layer
Attributes
- Inherited from: Optimizer
Return a permutation vector generator that provides a random permutation of index positions on each call to permGen.igen (e.g., used to select random batches).
Value parameters
- m: the number of data instances
- rando: whether to use a random or fixed random number stream
Attributes
- Inherited from: Optimizer
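To show how a random permutation of index positions can drive batch selection, the sketch below shuffles the indices 0 until m and cuts them into nB batches. It uses scala.util.Random from the standard library rather than the library's own generator, and a seeded Random stands in for the "fixed random number stream" case. The names BatchSketch and batches are hypothetical.

```scala
import scala.util.Random

object BatchSketch {

  /** Shuffle the indices 0 until m and split them into exactly nB batches
   *  whose sizes differ by at most one.
   */
  def batches(m: Int, nB: Int, rng: Random): Seq[Vector[Int]] = {
    require(nB >= 1 && nB <= m, "need 1 <= nB <= m")
    val perm = rng.shuffle((0 until m).toVector)                      // random permutation of row indices
    (0 until nB).map(k => perm.slice(k * m / nB, (k + 1) * m / nB))   // k-th contiguous slice
  }
}
```

Drawing a fresh permutation at the start of each epoch means every row is visited once per epoch, in a different order and grouping each time.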
Plot the loss function versus the epoch/major iterations.
Value parameters
- optName: the name of optimization algorithm (alt. name of network)
Attributes
- Inherited from: MonitorLoss
Stop when the cost measure (e.g., sse) has increased for too many steps. Signal a stopping condition by returning the best parameter vector; otherwise return null.
Value parameters
- b: the current parameter value (weights and biases)
- sse: the current value of cost measure (e.g., sum of squared errors)
Attributes
- Inherited from: StoppingRule
Stop when the cost measure (e.g., sse) has increased for too many steps. Signal a stopping condition by returning the best parameter vector; otherwise return null.
Value parameters
- b: the current value of the parameter vector
- sse: the current value of cost measure (e.g., sum of squared errors)
Attributes
- Inherited from: StoppingRule
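A stopping rule of this kind can be sketched as a small stateful class. The version below assumes one particular reading of "too many steps": it counts consecutive calls in which the cost failed to improve on the best value seen so far, and signals a stop once that count exceeds a limit. The inherited trait may count increases differently. The names StopSketch and patience are hypothetical; only the null-or-best return convention is taken from the description above.

```scala
final class StopSketch(patience: Int) {

  private var bestLoss = Double.MaxValue      // lowest cost seen so far
  private var best: Array[Double] = null      // parameters that achieved it
  private var up = 0                          // consecutive steps without improvement

  /** Return the best parameters once the cost has failed to improve more than
   *  patience times in a row; otherwise return null to keep training.
   */
  def stopWhen(b: Array[Double], sse: Double): Array[Double] = {
    if (sse < bestLoss) { bestLoss = sse; best = b.clone(); up = 0 }
    else up += 1
    if (up > patience) best else null
  }
}
```

A caller would invoke stopWhen once per epoch and leave the training loop on the first non-null result, using the returned vector as the final parameters.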
Inherited fields
Attributes
- Inherited from: StoppingRule