Glossary of Terms



Allele, in biology, is the term given to the appropriate range of values for genes. In genetic algorithms, an allele is the value of the gene (or genes).

Allele Loss

Allele loss is the natural loss of traits in the gene pool over the generations of a run. Another term for allele loss is convergence. Severe allele loss results in a population incapable of solving the problem with the available gene pool.


Atoms are the mathematical or symbolic operators in the parse tree. Atoms are always internal (non-leaf) nodes as all atoms accept arguments.

Baldwin Effect

If the ability to learn increases the fitness, survival, of an individual, then its offspring will have a high probability of having that ability to learn.

Boltzmann selection

In Boltzmann selection, a method inspired by the technique of simulated annealing, selection pressure is slowly increased over evolutionary time to gradually focus the search. Given a fitness of f, Boltzmann selection assigns a new fitness, f0, according to a differentiable function.


Closure is based on the design of genetic operators to ensure the validity of functions generated by the genetic program. If closure is achieved, the functions will not cause errors, regardless of their arrangement, when their fitness is tested.


Convergence is a reduction in the diversity of genes available in the population due to fitness proportionate operators.


The solution to a GA is multiple sets of chromosomes (multiple individuals).

Disruptive selection

Individuals at both extremes of a range of phenotypes are favored over those in the middle. The evolutionary significance of disruptive selection lies in the possibility that the gene pool may become split into two distinct gene pools.  This may be a way in which new species are formed.


Diversity is the term used to describe the relative uniqueness of each individual in the population. This condition is considered favourable as the greater the variety of genes available to the genetic algorithm the greater the likelihood of the system identifying alternate solutions.

Expected value model

In this model you run experiments that demonstrate the mathematical idea "expected value" (sometimes called "expectation value"). There is a set of different possible outcomes, and each of these outcomes has a different value. You set up the probabilities of each of these outcomes, the model predicts the expected value based on your settings, and then you take samples from the population you created to see if on average you get the value that the model predicted.

Fitness scaling

Fitness scaling converts the raw fitness scores that are returned by the fitness function to values in a range that is suitable for the selection function.  The selection function uses the scaled fitness values to select the parents of the next generation.  The selection function assigns a higher probability of selection to individuals with higher scaled values.

Function Set

The set of functions (atoms) available to the genetic program. The operators in the function set are used to make up the internal nodes of the parse tree.


A generation is an iteration of the genetic algorithm. Conventionally, the initial random generation is known as generation zero.

Generational Equivalent

Generational equivalent is a term used in steady state techniques to identify a generation has occurred. Because the populations of two generation are combined during the selection and breeding cycle, a generational equivalent is said to occur when the number of genetic operations is equal to the population size of the genetic program.

Generational GP

Generational genetic programming is the process of producing distinct generations in each iteration of the genetic algorithm.


The genotype is the structure of the solution produced by the genetic program.


The solution to GA is a single set of chromosomes (one individual).


Inversion is a genetic operator sometimes used in genetic algorithms where two points in the string are chosen and the order of the genes between those two points are reversed (inverted).

Lamarckian Learning

An adaptive GA method to allow the individual’s genotype be altered so that processes learned can be passed to future generations genetically.

Leaf Node

A leaf node is a node in a parse tree with no children; a terminal.


If a GA is used to set parameters or discover optimal settings for a second GA, the first one is known as a meta-GA.


A micro-GA is a GA with a small population size (often 5) that has special reinitialization or mutation operators to increase diversity and prevent the natural convergence associated with small population sizes.

Non-leaf Node 

A non-leaf node is a node in a parse tree with one or more children; a function.

Non-Linear problems

Non-linear problems are those problems where the relationship between the inputs and the outputs of the problem are not clearly discernible.


Non-Deterministic Polynomial Complete is a term used in complexity theory to identify a particular class of problem. In an NP-complete problem, the relationship between the number of input parameters to the problem and the problem complexity is exponential. If an enumerative search strategy is adopted, this exponential increase in problem complexity results in an exponential increase in the time to solve the problem.


Off-line learning is a learning method where the systems learning is conducted prior to its practical application. When the system is actually used, no learning takes place.


On-line learning is a learning method where the system learns while it is being used. This learning method is more robust than off-line learning as the system can adapt to changes that may occur in the systems environment.

Parse Tree

A parse tree is the way the genetic programming paradigm represents the functions generated by a genetic program. A parse tree is similar to a binary decision tree and preorder traversal of the tree produces an S-expression that represents a potential solution in reverse polish notation.


Parsimony is the simplicity of the structure of a function.

Rank based selection

The individuals' selection probabilities are assigned according to the individuals' rank which is based on the fitness function values.  Each individual receives a rank from 1 to the population size.


Recombination is another name for crossover.


Reproduction is the copying of a chromosome into the next generation, i.e. budding.


 S-expressions are more frequently called LISP expressions and represent a function and its arguments. S-expressions are the most common representation of functions generated in genetic programming.

Soft brood selection

Soft brood selection is a technique where the parents produce a large number of offspring who compete amongst each other for becoming part of the next generation.  The offspring are evaluated against a ‘cheaper’ fitness function, often referred to as the culling function. The best of these offspring are moved to the next generation.

Stochastic universal sampling

The individuals are mapped to contiguous segments of a line, such that each individual's segment is equal in size to its fitness exactly as in roulette-wheel selection.  Here equally spaced pointers are placed over the line as many as there are individuals to be selected. Consider NPointer the number of individuals to be selected, then the distance between the pointers are 1/NPointer and the position of the first pointer is given by a randomly generated number in the range [0, 1/NPointer].

Structural Complexity 

Structural complexity is the term used to describe the number of nodes in the parse tree, i.e. Number of Terminals + Number of Atoms = Structural Complexity.


Sufficiency refers to a necessary variety of atoms and terminals available to the genetic program to solve the problem.


Terminals are the numeric values, (variables, constants and zero argument functions) in the parse tree and are always external (leaf) nodes in the tree. The terminals act as arguments for the operator (atom) that is their parent in the tree.

Terminal Set

The terminal set is comprised of the terminals available to the genetic program.


Tuning is the process of selecting the appropriate genetic operators and their respective parameters to suit a problem.


A wrapper is an interpretative function that evaluates the expression to be tested and returns a value within a suitable range for the simulation. tons of cool links hitch hikers guide to GAs cool demos