The GP.Lab Genetic Programming Kernel (continued)

This second post on our custom Genetic Programming implementation covers selection and genetic operators implemented by the kernel.

Selection Strategies

The GP kernel implements a generational algorithm and currently supports the three most popular selection modes for choosing parents to produce offspring for the next generation:
  • Tournament Selection, i.e. for a tournament size T randomly pick T programs from the parent population and return the winner of the tournament - the program with the best fitness.
    The selection method of choice in the GA community as it allows to fine-tune the selection pressure via the tournament size parameter to avoid premature loss of diversity (i.e. fittest programs dominating the population with their copies).

  • Roulette Wheel, a flavor of Fitness Proportionate Selection where the chance of an individual to become a parent is proportional to its fitness, i.e. parents with the highest absolute fitness in the population have a higher chance of producing offspring for the next generation.
     
  • Rank Selection, a Fitness Proportionate Selection scheme where the rank of an individual in the population is proportional to the individual's chance of being selected as a parent. This emphasizes small variations in fitness even in a mostly homogeneous population giving it a clear advantage over Roulette Wheel selection (based on absolute fitness) in this scenario.
This GP implementation uses standardized fitness to represent the fitness of individual programs thus all selection methods aim to minimize the raw fitness value (i.e. smaller is better).

The selection strategy optionally employs elitism: The best individuals of the current generation are copied to the next generation to guarantee good programs are not lost. The elitism parameter is limited to 10% of the population size in order to prevent fittest programs from dominating the gene pool.

Genetic Operators

In Genetic Programming offspring for a new generation are created by applying crossover and mutation to programs in the current generation. This implementation provides a variety of different operators in order to allow the simulated evolution to effectively search the space of possible programs for good solutions.

The hoist and shrink mutation operators have the additional benefit of reducing program size as the average size in the population tends to grow towards the maximum allowed size during virtually any run of a GP system.

Mutation is performed by randomly picking one of the 4 mutation operators currently implemented:
  • Subtree Mutation, a randomly selected subtree in the parent is replaced with a newly created, random subtree. The subtree to be replaced is randomly selected with all nodes having an equal chance of being selected.
  • Point Mutation, a node in the tree is randomly selected and mutated. This mutation variant ensures that the mutated node arity remains unchanged (i.e. functions are replaced with functions of identical parameter count, leaf nodes are replaced with leafs) and all other tree nodes in the created offspring are identical to the parent.
  • Shrink Mutation, a random subtree is replaced with a leaf. In this implementation the subtree is replaced with a randomly chosen constant.
  • Hoist Mutation, replaces the entire program with a randomly selected subtree (the subtree becomes the new root). The subtree is required to have a depth of at least 1 to avoid invalid programs - i.e. constant expressions - from being created.
At this point the system implements a single crossover operator, Single-point Subtree Crossover, creating offspring by swapping subtrees of two parent programs and placing one of the resulting offspring in the next generation.

Points for the crossover operation are randomly selected by favoring nodes close to the root: Direct successor nodes of the root have a 50% chance of being selected, the next level of nodes has a 25% chance of being selected and so on.

The next post will cover evaluation of individual programs and calculating the fitness of specific solutions.

Share and Enjoy!

Comments

Popular posts from this blog

The GP.Lab Genetic Programming Kernel

About GP.Lab