Tuesday, March 14, 2017

Evolutionary Programming cheat sheet (part 1)

Writing up some thoughts for my open-notes test in Evolutionary Programming on Thursday.

Genetic Algorithms

Also known as Binary GA, to indicate that the chromosomes are binary sequences. Real-valued GA is obviously just another variant where the binary sequence is interpreted as an approximation of a real number.

Closest to the natural model of Genetic evolution. Basic steps are:

Initialize population P(t)
Evaluate P(t)
While (not TerminationCodition(P(t)))
   Parents(t) = Select(P(t))
   Offspring(t) = Cross(P(t))
   P(t+1) = Evaluate(Offspring(t))
   t = t + 1
   
So, basically, you loop until you're done, with each generation being selected for child mortality and breeding fitness. Pretty much the way nature works. Not a whole lot more to say on this one.

On a purely theoretical note, Real-coded GA's are somewhat more capable of capturing a phenotype rather than a genotype. That is, they get at the behavior you're really after rather than the coding of the behavior. This distinction is not a big deal in practice, since converting from a binary code to the actual trait is usually straightforward.

Parent selection: several options, the most popular are Proportional Selection (each parent is selected proportional to their fitness score), Linear Rank (population is ranked and selected by rank within population), and Tournament (head-head evaluation of competing parents).

The pros and cons of each of these are fairly obvious and not terribly interesting.

The cross step includes both crossing and mutation. Crossover can be single-point (one cut and the genetic strand is swapped at only the cut point), multiple cut, or uniform (each gene can come from either parent). 

Midpoint crossover: given two parents X, Y, Offspring = (X+Y)/2. This is stupid, by the way.
BLX-α crossover: Δ = α(X-Y), Offspring = random(X+Δ, Y+Δ). Somewhat less stupid. BLX-0 is also known as midpoint crossover.

Generation Gap: The proportion of the population that is replaced each cycle.
Elitism: Fraction of population that is guaranteed to survive to preserve the best solutions found so far.

Shared fitness GA

Not much different than GA, except in the fitness step. Here, the fitness is diluted by the number of close neighbors. That is, if there are 5 neighbors within the neighborhood threshold, then the fitness will be divided by 5. In the early goings of a GA, this can help disperse clustering at local minima, since the fitness of such is diluted. Of course, by the end, one wants to be looking at the real fitness of the solution.

Variable Crossover GA (VGA)

I'll confess I totally misunderstood where this was going when I first read it. It's a fairly powerful technique when your solution space is not easily defined by a fixed chromosome (e,g., machine learning problems where the chromosome may have to encode an arbitrarily large decision tree).

Given two parents X, Y of arbitrary length. Pick a random cut from X (any portion of the chromosome). Find the corresponding intervals of Y that make semantic sense. That is, if the chunk of X can fit into several different intervals of Y and still be interpreted properly, identify all such chunks. Make the swap.

The point is you might be swapping strands of much different lengths. Suppose (as in the case of my term project) that a chromosome identifies an attribute and a range of values for that attribute. There are many such attributes on a chromosome. Now, split the chromosome for X. This may include 1 or more attributes and may contain all or none of the values for that attribute. Now, figure out how this fragment fits into Y and still makes sense. Obviously, you need the attribute to be consistent with the values. You also need to be swapping out coherent criteria, that is, you can replace one attribute criteria with several, but you can't apply the values for one attribute in Y to a completely different attribute in X. So, you're resulting chromosomes could be of any length based on where you made the cut on X and how many ways there were to fit that cut into Y.





No comments:

Post a Comment