Genetic Algorithms in Ruby :: ai4r
Introduction to Genetic Algorithms in Ruby

The GeneticAlgorithm module implements the GeneticSearch and Chromosome classes. GeneticSearch is a generic class that can be used to solve any kind of problem. It performs a stochastic search for the solution of a given problem, following this pseudocode:

  1. Choose initial population
  2. Evaluate the fitness of each individual in the population
  3. Repeat for as many generations as we allow
    1. Randomly select best-ranking individuals to reproduce
    2. Breed new generation through crossover and mutation (genetic operations) and give birth to offspring
    3. Evaluate the individual fitnesses of the offspring
    4. Replace worst ranked part of population with offspring
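
The steps above can be sketched in plain Ruby as follows. This is a minimal, self-contained illustration and not the ai4r source: the cost values, the method names (`fitness`, `random_tour`, `reproduce`, `mutate`, `genetic_search`), the simple prefix crossover, the fixed 10% mutation rate, and the "best half reproduces" rule are all assumptions chosen to keep the example short.

```ruby
# Toy cost matrix (made-up values): COSTS[a][b] is the cost of going from a to b.
COSTS = [
  [0, 10, 5, 8],
  [6, 0, 4, 7],
  [25, 4, 0, 3],
  [9, 2, 6, 0]
].freeze

# Fitness is the negated tour cost, so cheaper tours rank higher.
def fitness(tour)
  -tour.each_cons(2).sum { |a, b| COSTS[a][b] }
end

# A random tour that visits every city exactly once.
def random_tour
  (0...COSTS.length).to_a.shuffle
end

# Stand-in crossover: keep a prefix of one parent, then append the
# remaining cities in the order they appear in the other parent.
def reproduce(a, b)
  head = a.first(rand(1...a.length))
  head + (b - head)
end

# Stand-in mutation: occasionally swap two neighbouring cities.
def mutate(tour, rate = 0.1)
  return tour unless rand < rate
  i = rand(tour.length - 1)
  copy = tour.dup
  copy[i], copy[i + 1] = copy[i + 1], copy[i]
  copy
end

def genetic_search(population_size, generations)
  # 1. Choose the initial population
  population = Array.new(population_size) { random_tour }
  # 2. Evaluate fitness (best individuals first)
  population.sort_by! { |tour| -fitness(tour) }
  # 3. Repeat for the allowed number of generations
  generations.times do
    # 3.1 Randomly pair up individuals from the best-ranking half
    parents = population.first(population_size / 2).shuffle
    pairs = parents.each_slice(2).select { |pair| pair.size == 2 }
    # 3.2 Breed offspring through crossover and mutation
    offspring = pairs.map { |a, b| mutate(reproduce(a, b)) }
    # 3.3 and 3.4 Evaluate the offspring and replace the worst-ranked individuals
    survivors = population.first(population_size - offspring.length)
    population = (survivors + offspring).sort_by { |tour| -fitness(tour) }
  end
  population.first
end

best = genetic_search(20, 30)
puts "Best tour: #{best.inspect}, cost: #{-fitness(best)}"
```

Because the survivors always include the current best tour, the best fitness in this sketch never gets worse from one generation to the next.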

The Chromosome is "problem specific". Ai4r's built-in Chromosome class was designed to model the Travelling salesman problem. You have to provide a matrix with the cost of traveling from one point to another (an array of arrays of float values). If you want to solve another type of problem, you will have to modify the Chromosome class by overriding its fitness, reproduce, and mutate functions to model your specific problem.
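
To give an idea of what a chromosome for a different problem might look like, here is a hedged sketch for a toy task (maximise the number of 1 bits in an 8-bit string). The class name `BitChromosome`, the `LENGTH` constant, the method signatures, and the 10% default mutation rate are all hypothetical. It is a standalone class, not a modification of the ai4r Chromosome.

```ruby
# Hypothetical chromosome for a toy problem: maximise the number of 1 bits.
class BitChromosome
  LENGTH = 8
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Higher is better: count the bits that are set.
  def fitness
    @data.sum
  end

  # One-point crossover: head of parent a followed by tail of parent b.
  def self.reproduce(a, b)
    cut = rand(1...LENGTH)
    new(a.data.first(cut) + b.data.drop(cut))
  end

  # Flip one random bit in place, with an assumed default probability of 10%.
  def self.mutate(chromosome, rate = 0.1)
    return unless rand < rate
    i = rand(LENGTH)
    chromosome.data[i] = 1 - chromosome.data[i]
  end

  # Random individual for the initial population.
  def self.seed
    new(Array.new(LENGTH) { rand(2) })
  end
end

puts BitChromosome.new([1, 0, 1, 1, 0, 0, 0, 1]).fitness  # => 4
```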

How to use it
The European Rock Tour Problem (Also known as the Travelling salesman problem)

An ageing rock band was planning its (hopefully) last European tour. They were planning to visit 15 European cities: Barcelona, Berlin, Brussels, Dublin, Hamburg, Kiev, London, Madrid, Milan, Moscow, Munich, Paris, Rome, Vienna, and Warsaw.

European Tour

They started planning the trip when they realized that they could save a lot of money if they ordered the cities to minimize the traveling cost. So they decided to try all possible combinations. They sat in front of the computer, visited their favorite traveling site, and started typing. 53 hours and several liters of coffee later, they realized it was a little bit more complicated than they had expected. They called their drummer (who was on vacation) and explained the problem to him. Fortunately, their drummer had a Master's degree in Computer Science.

Drummer – Boys, if you continue, you will have to try 1,307,674,368,000 combinations. You are facing an NP problem.

Band member #1 – Oh man! So it is going to take us all day!

Band member #2 – And maybe more, 'cause this internet connection sucks...

Drummer – Err... yes, it would take a while. But don't worry, I am sure we can get to a good solution using stochastic search algorithms applied to this problem...

Band – (Silence)

Drummer – ... that is, we are going to move from solution to solution in the space of candidate solutions, using techniques similar to what nature uses for evolution. These are known as genetic algorithms.

Band – (Silence)

Drummer – ... What I mean is, we will pick some of them randomly, leave the ugly ones behind, and mate with the good looking ones...

Band – YEAH! THAT'S THE MAN! LET'S DO IT!

I forgot to mention another restriction of this problem: this band is really bad (what did you expect? Their drummer is a computer geek!), so once they have visited a city, they cannot go back there.

Results of applying Genetic Algorithms to the European Rock Tour Problem (or Travelling salesman problem)

The cost of 3 randomly selected tours:

3 tours obtained with an initial population of 800, and after 100 generations:

Best tour result using Genetic Algorithms in ruby

The GeneticSearch class is a generic class that tries to solve any kind of problem using genetic algorithms. If you want to model another type of problem, you will have to modify the Chromosome class, defining its fitness, mutate, and reproduce functions.

Implementation of Chromosome class for the Travelling salesman problem

Although the GeneticSearch class is generic and can be applied to any kind of problem, the Chromosome class is problem specific.

Data representation

Each chromosome must represent a possible solution for the problem. This class contains an array with the list of visited nodes (cities of the tour). The size of the tour is obtained automatically from the traveling costs matrix. You have to assign the costs matrix BEFORE you run the genetic search. The following costs matrix could be used to solve the problem with only 3 cities:
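
As a sketch of this representation, a 3-city costs matrix can be written as an array of arrays. Only the two entries used in the fitness example on this page (0 to 2 costs 5, 2 to 1 costs 4) come from the text. The remaining values, the `COSTS` constant, and the `valid_tour?` helper are assumptions made for illustration.

```ruby
# COSTS[from][to] is the cost of traveling from city `from` to city `to`.
# Values other than COSTS[0][2] and COSTS[2][1] are made up.
COSTS = [
  [0, 10, 5],
  [6, 0, 4],
  [25, 4, 0]
].freeze

# The number of cities follows from the matrix itself.
tour_size = COSTS.length

# A chromosome's data is a permutation of the city indexes 0...size:
# every city appears exactly once.
def valid_tour?(tour, size)
  tour.sort == (0...size).to_a
end

puts tour_size                           # => 3
puts valid_tour?([0, 2, 1], tour_size)   # => true
puts valid_tour?([0, 0, 1], tour_size)   # => false (city 0 is visited twice)
```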

Fitness function

The fitness function quantifies the optimality of a solution (that is, a chromosome) in a genetic algorithm so that that particular chromosome may be ranked against all the other chromosomes. Optimal chromosomes, or at least chromosomes which are more optimal, are allowed to breed and mix their datasets by any of several techniques, producing a new generation that will (hopefully) be even better.

The fitness function returns the complete cost of the tour represented by the chromosome, multiplied by -1, so that cheaper tours get a higher fitness. For example, the tour [0, 2, 1] has a fitness of -9.

That is: From 0 to 2 costs 5. From 2 to 1 costs 4. Total cost is 9.
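
The same arithmetic can be reproduced with a few lines of Ruby. This is a standalone sketch, not the library method: the name `tour_fitness` is hypothetical, and the matrix entries other than the two costs quoted above are made up.

```ruby
# COSTS[from][to]; only the 0 -> 2 (5) and 2 -> 1 (4) costs come from the text.
COSTS = [
  [0, 10, 5],
  [6, 0, 4],
  [25, 4, 0]
].freeze

# Sum the cost of each consecutive leg of the tour and negate it.
# There is no return leg from the last city back to the first one.
def tour_fitness(tour, costs)
  -tour.each_cons(2).sum { |from, to| costs[from][to] }
end

puts tour_fitness([0, 2, 1], COSTS)  # => -9
```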

Reproduce function

Reproduction is used to vary the programming of a chromosome or chromosomes from one generation to the next. There are several ways to combine two chromosomes: one-point crossover, two-point crossover, "cut and splice", edge recombination, and more. The method usually depends on the problem domain. In this case, we have implemented edge recombination, which is the most widely used reproduction algorithm for the Travelling salesman problem. The edge recombination operator (ERO) creates a path that is similar to a set of existing paths (the parents) by looking at the edges rather than the vertices.

Edge recombination

The previous image was taken from Wikipedia, so hail to the author: Koala man (not me).
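
A minimal sketch of the textbook edge recombination operator is shown below. It is not necessarily how ai4r implements it. The function name `edge_recombination` is hypothetical, the parents are treated as open paths (no edge from the last city back to the first), and the child always starts at the first city of the first parent.

```ruby
require 'set'

def edge_recombination(parent_a, parent_b)
  # Adjacency map: for every city, the cities next to it in either parent.
  neighbours = Hash.new { |hash, city| hash[city] = Set.new }
  [parent_a, parent_b].each do |parent|
    parent.each_cons(2) do |x, y|
      neighbours[x] << y
      neighbours[y] << x
    end
  end

  child = [parent_a.first]
  unvisited = parent_a - child
  until unvisited.empty?
    current = child.last
    # The current city can no longer be chosen as anyone's neighbour.
    neighbours.each_value { |set| set.delete(current) }
    candidates = neighbours[current].to_a
    next_city =
      if candidates.empty?
        # Dead end: no parent edge left, so jump to any unvisited city.
        unvisited.sample
      else
        # Prefer the neighbour with the fewest remaining neighbours (ties at random).
        fewest = candidates.map { |c| neighbours[c].size }.min
        candidates.select { |c| neighbours[c].size == fewest }.sample
      end
    child << next_city
    unvisited.delete(next_city)
  end
  child
end

p edge_recombination([0, 1, 2, 3, 4], [2, 4, 0, 3, 1])
```

The child is always a valid tour, and most of its edges are inherited from one parent or the other. New edges only appear when the walk reaches a dead end.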

Mutation function

The mutation function is called for every member of the population, in each generation. But you do not want to mutate your chromosomes every time, especially if they are very fit. This is how it is currently implemented: with a probability inversely related to the chromosome's fitness, we swap 2 random consecutive nodes.
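
One way to express this idea is sketched below. The exact formula is an assumption: here the mutation probability is `(1 - normalized_fitness) * max_rate`, where `normalized_fitness` is 1 for the fittest individual and 0 for the least fit, and the default `max_rate` of 0.3 is an arbitrary choice. The function name `mutate!` and its parameters are hypothetical.

```ruby
# Swap two neighbouring cities in place, more often for unfit tours.
# normalized_fitness: 1.0 for the best individual, 0.0 for the worst.
def mutate!(tour, normalized_fitness, max_rate: 0.3, rng: Random.new)
  return tour if tour.length < 2
  # The fitter the chromosome, the smaller the chance of changing it.
  return tour unless rng.rand < (1 - normalized_fitness) * max_rate
  i = rng.rand(tour.length - 1)  # i and i + 1 are both valid indexes
  tour[i], tour[i + 1] = tour[i + 1], tour[i]
  tour
end

p mutate!([0, 1, 2, 3], 1.0)                 # the fittest tour is never changed
p mutate!([0, 1, 2, 3], 0.0, max_rate: 1.0)  # always swaps one neighbouring pair
```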

Seed function

Initializes an individual solution (chromosome) for the initial population. The built-in seed function generates a chromosome randomly, but you can use some problem domain knowledge to generate better initial solutions (although this does not always deliver better results, it improves convergence times).

  ...
  while available.length > 0 do
    index = rand(available.length)
    seed << available.delete_at(index)
  end
  return Chromosome.new(seed)
end
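
As an illustration of using domain knowledge when seeding, the sketch below compares a purely random seed with a greedy "nearest neighbour" seed. The greedy heuristic is an assumption added for this example and is not part of ai4r. The function names and the cost values are hypothetical, and both functions return plain arrays rather than Chromosome instances.

```ruby
# Made-up costs for 3 cities: COSTS[from][to].
COSTS = [
  [0, 10, 5],
  [6, 0, 4],
  [25, 4, 0]
].freeze

# Random seed: every city index exactly once, in random order.
def random_seed(costs)
  (0...costs.length).to_a.shuffle
end

# Greedy seed: start somewhere, then always move to the cheapest
# city that has not been visited yet.
def nearest_neighbour_seed(costs, start = rand(costs.length))
  tour = [start]
  remaining = (0...costs.length).to_a - tour
  until remaining.empty?
    next_city = remaining.min_by { |city| costs[tour.last][city] }
    tour << next_city
    remaining.delete(next_city)
  end
  tour
end

p random_seed(COSTS)
p nearest_neighbour_seed(COSTS, 0)  # => [0, 2, 1]
```

Mixing a few greedy seeds into an otherwise random initial population keeps some diversity while giving the search a head start.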

Implementation of GeneticSearch

The GeneticSearch class is a generic class that tries to solve any kind of problem using genetic algorithms. If you want to model another type of problem, you will have to modify the Chromosome class, defining its fitness, mutate, and reproduce functions.

Initialize the search

You have to provide two parameters during instantiation: the initial population size, and how many generations to produce. Larger numbers will usually converge to better results, while smaller numbers will run faster.

Run method

Once you have initialized an instance of the GeneticSearch class, you can perform the search by calling the run method. This method will:

  1. Choose initial population
  2. Evaluate the fitness of each individual in the population
  3. Repeat for as many generations as we allow
    1. Randomly select the best-ranking individuals to reproduce
    2. Breed new generation through crossover and mutation (genetic operations) and give birth to offspring
    3. Evaluate the individual fitnesses of the offspring
    4. Replace worst ranked part of population with offspring

Selection

Selection is the stage of a genetic algorithm in which individual genomes are chosen from a population for later breeding. There are several generic selection algorithms, such as tournament selection and roulette wheel selection. We implemented the latter.

  1. The fitness function is evaluated for each individual, providing fitness values
  2. The population is sorted by descending fitness values.
  3. The fitness values are then normalized (highest fitness gets 1, lowest fitness gets 0). The normalized value is stored in the "normalized_fitness" attribute of the chromosomes.
  4. A random number R is chosen. R is between 0 and the accumulated normalized value (all the normalized fitness values added together).
  5. The selected individual is the first one whose accumulated normalized value (its normalized value plus the normalized values of the chromosomes before it) is greater than R.
  6. We repeat steps 4 and 5 as many times as 2/3 of the population size.
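
The six steps above can be sketched as a standalone function. This is an illustration under stated assumptions rather than the library code: the name `roulette_select`, its parameters, the equal weighting used when all fitness values are identical, and the integer rounding of "2/3 of the population size" are all choices made for this example.

```ruby
# population: array of individuals; fitnesses: their fitness values (same order).
def roulette_select(population, fitnesses, rng: Random.new)
  # Steps 1 and 2: pair individuals with their fitness, best first.
  ranked = population.zip(fitnesses).sort_by { |_, f| -f }
  best = ranked.first[1]
  worst = ranked.last[1]
  span = best - worst
  # Step 3: normalize so the best gets 1 and the worst gets 0.
  # If everybody has the same fitness, give everyone the same weight.
  normalized = ranked.map do |individual, f|
    [individual, span.zero? ? 1.0 : (f - worst).fdiv(span)]
  end
  total = normalized.sum { |_, nf| nf }
  # Step 6: repeat the draw 2/3 of the population size (rounded down).
  Array.new(population.length * 2 / 3) do
    r = rng.rand * total  # Step 4: R lies in [0, total)
    accumulated = 0.0
    # Step 5: first individual whose accumulated value exceeds R.
    chosen = normalized.find { |_, nf| (accumulated += nf) > r }
    # The fallback only guards against floating-point rounding.
    (chosen || normalized.first).first
  end
end

p roulette_select(%w[a b c], [-9, -20, -30])
```

Note that with this normalization the worst individual has weight 0, so it is never selected for breeding.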

Edge recombination

The previous image was taken from Wikipedia, so hail to the author: Simon Hatthon.

Reproduction

The reproduction function combines each pair of selected chromosomes using the method Chromosome.reproduce.

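The reproduction step will also call the Chromosome.mutate method with each member of the population. You should implement Chromosome.mutate so that it only changes (mutates) the chromosome randomly. For example, you could effectively change the chromosome only when a random draw falls below a probability that shrinks as its fitness grows.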

How to run the example

You can run the example with "ruby genetic_algorithm_example.rb". The genetic_algorithm_example.rb file contains:

More about Genetic Algorithms and the Travelling salesman problem

Travelling salesman problem at Wikipedia

Genetic Algorithms at Wikipedia