Blitz GP V1.0 ------------- Introduction ------------ Genetic Programming is an Evolutionary Method for the automatic creation of computer programs. Evolutionary Methods work by keeping a collection of individuals that represent potential solutions to a particular problem. Progress is made by holding onto the individuals that produce better results and discarding badly-performing individuals. This is most usually done by mating together individuals that perform well. In the Genetic Algorithms paradigm, it is common to define a fitness function whereby the perfomance of each and every individual is accessed. Then, individuals are selected for mating with a probability proportional to their fitness. This imposes a selection force onto the population (the best survive, the weak die), slowly improving the population base and thus, because of population diversity, slowly increasing the chance of producing an optimal individual. In Genetic Programming the individuals are computer programs and they mate with each other through the exchange of program subtrees in their code. (ie they exchange fractions of code). The Blitz-GP Virtual Machine Programming Model ---------------------------------------------- BGP employs a very simple virtual machine that is a mix between the 680x0 and the Z80. It has 3 registers A,B,X and no direct operand addressing modes. [though this maybe added in feature releases]. It supports conditionial jumps and subroutines. Subroutines lie on different memory pages [called trees within the genetic programming context] and there is a small CPU stack for keeping track of subroutine jumps, a small amount of memory referenced by X, and two CPU flags, NZ (negative/Zero flag) and EXC (exception flag). Getting Started With Blitz-GP ----------------------------- For a start, load up bgpmain.bb2 and compile it, but MAKE SURE you have the proper INCDIR declared, otherwise it will fail to find the includes. The program is set up for the problem of symbolic regression, whereby a program has to be created that can calculate the unknown function. (which in this case is f(x)=x^4+x^3-x^2-2*x). This is done by running each individual's program a few times, with register A of the virtual machine being set to x. After the program finishes (either by exiting normally or by creating an exception) the value of register B is expected to hold the desired result. The individuals are evaluated by measuring the difference between the value of B at the end of each trial and the required result... You may wish to play around with different functions, and see how well BGP performs: 1) Open funcdef.bb2 2) Look in Function give_result 3) Change what that function returns 4) Do not use the ^ operator, as it is inaccurate 5) The Virtual Machine only has integers (of course) But you could do FP calculations anyway if you define a proper translation mechanism from FP to integer and back.. 6) Oh, and SAVE it, yes? You could also play around with the parameters defined in gp.bb2 (please look in next section if you need an explanation) #NOF_TREES - don't change unless you add more subroutines #STACK_SIZE - 32 is fine. More is unnecessary - Selection Methods roulette - Parent selection Is Fitness Proportional [disabled always] elitism - Best Individuals replace Worst [disabled always] tour_size - Number of individuals that compete in the tournament (note that selection is currently always tournament selection) - Population Details pop_size - Size of population demes - wether the population is split into seperate demes neighbourhood - size of each deme mutation_rate - Probability of mutation. - Initialization min_init_tree_size - Minimum initial tree size max_init_tree_size - Maximum initial tree size - Other mem_size - memory size of VM max_trials - Number of trials per individual max_time - Maximum time allowed per trial sim_delay - SlowDown when monitoring individual in simulation Some more stuff about Genetic Programming ***************************************** Roulette Selection ------------------ Individuals are selected for mating randomly from the population, with probability proportional to their fitness. Tournament Selection -------------------- Each individual looks at the fitness of a number of other individuals (equaling tour_size) and then mates with the best of them. Elitism ------- In elitism, the scum of the population are replaced by the elite. This can be done in the context of tournament selection by copying the tour's best individual to the tour's worst individual. Demes ----- With demes turned on, individuals can only mate with individuals within a range of neighbourhood/2. This helps increase GENETIC DIVERSITY. If demes are turned off, then mating is PANMICTIC, which means that everybody can mate with anybody (including themselves). Panmictic mating can lead to premature convergence in particularly difficult problems, as all of the population is driven to a single solution, which, like a local minimum, is not the best, but the easiest to achieve. Adding a sense of distance with the use of Demes helps maintain a diversity of solutions. However, making the neighbourhood too small may result in another problem that also leads to premature convergence, but this time through population STAGNATION. This phenomenon is called INBREEDING. Mutation -------- Mutation works by making random changes into an individual's program. Mutation can help a population evolve, but it should not happen too often, as there is a high probability of it not being beneficial. Mutation rates usually range around 0.2% in the literature, the maximum used being 5%, the lowest being 0.01%. I use 2% :).