Gene pool recombination in genetic algorithms
Elitist Genetic Algorithm and Its Convergence Analysis

(7)

Vol. 15, No. 1. He Lin et al., "Elitist Genetic Algorithm and Its Convergence Analysis", p. 65.

Remark 4: 1) Theorem 2 shows that a population with lower fitness may transition to a population with higher or equal fitness, but not the reverse. 2) This conclusion is independent of the crossover operation, but the mutation probability must satisfy Pm > 0, which guarantees that the EGA can discover the global optimum. 3) The crossover and mutation operators and their probabilities do not change with the generation number, i.e., they are time-independent, so the entire state-transition process of the EGA can be described by a homogeneous Markov chain.
… the basic conditions that an EGA must satisfy.

Remark 2: The role (or purpose) of elitism is limited to preserving the best solution discovered so far by the evolving population; any GA with this capability can be called an EGA. There are, however, different ways of implementing elitism during a GA run.

Remark 3: A common statement of the EGA property is: the optimal solution …
In fact, Theorem 2 can be written more precisely as

$$P_{ij,kl} \begin{cases} > 0, & k \le i \\ = 0, & k > i \end{cases} \tag{8}$$
For the EGA state-transition probability matrix P, with the states arranged in the order Q11, Q12, …, Q1e1, Q21, Q22, …, Q2e2, …, Qs1, Qs2, …, Qses, P can be written in the corresponding block form, each row (Q11, Q12, …) containing nonzero entries (×) only at the positions permitted by Eq. (8).
This paper analyzes in detail the operating mechanism of the elitist GA and identifies the essential reason for its convergence. On this basis it gives a more general, normalized definition of the elitist GA together with two ways of implementing it, and finally proposes a modified elitist GA.
Foreign-Language Literature: Genetic Algorithms

Appendix I: English Translation

Part 1: English Original. Source: Nature-Inspired Metaheuristic Algorithms, Chapters 2 and 3, Luniver Press, UK, 2008.

Chapter 2 Genetic Algorithms

2.1 Introduction

The genetic algorithm (GA), developed by John Holland and his collaborators in the 1960s and 1970s, is a model or abstraction of biological evolution based on Charles Darwin's theory of natural selection. Holland was the first to use crossover and recombination, mutation, and selection in the study of adaptive and artificial systems. These genetic operators form the essential part of the genetic algorithm as a problem-solving strategy. Since then, many variants of genetic algorithms have been developed and applied to a wide range of optimization problems, from graph colouring to pattern recognition, from discrete systems (such as the travelling salesman problem) to continuous systems (e.g., the efficient design of airfoils in aerospace engineering), and from financial markets to multiobjective engineering optimization.

There are many advantages of genetic algorithms over traditional optimization algorithms; the two most noticeable are their ability to deal with complex problems and their parallelism. Genetic algorithms can handle various types of optimization, whether the objective (fitness) function is stationary or non-stationary (changing with time), linear or nonlinear, continuous or discontinuous, or subject to random noise. Because the multiple offspring in a population act like independent agents, the population (or any subgroup) can explore the search space in many directions simultaneously. This feature makes it ideal to parallelize the algorithms for implementation. Different parameters and even different groups of strings can be manipulated at the same time.

However, genetic algorithms also have some disadvantages. The formulation of the fitness function, the choice of population size, the choice of important parameters such as the rates of mutation and crossover, and the selection criterion for the new population must all be carried out carefully.
Any inappropriate choice will make it difficult for the algorithm to converge, or it will simply produce meaningless results.

2.2 Genetic Algorithms

2.2.1 Basic Procedure

The essence of genetic algorithms involves encoding an optimization function as arrays of bits or character strings to represent the chromosomes, manipulating these strings with genetic operators, and selecting according to fitness, with the aim of finding a solution to the problem concerned. This is often done by the following procedure: 1) encode the objectives or optimization functions; 2) define a fitness function or selection criterion; 3) create a population of individuals; 4) run the evolution cycle: evaluate the fitness of all the individuals in the population, create a new population by performing crossover, mutation, and fitness-proportionate reproduction, replace the old population, and iterate again with the new population; 5) decode the results to obtain the solution to the problem. These steps can schematically be represented as the pseudo code of genetic algorithms shown in Fig. 2.1.

One iteration of creating a new population is called a generation. Fixed-length character strings are used in most genetic algorithms during each generation, although there is substantial research on variable-length strings and coding structures. The coding of the objective function is usually in the form of binary arrays or real-valued arrays in adaptive genetic algorithms. For simplicity, we use binary strings for encoding and decoding. The genetic operators include crossover, mutation, and selection from the population. The crossover of two parent strings is the main operator, applied with a higher probability, and is carried out by swapping one segment of one chromosome with the corresponding segment of another chromosome at a random position (see Fig. 2.2). Crossover carried out in this way is a single-point crossover.
Crossover at multiple points is also used in many genetic algorithms to increase their efficiency. The mutation operation is achieved by flipping randomly selected bits (see Fig. 2.3), and the mutation probability is usually small. The selection of an individual in a population is carried out by evaluating its fitness: an individual can remain in the new generation if a certain fitness threshold is reached, or the reproduction of the population can be made fitness-proportionate. That is to say, individuals with higher fitness are more likely to reproduce.

2.2.2 Choice of Parameters

An important issue is the formulation or choice of an appropriate fitness function that determines the selection criterion for a particular problem. For the minimization of a function using genetic algorithms, one simple way of constructing a fitness function is to use the form F = A − y, with A a large constant (though A = 0 will do) and y = f(x); the objective is then to maximize the fitness function and thereby minimize the objective function f(x). However, there are many different ways of defining a fitness function. For example, we can use the fitness of an individual relative to the whole population,

$$\mathrm{rel}_i = \frac{f_i}{\sum_{i=1}^{N} f_i},$$

where f_i is the (phenotypic) fitness value of individual i and N is the population size. An appropriate form of the fitness function will ensure that solutions with higher fitness are selected efficiently; a poor fitness function may result in incorrect or meaningless solutions.

Another important issue is the choice of various parameters. The crossover probability is usually very high, typically in the range 0.7 to 1.0. On the other hand, the mutation probability is usually small (typically 0.001 to 0.05). If the crossover probability is too small, crossover occurs sparsely, which is not efficient for evolution.
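The single-point crossover and bit-flip mutation just described can be sketched in Python (an illustrative sketch of the generic operators, not the book's own code; the function names are ours):

```python
import random

def single_point_crossover(a, b):
    """Swap the tails of two equal-length bit strings at a random point (as in Fig. 2.2)."""
    point = random.randint(1, len(a) - 1)   # crossover point in 1..n-1
    return a[:point] + b[point:], b[:point] + a[point:]

def bit_flip_mutation(bits, nsite=1):
    """Flip nsite randomly chosen bits: 0 becomes 1 and 1 becomes 0 (as in Fig. 2.3)."""
    out = list(bits)
    for _ in range(nsite):
        j = random.randrange(len(out))
        out[j] = 1 - out[j]
    return out

# Offspring of an all-ones and an all-zeros parent are complementary bit strings:
c, d = single_point_crossover([1, 1, 1, 1], [0, 0, 0, 0])
```

Note that every offspring allele still comes from one of the two parents; this is the two-parent scheme that multi-point crossover merely refines.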
If the mutation probability is too high, the solutions may keep 'jumping around' even as they approach the optimal solution. The selection criterion is also important: how should the current population be selected so that the best individuals with higher fitness are preserved and passed on to the next generation? This is often carried out in association with some form of elitism, the basic form of which is to select the fittest individual in each generation and carry it over to the new generation without modification by the genetic operators. This ensures that the best solution is reached more quickly.

Other issues include the number of mutation sites and the population size. Mutation at a single site is not very efficient; mutation at multiple sites increases the efficiency of evolution. However, too many mutants will make it difficult for the system to converge, or may even lead the system astray to wrong solutions. In nature, if the mutation rate is too high under high selection pressure, the whole population may go extinct.

In addition, the choice of the right population size is very important. If the population size is too small, there is not enough evolution going on, and there is a risk of the whole population going extinct; ecological theory suggests that in the real world a species with a small population is in real danger of extinction. Even if the system carries on, there is still a danger of premature convergence. In a small population, if a significantly fitter individual appears too early, it may produce enough offspring to overwhelm the whole (small) population. This will eventually drive the system to a local optimum (not the global optimum).
On the other hand, if the population is too large, more evaluations of the objective function are needed, which requires extensive computing time. Furthermore, more complex and adaptive genetic algorithms are under active research, and the literature on these topics is vast.

2.3 Implementation

Using the basic procedure described in the above section, we can implement the genetic algorithm in any programming language. For simplicity of demonstrating how it works, we have implemented a function optimization using a simple GA in both Matlab and Octave. Consider the generalized De Jong test function

$$f(x) = \sum_{i=1}^{n} x_i^{2m},$$

where m is a positive integer and r > 0 is the half-length of the domain. This function has a minimum of f = 0 at x = (0, …, 0). For r = 100 and n = 5, with a population of 40 16-bit strings, the variations of the objective function during a typical run are shown in Fig. 2.4. Any two runs will give slightly different results due to the stochastic nature of genetic algorithms, but better estimates are obtained as the number of generations increases.

The well-known Easom function

$$f(x) = -\cos(x)\, e^{-(x-\pi)^2}$$

has a global maximum f = 1 at x = π (see Fig. 2.5). We can use the following Matlab/Octave code to find its global maximum. In our implementation, we have used fixed-length 16-bit strings. The probabilities of crossover and mutation are 0.95 and 0.05, respectively. As it is a maximization problem, we can use the simplest fitness function, F = f(x). The outputs from a typical run are shown in Fig.
2.6, where the top figure shows the variations of the best estimates as they approach x ≈ π, while the lower figure shows the variations of the fitness function.

% Genetic Algorithm (Simple Demo) Matlab/Octave Program
% Written by X S Yang (Cambridge University)
% Usage: gasimple or gasimple('x*exp(-x)');
function [bestsol, bestfun, count] = gasimple(funstr)
global solnew sol pop popnew fitness fitold f range;
if nargin < 1,
    % Easom Function with fmax=1 at x=pi
    funstr = '-cos(x)*exp(-(x-3.1415926)^2)';
end
range = [-10 10];  % Range/Domain
% Converting to an inline function
f = vectorize(inline(funstr));
% Generating the initial population
rand('state', 0);  % Reset the random generator
popsize = 20;      % Population size
MaxGen = 100;      % Max number of generations
count = 0;         % counter
nsite = 2;         % number of mutation sites
pc = 0.95;         % Crossover probability
pm = 0.05;         % Mutation probability
nsbit = 16;        % String length (bits)
% Generating initial population
popnew = init_gen(popsize, nsbit);
fitness = zeros(1, popsize);  % fitness array
% Display the shape of the function
x = range(1):0.1:range(2); plot(x, f(x));
% Initialize solution <- initial population
for i = 1:popsize,
    solnew(i) = bintodec(popnew(i,:));
end
% Start the evolution loop
for i = 1:MaxGen,
    % Record as the history
    fitold = fitness; pop = popnew; sol = solnew;
    for j = 1:popsize,
        % Crossover pair
        ii = floor(popsize*rand) + 1; jj = floor(popsize*rand) + 1;
        % Cross over
        if pc > rand,
            [popnew(ii,:), popnew(jj,:)] = ...
                crossover(pop(ii,:), pop(jj,:));
            % Evaluate the new pairs
            count = count + 2;
            evolve(ii); evolve(jj);
        end
        % Mutation at n sites
        if pm > rand,
            kk = floor(popsize*rand) + 1; count = count + 1;
            popnew(kk,:) = mutate(pop(kk,:), nsite);
            evolve(kk);
        end
    end  % end for j
    % Record the current best
    bestfun(i) = max(fitness);
    bestsol(i) = mean(sol(bestfun(i) == fitness));
end
% Display results
subplot(2,1,1); plot(bestsol); title('Best estimates');
subplot(2,1,2); plot(bestfun); title('Fitness');

% ------------- All subfunctions ----------
% generation of the initial population
function pop = init_gen(np, nsbit)
% String length = nsbit+1 with pop(:,1) for the sign
pop = rand(np, nsbit+1) > 0.5;

% Evolving the new generation
function evolve(j)
global solnew popnew fitness fitold pop sol f;
solnew(j) = bintodec(popnew(j,:));
fitness(j) = f(solnew(j));
if fitness(j) > fitold(j),
    pop(j,:) = popnew(j,:);
    sol(j) = solnew(j);
end

% Convert a binary string into a decimal number
function [dec] = bintodec(bin)
global range;
% Length of the string without sign
nn = length(bin) - 1;
num = bin(2:end);  % get the binary
% Sign = +1 if bin(1)=0; Sign = -1 if bin(1)=1.
Sign = 1 - 2*bin(1);
dec = 0;
% decimal place of the floating point in the binary
dp = floor(log2(max(abs(range))));
for i = 1:nn,
    dec = dec + num(i)*2^(dp-i);
end
dec = dec*Sign;

% Crossover operator
function [c, d] = crossover(a, b)
nn = length(a) - 1;
% generating a random crossover point
cpoint = floor(nn*rand) + 1;
c = [a(1:cpoint) b(cpoint+1:end)];
d = [b(1:cpoint) a(cpoint+1:end)];

% Mutation operator
function anew = mutate(a, nsite)
nn = length(a); anew = a;
for i = 1:nsite,
    j = floor(rand*nn) + 1;
    anew(j) = mod(a(j)+1, 2);
end

The above Matlab program can easily be extended to higher dimensions. In fact, there is no need to do any programming (if you prefer), because there are many software packages (either freeware or commercial) for genetic algorithms; for example, Matlab itself has an optimization toolbox. Biology-inspired algorithms have many advantages over traditional optimization methods such as steepest descent, hill climbing, and other calculus-based techniques, due to their parallelism and their ability to locate very good approximate solutions in extremely large search spaces. Furthermore, more powerful new-generation algorithms can be formulated by combining existing and new evolutionary algorithms with classical optimization methods.

Chapter 3 Ant Algorithms

From the discussion of genetic algorithms, we know that we can improve the search efficiency by using randomness, which also increases the diversity of the solutions and so helps to avoid being trapped in local optima. The selection of the best individuals is equivalent to using memory.
In fact, there are other forms of selection, such as the use of a chemical messenger (pheromone), which is common among ants, honeybees, and many other insects. In this chapter, we will discuss the nature-inspired ant colony optimization (ACO), which is a metaheuristic method.

3.1 Behaviour of Ants

Ants are social insects that live together in organized colonies whose populations can range from about 2 to 25 million. When foraging, a swarm of ants or mobile agents interact and communicate in their local environment. Each ant can lay scent chemicals, or pheromone, to communicate with others, and each ant is also able to follow a route marked with pheromone laid by other ants. When ants find a food source, they mark it with pheromone and also mark the trails to and from it. Starting from an initially random foraging pattern, the pheromone concentration varies: ants follow the routes with higher pheromone concentration, and the pheromone is in turn reinforced by the increasing number of ants. As more and more ants follow the same route, it becomes the favoured path, and some favourite routes (often the shortest or most efficient) emerge. This is a positive feedback mechanism.

Emergent behaviour exists in an ant colony, and this emergence arises from the simple interactions among individual ants. Individual ants act according to simple, local information (such as the pheromone concentration) to carry out their activities. Although there is no master ant overseeing the entire colony and broadcasting instructions to the individual ants, organized behaviour still emerges automatically. Such emergent behaviour is therefore similar to other self-organized phenomena that occur in many natural processes, such as pattern formation in animal skins (tiger and zebra skins).

The foraging pattern of some ant species (such as army ants) can show extraordinary regularity: army ants search for food along regular routes, the directions of successive raids being separated by a roughly constant angle.
We do not know how they manage to follow such regularity, but studies show that they move into an area, build a bivouac, and start foraging. On the first day, they forage in a random direction, say to the north, travel a few hundred meters, and then branch out to cover a large area. The next day, they choose a different direction, rotated by roughly the same angle from the previous day's direction, and again cover a large area; on the following day, they rotate by about the same angle once more. In this way they cover the whole area in about two weeks, and then move to a different location to build a new bivouac and forage again.

The interesting thing is that the angle they use is not one-third of a full circle; that would mean that on the fourth day they would search the area already foraged on the first day. The beauty of the angle they do use is that it leaves a small offset from the first day's direction, so that they cover the whole circle in 14 days without repeating (or covering a previously foraged area). This is an amazing phenomenon.

3.2 Ant Colony Optimization

Based on these characteristics of ant behaviour, scientists have developed a number of powerful ant colony algorithms, with important progress made in recent years; Marco Dorigo pioneered this research area in 1992. By using only some features of real ant behaviour and adding new characteristics, we can devise a class of new algorithms. The basic steps of ant colony optimization (ACO) can be summarized as the pseudo code shown in Fig. 3.1.

Two important issues here are the probability of choosing a route and the evaporation rate of pheromone. There are a few ways of addressing these issues, although this is still an area of active research. Here we introduce the current best method.
For a network routing problem, the probability that an ant at node i chooses the route from node i to node j can be written as

$$p_{ij} = \frac{\phi_{ij}^{\alpha}\, d_{ij}^{\beta}}{\sum_{j} \phi_{ij}^{\alpha}\, d_{ij}^{\beta}},$$

where α and β are influence parameters, φij is the pheromone concentration on the route between i and j, and dij is the desirability of the same route. Some knowledge about the route, such as its distance sij, is often used, so that dij ∝ 1/sij, which implies that shorter routes will be selected because of their shorter travelling time, and thus the pheromone concentrations on these routes become higher.

This probability formula reflects the fact that ants normally follow the paths with higher pheromone concentrations. In the simpler case in which the pheromone exponent is one and desirability is ignored, the probability of choosing a path is proportional to the pheromone concentration on the path. The denominator normalizes the probability so that it lies in the range between 0 and 1.

The pheromone concentration changes with time due to the evaporation of pheromone. The advantage of pheromone evaporation is that it helps the system avoid being trapped in local optima: without evaporation, the path randomly chosen by the first ants would become the preferred path through the attraction of other ants by its pheromone. For a constant rate γ of pheromone decay or evaporation, the pheromone concentration usually varies with time exponentially,

$$\phi(t) = \phi_0\, e^{-\gamma t},$$

where φ0 is the initial concentration of pheromone and t is time. If γt ≪ 1, then e^{−γt} ≈ 1 − γt. For a unitary time increment, the evaporation can therefore be approximated by φ ← (1 − γ)φ, which gives the simplified pheromone update formula

$$\phi_{ij}^{t+1} = (1-\gamma)\,\phi_{ij}^{t} + \delta\phi_{ij}^{t},$$

where γ is the rate of pheromone evaporation. The increment δφij is the amount of pheromone deposited at time t along the route from i to j when an ant travels that route; if there are no ants on a route, the pheromone deposit is zero.

There are other variations to these basic procedures.
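The route-choice probability and the simplified pheromone update can be written out as a short sketch (illustrative code, not from the text; the parameter defaults are assumptions):

```python
def route_probabilities(pheromone, desirability, alpha=2.0, beta=2.0):
    """Probability of choosing each candidate route j from the current node,
    proportional to pheromone^alpha * desirability^beta."""
    weights = [(p ** alpha) * (d ** beta) for p, d in zip(pheromone, desirability)]
    total = sum(weights)
    return [w / total for w in weights]

def update_pheromone(pheromone, deposits, gamma=0.5):
    """Simplified update: evaporate a fraction gamma, then add new deposits."""
    return [(1.0 - gamma) * p + dp for p, dp in zip(pheromone, deposits)]

# Two routes with equal pheromone but different lengths (desirability = 1/length):
probs = route_probabilities([1.0, 1.0], [1.0 / 1.0, 1.0 / 2.0])
# The shorter route gets the larger probability (0.8 vs 0.2 here).
```

The normalization in `route_probabilities` is what keeps the probabilities in the range between 0 and 1, as the text notes.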
A possible acceleration scheme is to use bounds on the pheromone concentration, and to allow only the ants with the current global best solution(s) to deposit pheromone. In addition, a ranking of solution fitness can be used. These are hot topics of current research.

3.3 Double Bridge Problem

A standard test problem for ant colony optimization is the simplest double bridge problem with two branches (see Fig. 3.2), where route (2) is shorter than route (1). The angles of the two routes are equal at both point A and point B, so that at the initial stage the ants have an equal (50-50) chance of choosing either route at point A. Initially, fifty percent of the ants go along the longer route (1). The pheromone evaporates at a constant rate on both routes, but the pheromone concentration on route (1) becomes smaller because the route is longer and thus takes more time to travel through; conversely, the pheromone concentration on the shorter route increases steadily. After some iterations, almost all the ants move along the shorter route. Figure 3.3 shows the initial snapshot of 10 ants (5 on each route) and the snapshot after 5 iterations (equivalent to 50 ants having moved along this section). In fact there are 11 ants in the snapshot; one has not yet decided which route to follow, as it has just arrived at the entrance. Almost all the ants (about 90% in this case) move along the shorter route.

Here we use only two routes at the node, but it is straightforward to extend the approach to multiple routes at a node; it is expected that only the shortest route will ultimately be chosen. As any complex network system is made of individual nodes, this algorithm can be extended to solve complex routing problems reasonably efficiently.
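The double-bridge dynamics can be illustrated with a small deterministic (mean-field) sketch, in which at each iteration the fraction of ants taking each route equals its choice probability, deposits are inversely proportional to route length, and pheromone evaporates at a constant rate (all names and parameter values here are our assumptions, not from the text):

```python
def double_bridge(length_long=2.0, length_short=1.0, gamma=0.3,
                  n_ants=50, iterations=25):
    """Mean-field double-bridge model: returns pheromone on (long, short) routes."""
    tau = [1.0, 1.0]  # equal initial pheromone -> 50-50 route choice
    for _ in range(iterations):
        total = tau[0] + tau[1]
        p = [tau[0] / total, tau[1] / total]      # choice probabilities
        deposit = [n_ants * p[0] / length_long,   # shorter route => larger deposit
                   n_ants * p[1] / length_short]
        tau = [(1 - gamma) * t + d for t, d in zip(tau, deposit)]
    return tau

tau_long, tau_short = double_bridge()
# Positive feedback: the shorter route accumulates far more pheromone,
# so almost all ants eventually choose it.
assert tau_short > tau_long
```

The positive feedback the text describes shows up directly: after a few iterations the short route's share of the total pheromone grows monotonically toward one.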
In fact, ant colony algorithms have been successfully applied to the Internet routing problem, the travelling salesman problem, combinatorial optimization problems, and other NP-hard problems.

3.4 Virtual Ant Algorithm

Since ant colony optimization has successfully solved NP-hard problems such as the travelling salesman problem, it can also be extended to solve standard optimization problems with multimodal functions. The only problem is to figure out how the ants should move on an n-dimensional hyper-surface. For simplicity, we discuss the 2-D case, which can easily be extended to higher dimensions. On a 2-D landscape, ants can move in any direction, but this causes a problem: how do we update the pheromone at a particular point, given that there are an infinite number of points? One solution is to track the history of each ant's moves and record the locations consecutively; the other approach is to use a moving neighbourhood or window, in which the ants 'smell' the pheromone concentration of their neighbourhood at their current location.

In addition, we can limit the number of directions the ants can move in by quantizing the directions. For example, ants may be allowed to move only left, right, up, and down (4 directions). We use this quantized approach here, which makes the implementation much simpler. Furthermore, the objective function or landscape can be encoded into virtual food, so that ants move toward the locations of the best food sources; this makes the search process even simpler. This simplified algorithm is called the Virtual Ant Algorithm (VAA); it was developed by Xin-She Yang and his colleagues in 2006 and has been successfully applied to topological optimization problems in engineering.

The following Keane function with multiple peaks is a standard test function. Without any constraint, it is symmetric and has two highest peaks, at (0, 1.39325) and (1.39325, 0).
To make the problem harder, it is usually optimized under two constraints. This makes the optimization difficult because the problem is now nearly symmetric about x = y and the peaks occur in pairs, one higher than the other; in addition, the true maximum lies on a constraint boundary.

Figure 3.4 shows the surface variations of the multi-peaked function. If we use 50 roaming ants and let them move around for 25 iterations, the pheromone concentrations (equivalent to the paths of the ants) are as displayed in Fig. 3.4. We can see that the highest pheromone concentration within the constraint boundary corresponds to the optimal solution.

It is worth pointing out that ant colony algorithms are the right tool for combinatorial and discrete optimization, and they have advantages over other stochastic algorithms such as genetic algorithms and simulated annealing in dealing with dynamical network routing problems. For continuous decision variables, their performance is still under active research. For the present example, about 1500 evaluations of the objective function were needed to find the global optima; this is not as efficient as other metaheuristic methods, especially compared with particle swarm optimization, partly because handling the pheromone takes time. Is it possible to eliminate the pheromone and just use the roaming ants? The answer is yes: particle swarm optimization is just the right kind of algorithm for such further modifications, and it will be discussed later in detail.

Part 2: Chinese Translation

Chapter 2 Genetic Algorithms

2.1 Introduction

The genetic algorithm is an abstract model of biological evolution based on Charles Darwin's theory of natural selection, proposed by John Holland and his colleagues in the 1960s and 1970s.
History of Genetic Algorithms

The genetic algorithm (GA) is a stochastic search and optimization algorithm that has developed rapidly in recent years; its basic ideas are drawn from Darwin's theory of evolution and Mendel's genetics.
The algorithm was created in 1975 by Professor Holland of the University of Michigan and his students. Since then, research on genetic algorithms has attracted the attention of scholars worldwide.

The genetic algorithm is a randomized search method modelled on the laws of biological evolution (survival of the fittest and the mechanisms of inheritance). Its main features are that it operates directly on structured objects, without the restrictions of differentiability or continuity of functions; it has inherent implicit parallelism and good global optimization capability; and, as a probabilistic search method, it can automatically acquire and guide the search space and adaptively adjust the search direction without requiring fixed rules. These properties have led to the wide application of genetic algorithms in combinatorial optimization, machine learning, signal processing, adaptive control, artificial life, and other fields. The genetic algorithm is a key technology in modern intelligent computation.
The basic procedure of the genetic algorithm is as follows:

a) Initialization: set the generation counter t = 0, set the maximum number of generations T, and randomly generate M individuals as the initial population P(0).

b) Individual evaluation: compute the fitness of each individual in the population P(t).

c) Selection: apply the selection operator to the population. The purpose of selection is to pass optimized individuals directly to the next generation, or to generate new individuals through paired crossover and pass those to the next generation. Selection is based on the fitness evaluation of the individuals in the population.

d) Crossover: apply the crossover operator to the population. Crossover replaces and recombines parts of the structures of two parent individuals to generate new individuals; the crossover operator plays the central role in the genetic algorithm.

e) Mutation: apply the mutation operator to the population, i.e., change the gene values at certain loci of the individual strings. After selection, crossover, and mutation, the population P(t) yields the next generation P(t+1).

f) Termination test: if t > T, output the individual with the greatest fitness obtained during the evolution as the optimal solution, and stop.
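Steps a) through f) can be sketched as a minimal binary GA in Python (an illustrative sketch: the OneMax fitness, operator details, and parameter values are our assumptions, not part of the text):

```python
import random

def genetic_algorithm(fitness, n_bits=16, pop_size=20, T=60, pc=0.9, pm=0.02):
    """Minimal GA following steps a)-f): init, evaluate, select, crossover, mutate."""
    # a) Initialization: t = 0, random population P(0)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for t in range(T):                               # f) stop after T generations
        scores = [fitness(ind) for ind in pop]       # b) individual evaluation
        total = sum(scores)
        def select():                                # c) fitness-proportionate selection
            r = random.uniform(0, total)
            acc = 0.0
            for ind, s in zip(pop, scores):
                acc += s
                if acc >= r:
                    return ind
            return pop[-1]
        nxt = []
        while len(nxt) < pop_size:
            a, b = select()[:], select()[:]
            if random.random() < pc:                 # d) single-point crossover
                p = random.randint(1, n_bits - 1)
                a, b = a[:p] + b[p:], b[:p] + a[p:]
            for child in (a, b):
                for i in range(n_bits):              # e) bit-flip mutation
                    if random.random() < pm:
                        child[i] = 1 - child[i]
                nxt.append(child)
        pop = nxt[:pop_size]
        best = max(pop + [best], key=fitness)        # keep the best found so far
    return best

random.seed(1)
# Stand-in fitness: count of 1-bits ("OneMax") plus 1 to keep it positive;
# the global optimum is the all-ones string.
solution = genetic_algorithm(lambda bits: sum(bits) + 1)
```

Keeping `best` across generations is the elitism mentioned in the English text above; without it, a good solution can be lost to the randomness of selection.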
In 1967, Holland's student J. D. Bagley first coined the term "genetic algorithms" in his doctoral dissertation. Holland subsequently supervised several more dissertations on genetic algorithm research.
… the frequency with which two genes are cotransduced.

• 9.1 Bacterial Mutation and Growth
Bacterial Phenotypes
• To do genetics, we need phenotypic variation.
• Prior to 1943: the adaptation hypothesis vs. spontaneous mutations.
• Phenotypes: morphology / resistance / prototroph (auxotroph) / …
• In the nonintegrated state, F can pass into F-free cells during cell conjugation.
• When F is integrated, the bacterial chromosome is transferred linearly to an F-free cell during conjugation.
9.1 Bacterial Mutation and Growth
9.2 Genetic Recombination in Bacteria: Conjugation
9.3 Rec Proteins and Bacterial Recombination
9.4 F Factors and Plasmids
9.5 Bacterial Transformation
9.6 The Genetic Study of Bacteriophages
• During specialized transduction, specific genes near the phage-integration sites on the bacterial chromosome are mistakenly incorporated into the phage genome and transferred to other cells by infection.
GA (Genetic Algorithm)

$$\mathrm{rel}_i = \frac{f_i}{\sum_{i=1}^{N} f_i}$$

where f_i is the fitness of the i-th individual in the population and N is the population size. The larger rel_i is, the more likely it is that individual X_i will be selected and copied into the next generation. The commonly used genetic selection operators include the following:
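Fitness-proportionate (roulette-wheel) selection based on rel_i can be sketched as follows (illustrative code; the function names are ours):

```python
import random

def relative_fitness(fitnesses):
    """rel_i = f_i / sum_j f_j for a population of non-negative fitnesses."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    rel = relative_fitness(fitnesses)
    r = random.random()
    acc = 0.0
    for individual, p in zip(population, rel):
        acc += p
        if r <= acc:
            return individual
    return population[-1]   # guard against floating-point round-off

rel = relative_fitness([1.0, 3.0, 4.0])
# rel == [0.125, 0.375, 0.5]: the fittest individual is chosen half the time.
```

This is exactly the scheme implied by rel_i above: an individual with twice the relative fitness occupies twice the arc of the "roulette wheel".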
2020/2/19
Basic Flow of the Genetic Algorithm
(3) Fitness function

Designing the fitness function simulates natural selection and is the basis for the genetic operations; its evaluation guides the genetic operators. The value of the fitness function is the fitness. Because the selection probability defined below is based on fitness, the fitness must be non-negative.

Method 1: for an optimization problem that maximizes the objective function, the transformation is

$$F(X) = \begin{cases} f(X) + C_{\min}, & \text{if } f(X) + C_{\min} > 0 \\ 0, & \text{otherwise} \end{cases}$$
(1) Single-point crossover. Single-point crossover, also called simple crossover, works as follows: a crossover point is set at random in the individuals' gene strings; at crossover, the parts of the two individuals before (or after) that point are exchanged, generating two new individuals. When the gene string has length n, there are n−1 possible crossover positions.

The single-point crossover operator proceeds as follows:

I. Randomly pair up the individuals of the population. If the population size is M, there are ⌊M/2⌋ pairs of individuals.
Basic Flow of the Genetic Algorithm

Commonly used forms of mutation:

(1) Basic mutation operator. The basic mutation operator applies to binary gene strings. It randomly picks C gene positions in a string and inverts the value at each of them with mutation probability P, i.e., 0 becomes 1 and 1 becomes 0. When C = 1, a single gene value is inverted. An example of basic bit mutation:

Basic bit mutation A: 1010 1 01010 (the separated bit marks the mutation site)
English for Artificial Intelligence Majors, Unit 4

Section B: Genetic Algorithms
II. Choose the best answer to each of the following questions according to the text.
1. Which of the following probabilistic operators do we apply in order to generate the next generation of individuals?
   A. Selection  B. Crossover  C. Mutation  D. All of the above
2. Which of the following is guaranteed?
   A. The best node (that is, the one with the lowest heuristic value) is always in the middle of the list.
   B. The best node (that is, the one with the lowest heuristic value) is always at the end of the list.
   C. The best node (that is, the one with the lowest heuristic value) is always at the beginning of the list.
   D. None of the above
Genetic Algorithms

5.3.2 Genetic Algorithms: Research Topics

• Performance analysis. Performance analysis has always been one of the most important topics in genetic algorithm research. Choosing the control parameters of a genetic algorithm, such as the population size and the probabilities of the crossover and mutation operators, is very difficult, yet these are indispensable experimental parameters. Genetic algorithms also suffer from premature convergence: the final result does not always reach the optimal solution, and how to prevent premature convergence is one of the questions of greatest interest. In addition, to broaden the range of applications of genetic algorithms, new chromosome representations and new genetic operators are continually being studied.
(3) Set t = 0 and randomly select N chromosomes to initialize the population P(0);
(4) define the fitness function f (f > 0);
(5) compute the fitness of each chromosome in P(t);
(6) t = t + 1;
(7) apply the selection operator to obtain P(t) from P(t−1);
(8) let each chromosome in P(t) take part in crossover with probability Pc;
(9) let the genes in the chromosomes take part in mutation with probability Pm;
(10) check whether the population satisfies the preset termination criterion; if not, return to (5).
…, an Austrian, the founder of genetics.

"As you sow, so shall you reap"; "dragons beget dragons, phoenixes beget phoenixes, and the son of a mouse digs holes" (like begets like).

In nature, the basic structural and functional unit of living organisms is the cell. A cell contains a complex, minute thread-like compound that carries all the hereditary information; it is called a chromosome. Within a chromosome, the hereditary information is made up of genes; genes determine an organism's traits and are the basic units of heredity. The chromosome has a double-helix structure, and its main constituent material is deoxyribonucleic acid (DNA); each gene occupies a definite position on the long DNA chain. The totality of the hereditary information carried by all the chromosomes in a cell is called a genome. During cell division, the genetic material DNA is copied and transferred to the new cells, realizing biological heredity.
Genetic Algorithms in Detail and Their Applications

The genetic algorithm (GA) is an algorithm that simulates natural selection and the processes of heredity.
It has been widely applied in artificial intelligence and optimization. This article introduces the basic principles and the optimization process of the genetic algorithm in detail, and discusses its value and limitations in practical applications.

1. Basic principles of the genetic algorithm

The basic principle of the genetic algorithm is to search for the optimal solution of a problem by simulating the process of biological evolution. In a genetic algorithm, good solutions (also called individuals) have a higher probability of survival during evolution, while poor solutions are quickly eliminated. Each individual consists of a number of genes, and each gene represents a particular problem parameter or state. Through the genetic algorithm we can find the optimal solution of the problem, or one of its near-optimal solutions.

The basic flow of the genetic algorithm is as follows:

1. Population initialization: first, a set of initial solutions is generated at random as the individuals of the population. These individuals are called chromosomes, and each chromosome is composed of genes; the population can thus be regarded as a collection of chromosomes.

2. Selection: selection picks some individuals from the population to breed offspring. Its purpose is to let good individuals leave more descendants and so raise the average fitness of the next generation. Common selection schemes include roulette-wheel selection, tournament selection, and ranking selection.

3. Crossover: crossover randomly picks genes from two individuals and exchanges them to generate new chromosomes. For example, exchanging the genes after the third position of chromosomes A and B produces two new chromosomes.

4. Mutation: mutation randomly changes individual genes in a chromosome to increase diversity. For example, randomly changing the third gene of chromosome A produces a new chromosome A'.

5. Fitness evaluation: fitness evaluation assigns each individual a fitness score given by the objective or optimization function of the problem.
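Among the selection schemes named in step 2, tournament selection is especially simple to sketch (illustrative code; the tournament size k is an assumed parameter):

```python
import random

def tournament_select(population, fitness, k=3):
    """Pick k individuals at random and return the fittest of them."""
    contestants = random.sample(population, k)
    return max(contestants, key=fitness)

random.seed(0)
pop = list(range(10))                                  # toy individuals 0..9
winner = tournament_select(pop, fitness=lambda x: x)   # fitness = identity
# The winner is the largest of the k sampled individuals.
```

Larger k raises the selection pressure: with k equal to the population size, the global best always wins; with k = 1, selection is uniform.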
Gene Pool Recombination in Genetic Algorithms
Heinz Mühlenbein, GMD, 53754 St. Augustin, Germany, muehlenbein@gmd.de
Hans-Michael Voigt, T.U. Berlin, 13355 Berlin, Germany, voigt@fb10.tu-berlin.de
Abstract: A new recombination operator, called Gene Pool Recombination (GPR), is introduced. In GPR, the genes are randomly picked from the gene pool defined by the selected parents. The mathematical analysis of GPR is easier than for the two-parent recombination (TPR) normally used in genetic algorithms. There are difference equations for the marginal gene frequencies that describe the evolution of a population for a fitness function of size. For simple fitness functions TPR and GPR perform similarly, with a slight advantage for GPR. Furthermore, the mathematical analysis shows that a genetic algorithm with only selection and recombination is not a global optimization method, in contrast to popular belief.
Keywords: difference equations, genetic algorithms, Hardy-Weinberg equilibrium, recombination.
1. Introduction

Genetic algorithms (GAs) use at least three different components for guiding the search to an optimum: selection, mutation, and recombination. Understanding the evolution of genetic populations is still an important problem for biology and for scientific breeding. Mühlenbein and Schlierkamp-Voosen (1993, 1994) have introduced classical approaches from population genetics, the science of breeding, and statistics to analyze genetic algorithms. They describe the evolution of genetic populations as a dynamical system by difference or differential equations. Analyzing GAs that use both recombination and selection turns out to be especially difficult. The problem is that the mating of two genotypes creates a complex linkage between genes at different loci. This linkage is very hard to model and represents the major problem in population genetics (Naglyaki 1992).

For simple linear fitness functions we have found approximate solutions to the equations that describe the evolution of a genetic population through selection and recombination. Looking carefully at the assumptions leading to the approximation, we found that the equations obtained would be exact if a different recombination scheme were used. This recombination scheme we call gene pool recombination (GPR). In GPR, for each locus the two alleles to be recombined are chosen independently from the gene pool defined by the selected parent population. The biologically inspired idea of restricting the recombination to the alleles of two parents for each offspring is abandoned. The latter recombination we will call two-parent recombination (TPR).

The idea of using more than two parents for recombination is not new. Already Mühlenbein (1989) used eight parents; the offspring allele was obtained by a majority vote. Multi-parent recombination has also been investigated recently by Eiben, Raue and Ruttkay (1994), though their results are somewhat inconclusive. For binary functions the bit-based simulated crossover (BSC) of Syswerda (1993) is similar to GPR. However, his implementation merged selection and recombination. An implementation of BSC which separates selection and recombination was empirically investigated by Eshelman and Schaffer (1993). GPR is an extension of BSC; it can be used for any representation, discrete or continuous. In this paper we will investigate TPR and GPR for discrete binary functions. It will be shown that GPR is easier to analyze than TPR. Furthermore, it converges faster. Nevertheless, in many cases TPR can be considered as an approximation to GPR.
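For binary strings, gene pool recombination as described here can be sketched in a few lines (our illustrative code, not the authors' implementation): each offspring allele is drawn from the alleles that the selected parents carry at that locus, whereas two-parent recombination restricts the draw to just two parents.

```python
import random

def gene_pool_recombination(selected_parents):
    """Create one offspring: at each locus, pick an allele at random from
    the gene pool (the selected parents' alleles at that locus)."""
    n_loci = len(selected_parents[0])
    return [random.choice([p[i] for p in selected_parents])
            for i in range(n_loci)]

def two_parent_recombination(a, b):
    """TPR (uniform variant): each allele comes from one of just two parents."""
    return [random.choice(pair) for pair in zip(a, b)]

random.seed(0)
parents = [[0, 0, 1], [0, 1, 1], [1, 1, 1]]
child = gene_pool_recombination(parents)
# Each child allele appears at that locus in at least one selected parent.
```

Under GPR the offspring allele frequencies equal the marginal frequencies of the selected pool, which is what makes the difference-equation analysis in the paper tractable.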
2. Response to Selection

In this section we summarize the theory presented in Mühlenbein and Schlierkamp-Voosen (1993, 1994). Let M(t) be the average fitness of the population at generation t. The response to selection is defined as

$$R(t) = M(t+1) - M(t). \tag{1}$$

The amount of selection is measured by the selection differential

$$S(t) = M_s(t) - M(t), \tag{2}$$

where M_s(t) is the average fitness of the selected parents. The equation for the response to selection relates R(t) and S(t):

$$R(t) = b_t \cdot S(t). \tag{3}$$

The value b_t is called the realized heritability. For many fitness functions and selection schemes, the selection differential can be expressed as a function of the standard deviation of the fitness of the population. For truncation selection (selecting the best individuals) and for normally distributed fitness, the selection differential is proportional to the standard deviation (Falconer 1981), S(t) = I σ_p(t), where I is the selection intensity. For normally distributed fitness, the famous equation for the response to selection is obtained (Falconer 1981):

$$R(t) = I \cdot b_t \cdot \sigma_p(t). \tag{4}$$

The above equation is valid for a large range of distributions, not just for a normal distribution. The response depends on the selection intensity, the realized heritability, and the standard deviation of the fitness distribution. In order to use the above equation for prediction, one has to estimate b_t and σ_p(t). The equation also gives a design goal for genetic operators: to maximize the product of heritability and standard deviation. In other words, if two recombination operators have the same heritability, the operator creating an offspring population with the larger standard deviation is to be preferred. The equation likewise defines a design goal for a selection method: to maximize the product of selection intensity and standard deviation. In simpler terms, if two selection methods have the same selection intensity, the method giving the higher standard deviation of the selected parents is to be preferred. For proportionate selection, as used by the simple genetic algorithm (Goldberg 1989), it was shown by Mühlenbein and Schlierkamp-Voosen (1993) that
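The quantities R(t), S(t), and b_t can be illustrated with a small numeric example (the fitness values are made up for illustration):

```python
def selection_differential(fitnesses, selected):
    """S(t): mean fitness of the selected parents minus the population mean."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(selected) - mean(fitnesses)

def realized_heritability(response, differential):
    """b_t = R(t) / S(t): the fraction of the selection differential that is
    actually realized as response in the next generation."""
    return response / differential

# Made-up example: population mean is 10; truncation selection keeps the
# two best individuals, whose mean is 13, so S = 3.
fitnesses = [6.0, 8.0, 10.0, 12.0, 14.0]
selected = [12.0, 14.0]
S = selection_differential(fitnesses, selected)   # S = 13 - 10 = 3
# Suppose the offspring mean turns out to be 12, so R = 12 - 10 = 2:
b = realized_heritability(2.0, S)                 # b_t = 2/3
```

A heritability of 2/3 here means two-thirds of the selection advantage of the parents was passed on to the offspring generation, which is exactly how Eq. (3) is used for prediction.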