Mutation-selection equilibrium in finite populations playing a Hawk–Dove game. Communications in Nonlinear Science and Numerical Simulation, 25(1–3), pp. 66–73.

We study the evolution of a finite population playing a Hawk-Dove game with mixed strategies. Players have a fixed strategy and their offspring inherit the parental strategy, with a probability u of mutating to another strategy. Payoff in the game is the only variation in fitness among individuals, and a selection coefficient δ measures the importance of the game in the overall fitness. Population evolution is carried out through a Moran process. We compare our numerical simulations with theoretical predictions in earlier work by Tarnita et al. (2009). Our results show that the effect of selection on the abundances of favored strategies is nonlinear, being less intense as δ increases. The mutation rate u has an opposite and stronger effect to that of selection. Heuristic theoretical arguments are given in order to explain this nonlinear relationship.


Introduction
Evolutionary game theory models a population of individuals interacting in a game, each playing different strategies. Each player has a fixed strategy. The payoff of every player will be an average of the payoffs obtained from the games played with every other individual. Payoff is interpreted as fitness, meaning that individuals with higher payoff reproduce faster, and outcompete players of worse strategies. Therefore, the fitness of an individual depends on the composition of the population at a certain moment of time. This is called frequency-dependent selection.
The study of evolutionary games has traditionally been done considering populations of infinite size, where stochastic effects due to "sampling error" are not considered. Many of these models use the replicator equation [1].
However, although populations are sometimes large enough to study them through the replicator equation, there are many cases in which stochastic effects due to small population sizes are very relevant to the evolution of the population.
When considering evolutionary games in finite populations, evolutionary updating must be done through stochastic methods. Although there are many possible approaches to this study, e.g. the Wright-Fisher process [2] or the pairwise comparison process [3], we will focus here on the Moran process [4].
The goal of this article is to study the stochastic evolutionary dynamics of well-mixed, finite-sized populations playing a Hawk-Dove game. We will work on a model developed previously by Corina Tarnita and collaborators [5]. Tarnita et al. developed analytical expressions for the average abundance of any strategy in the population in the limit of weak selection, for arbitrary mutation rates. They observed a linear effect of selection on the abundances of different strategies. We examined the model numerically for stronger levels of selection, finding a nonlinear effect.
The organization of this paper is as follows. In Sec. 2, we present the model used throughout the article, with some of the theoretical approximations developed in [5]. In Sec. 3, we present some numerical results obtained from this model. Heuristic arguments for the behaviour of the model are given in Sec. 4. Finally, a brief discussion is presented in Sec. 5.

Model description
The model developed in this article is based on a previous work [5], where a finite population of individuals playing a game is considered. There are n pure strategies, and the payoff that strategy i gets after playing against strategy j is given by the ijth element of the n × n payoff matrix A. In this game, every player has a mixed strategy, choosing to play a pure strategy i with probability p_i. A mixed strategy is defined by a stochastic vector p = (p_1, ..., p_n), with 0 ≤ p_i ≤ 1 and p_1 + p_2 + ... + p_n = 1. The pure strategies are those in which some p_i = 1 (and all other components are therefore zero). The payoff of strategy p playing strategy q is A(p, q) = p A q^T.
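The bilinear payoff A(p, q) = p A q^T can be illustrated with a few lines of Python. This is only a sketch: the function name `mixed_payoff` is hypothetical, and the 2 × 2 matrix used in the example is the Hawk-Dove matrix with b = 2 and c = 5 that the paper introduces later, entered here by hand.

```python
def mixed_payoff(p, A, q):
    """Expected payoff A(p, q) = p A q^T of mixed strategy p against q.

    p and q are probability vectors of length n; A is an n x n list of rows.
    """
    n = len(p)
    return sum(p[i] * A[i][j] * q[j] for i in range(n) for j in range(n))


# Illustrative matrix (assumption for the example): Hawk-Dove with b = 2, c = 5,
# rows and columns ordered (hawk, dove).
A_example = [[-1.5, 2.0],
             [0.0, 1.0]]

if __name__ == "__main__":
    p = [0.4, 0.6]  # plays hawk with probability 0.4
    print(mixed_payoff(p, A_example, p))
```

A pure strategy is recovered by passing a unit vector, e.g. `[1.0, 0.0]` for a player that always plays the first strategy.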
The population size is given by N and its evolution follows a Moran process. In each generation, one individual is chosen for reproduction and another for elimination, so the total population size remains constant. In our model, the reproducing individual is chosen proportionally to its fitness. The fitness of player i (with i = 1, ..., N indexing the individuals in the population) depends on the average payoff that its strategy p_i = (p_i1, ..., p_in) gets after playing every other strategy in the population, and on δ, the intensity of selection, i.e., the relevance of the game played in the overall fitness of the individuals.
This model considers mutation. We represent the probability of mutation, the mutation rate, as u. Thus, the offspring of the chosen individual inherits the strategy of the parent with probability 1 − u. With probability u, it mutates, choosing one mixed strategy uniformly at random from all the possible strategies. Then, another individual is chosen randomly to die. The chosen individual can be the same one that was chosen for reproduction.
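The update rule described above can be sketched in Python for the two-strategy (Hawk-Dove) case, where each individual is represented by its probability p of playing hawk. This is a minimal illustration under stated assumptions, not the authors' code: the fitness form 1 + δ × (mean payoff against the whole population, self included) is an assumption, since the fitness expression is not reproduced here, and all function names are hypothetical.

```python
import random

B, C = 2.0, 5.0  # benefit and cost; values taken from the Results section


def payoff(p, q, b=B, c=C):
    """Expected payoff of playing hawk w.p. p against an opponent playing hawk w.p. q."""
    return (p * q * (b - c) / 2            # hawk meets hawk
            + p * (1 - q) * b              # hawk meets dove
            + (1 - p) * (1 - q) * b / 2)   # dove meets dove (dove vs hawk pays 0)


def moran_step(pop, delta, u, rng):
    """One birth-death event of the Moran process; modifies pop in place."""
    n = len(pop)
    # ASSUMPTION: fitness = 1 + delta * mean payoff, averaged over the whole
    # population including the focal individual. Weights must stay positive,
    # which holds for the small delta values used here.
    fitness = [1.0 + delta * sum(payoff(p, q) for q in pop) / n for p in pop]
    parent = rng.choices(range(n), weights=fitness, k=1)[0]
    # With probability u the offspring draws a new strategy uniformly from [0, 1).
    child = rng.random() if rng.random() < u else pop[parent]
    # The individual that dies is uniform over the population (may be the parent).
    pop[rng.randrange(n)] = child


def simulate(n=10, delta=0.1, u=0.1, steps=10_000, seed=0):
    """Run the process and return every strategy seen at every generation."""
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(n)]
    history = []
    for _ in range(steps):
        moran_step(pop, delta, u, rng)
        history.extend(pop)
    return history
```

Recording the whole population after every step keeps the sketch simple; a longer run would normally discard an initial transient before collecting statistics.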
We focus on estimating the abundance of each strategy through time, in order to discover which strategies are being favored by selection. To this end, we run the process long enough, recording the strategies present in the population at every generation. We then estimate the stationary abundances of every strategy by dividing the [0, 1] interval into a finite number of segments and counting how many times strategies appear in each segment in our data. This count is then divided by the average count over all the segments, giving the estimated stationary abundance in each segment relative to the average abundance, which is the quantity we plot.
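The binning step can be sketched as follows. The function name `relative_abundance` and the default of 20 segments are illustrative assumptions; the number of segments actually used is not stated in the text.

```python
def relative_abundance(samples, bins=20):
    """Histogram of strategies in [0, 1], normalised by the mean count per bin.

    A value of 1 means a segment is exactly as abundant as the average segment.
    Assumes samples is non-empty.
    """
    counts = [0] * bins
    for p in samples:
        k = min(int(p * bins), bins - 1)  # p = 1.0 falls into the last segment
        counts[k] += 1
    mean_count = sum(counts) / bins
    return [c / mean_count for c in counts]
```

By construction the returned values average to 1, so a flat line at 1 corresponds to neutral evolution and values above 1 mark segments favored by selection.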

Weak selection
In the context of this model, weak selection, i.e. δ → 0, means that the relevance of this game in the total fitness of the players is small. In other words, there are many components that affect fitness, and the game being played is just one of them. In Ref. [5], the authors developed theoretical results for this model in the case of weak selection. Using a perturbative method employed in previous work [6], they found the theoretical equilibrium abundance of strategy p with respect to the average abundance, Eq. (2). In that expression, ||S_n|| = √n/(n − 1)! is the abundance mean and µ = Nu is the rate of mutation in a population of size N. The quantities L_p and H_p give the conditions needed for p to be favored by selection in the case of low and high mutation, respectively. For an arbitrary mutation rate µ, p is favored by selection if and only if L_p + µH_p > 0.
All these results hold for large, but finite population sizes, 1 ≪ N ≪ 1/u.

Hawk-Dove game
In order to numerically simulate this model, we use the Hawk-Dove game. This game was first presented by John Maynard Smith and George Price in 1973 [7]. It has two pure strategies, and is thus a particular case of the general case with n strategies. The two strategies are hawks and doves. We can think of this game as an intra-population fight for resources, partners, or any other conflict. Whenever a hawk encounters another individual, it will fight its opponent, independent of the opponent's strategy. Doves, on the contrary, retreat when the opponent escalates the fight. The benefit of winning a fight is given by b, and the cost of injury in a fight is c. When two hawks (H) meet, there is a 0.5 probability that either one wins the fight, as they are both equally strong. Therefore, the average payoff is (b − c)/2. When a hawk meets a dove (D), it wins the fight, and the payoff is b. When a dove meets a hawk, its payoff is 0, since it retreats. Finally, when two doves meet, one wins and the other loses without injury. The payoff is b/2.
The payoff matrix, with rows and columns ordered (H, D), is thus

$$A = \begin{pmatrix} \dfrac{b-c}{2} & b \\[4pt] 0 & \dfrac{b}{2} \end{pmatrix}.$$

Normally, and in our model as well, 0 < b < c. Since b < c, if everyone else plays hawk, it is better to play dove, and vice versa. This means that there is no strict Nash equilibrium (a Nash equilibrium is such that no player can improve its payoff by changing its strategy [9]). As a result, hawks and doves can coexist. At the equilibrium, the frequency of hawks is given by b/c. Thus, if c ≫ b, the equilibrium frequency of hawks will be small.
If we consider mixed strategies that play hawk with probability p and dove with probability 1 − p, then the evolutionarily stable strategy (ESS), defined as the strategy that has maximum fitness when adopted by the majority of the population and is therefore uninvadable by new mutants [8], is the mixed strategy that plays hawk with probability p* = b/c. No other strategy can invade this ESS if there is no mutation.
Strategy p = [p, 1 − p] is a function of p only, and thus every mixed strategy can be described just by the parameter p. The strategies are then confined to the [0, 1] interval.
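The claim that p* = b/c is the ESS can be checked directly from the payoffs listed above. The following worked derivation is a sketch that uses only those four payoffs; the scalar notation A(p, q), for a player using hawk with probability p against an opponent using hawk with probability q, is introduced here for convenience.

```latex
\begin{align*}
A(p,q) &= pq\,\frac{b-c}{2} + p(1-q)\,b + (1-p)(1-q)\,\frac{b}{2}
        = \frac{b}{2} + \frac{b}{2}(p-q) - \frac{c}{2}\,pq, \\[4pt]
A(p,p^{*}) &= \frac{b}{2}\Bigl(1-\frac{b}{c}\Bigr)
        \quad\text{for every } p \text{ when } p^{*}=\frac{b}{c}, \\[4pt]
A(p^{*},q) - A(q,q) &= \frac{c}{2}\Bigl(q-\frac{b}{c}\Bigr)^{2} > 0
        \quad\text{for every } q \neq p^{*}.
\end{align*}
```

The second line says that every strategy earns the same payoff against p*, so no mutant does strictly better than the resident. The third line says that p* does strictly better against any mutant q than q does against itself, which is the stability condition of Maynard Smith. With b = 2 and c = 5 this gives p* = 0.4.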
From Ref. [5], the condition that strategy p = [p, 1 − p] is favored for an arbitrary mutation rate, L_p + µH_p > 0, becomes a quadratic inequality in p. It describes a parabola, whose tip is given by

$$\hat{p} = \frac{b}{c} + \frac{\mu\,(2b - c)}{4c}. \qquad (7)$$

Note that for µ = 0, p̂ = p*, as expected. As µ increases, the tip of the parabola is "pushed" towards the closest pure strategy. Thus, if p* < 1/2, we have p̂ < p*, and similarly p̂ > p* if p* > 1/2. As µ → ∞, the most favored strategy is one of the two pure strategies. We will study this effect of mutation through numerical simulations.
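The position of the tip can be worked out explicitly. The sketch below assumes that the quantities L_p and H_p of Ref. [5] take the integral forms written in its first two lines; these definitions are not reproduced in the text above, so they should be read as an assumption rather than as a quotation. The integrals run over [0, 1], consistent with working in one dimension.

```latex
\begin{align*}
L_p &= \int_0^1 \bigl[A(p,p) + A(p,q) - A(q,p) - A(q,q)\bigr]\,dq
      && \text{(assumed form)} \\
H_p &= \int_0^1\!\!\int_0^1 \bigl[A(p,q) - A(r,q)\bigr]\,dq\,dr
      && \text{(assumed form)} \\[4pt]
\text{with}\quad A(p,q) &= \frac{b}{2} + \frac{b}{2}(p-q) - \frac{c}{2}\,pq : \\[4pt]
L_p &= -\frac{c}{2}\,p^{2} + b\,p + \frac{c}{6} - \frac{b}{2}, \qquad
H_p = \Bigl(p-\frac{1}{2}\Bigr)\frac{2b-c}{4}, \\[4pt]
\frac{d}{dp}\bigl(L_p + \mu H_p\bigr) &= -c\,p + b + \mu\,\frac{2b-c}{4} = 0
\;\;\Longrightarrow\;\;
\hat{p} = \frac{b}{c} + \frac{\mu\,(2b-c)}{4c}.
\end{align*}
```

For b = 2 and c = 5 this reduces to p̂ = 0.4 − 0.05µ. With N = 10, the mutation rates u = 0.01 and u = 0.1 correspond to µ = Nu = 0.1 and µ = 1, giving p̂ = 0.395 and p̂ = 0.35, the two values quoted in the caption of Fig. 3.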
Finally, the expected abundance of strategy p follows from Eq. (2) and is given by Eq. (8), where ||S_1|| = 1 because we are considering x_p as a function of the scalar p instead of the vector p = [p, 1 − p] and, therefore, we are working in one dimension. When comparing with the results from numerical simulations of the model, Ref. [5] showed that Eq. (8) approximated the simulations quite closely for low values of δ. We will use Eq. (8) to compare the numerical simulations of the model to the theoretical curves, and also to observe the divergence between those two approaches as the intensity of selection δ increases.

Results
Here we analyze the effect of selection δ and mutation u on the abundances of the different strategies in the [0, 1] interval playing a Hawk-Dove game. In our numerical simulations, the benefit b is 2 and the cost c is 5. Different choices of b and c will only move the optimal strategy towards the pure strategy p_1 = [1, 0] or the pure strategy p_2 = [0, 1].

The effect of selection δ
The value of δ in this model represents, as discussed in the previous section, the influence of the game being played on total fitness. Biological organisms are confronted with different challenges during their lifetime that affect their probability of survival and, as a result, their contribution to the population's offspring. Lower δ values mean that the game being played has little relevance when compared to all the other factors that affect fitness, e.g., environment, availability of mates, illnesses, and so on. Higher δ values mean that the game is much more important than all the other factors combined and that these other factors are less influential to fitness. In our model, we compare individuals who differ only in the strategy they use in the Hawk-Dove game, but are equal in all the other aspects. In other words, all of them are equally susceptible to illnesses, environmental changes, predation, and so on.
In all cases, strategies' abundances form a parabola when represented against p, with the optimal strategy being more favored as δ increases.
However, the accuracy of the theoretical parabola, as defined by Eq. (8), decreases as δ increases. This is not surprising, as in Ref. [5] the authors acknowledged that their approach was useful only when δ → 0. As δ increases, the theoretical curve for a certain value of δ is closer to the numerical simulations for a higher value of δ. This can be observed in Fig. 1, where the theoretical curve for δ = 0.5 is actually closer to the numerical data for δ = 0.7. In other words, the theoretical model overestimates the effect of selection: for high values of δ, the model predicts that the effect will be more intense than what is actually observed.

This inaccuracy is due to the fact that the theoretical approximation developed in Ref. [6] is based on a perturbative method which only considers the first terms in the power expansion, namely, those linear in δ. As δ increases, the accuracy of the perturbation method is reduced. This could be corrected using another equation to fit the data, as can be seen in Fig. 2. The theoretical approximation predicts a linear relationship between δ and the abundance of any strategy (see Eq. (2)). The fitted curve shown in Fig. 2 assumes that this relationship is logarithmic, that is, y = 1 + ln(1 + 0.349δ). This expression captures the diminishing effect of δ as it increases.
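The diminishing effect of δ in the fitted curve can be made concrete by comparing it with its own first-order expansion, 1 + 0.349δ. Note that this linearisation is taken from the fit itself and is used here only for illustration; it is not the theoretical line of Eq. (2), whose slope is not given in the text. The function names are hypothetical.

```python
import math


def log_fit(delta, a=0.349):
    """Logarithmic fit quoted for Fig. 2: y = 1 + ln(1 + a * delta)."""
    return 1.0 + math.log(1.0 + a * delta)


def linearised_fit(delta, a=0.349):
    """First-order expansion of log_fit around delta = 0 (illustrative only)."""
    return 1.0 + a * delta


if __name__ == "__main__":
    for delta in (0.1, 0.3, 0.5, 0.7):
        gap = linearised_fit(delta) - log_fit(delta)
        print(f"delta={delta:.1f}  log={log_fit(delta):.4f}  "
              f"linear={linearised_fit(delta):.4f}  gap={gap:.4f}")
```

Because ln(1 + x) < x for x > 0, the gap between the two curves is positive and grows with δ, which is the qualitative behaviour described above: a linear law increasingly overestimates the abundance as selection gets stronger.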

The effect of mutation u
Mutation is the originator of diversity. When a population is dominated by one strategy, it is mutation that generates a new strategy and, as a result, a new selection process starts. Therefore, the effect of mutation is opposed to that of selection. Where selection eliminates variation, choosing among the optimal strategies, mutation creates more and more diversity. The combined effect of these forces leads to what is known as a mutation-selection equilibrium.
If u is very low, the effect of selection is stronger and, as a result, the favored strategies are more abundant in the population, as can be seen in Fig. 3, which represents the effect of changing u for δ = 0.5. As u increases, mutation dilutes the effect of selection and abundances get closer to the average. The effect of selection when u = 0.01 is much more pronounced than when u = 0.1, although the theoretical curves do not fit the numerical results well (due to the high δ value).
Besides, as u increases, the optimal strategy is "pushed" towards p = 0, as described in Eq. (7) and in the previous section. The previous theoretical model [5] is valid for any mutation rate and, therefore, its predictions match up with the numerical simulations.

Maximum abundance
We can study how the effect of selection is diminished by an increase in u. The abundance values of the optimal strategies decay exponentially with u, as shown in Fig. 4. This means that, in this model, mutation is stronger than selection, as the abundances increase with selection logarithmically but decrease with mutation exponentially. The strength of mutation in this model is a consequence of the way in which mutation has been introduced. When an individual mutates, it chooses a new strategy at random from the interval [0, 1]. This is a very strong way to model mutation. Depending on the context of the model, we could introduce mutation in a weaker form: whenever an individual mutates, it chooses a new strategy at random from the subinterval [p − i, p + i], with p being the parental strategy. This way, the strength of mutation would increase with i. Another possibility would be to choose strategies from the [0, 1] interval following a Normal distribution with mean p, the parental strategy. The relationships between abundances and mutation rates u could then follow different expressions that could be explored in further work. Such an approach has been carried out recently [10].
A summary of the combined effect of selection and mutation can be seen in Fig. 5. We can see the abundance values for the optimal strategies for different values of δ and u in Fig. 5(a). As seen earlier in other figures, the abundance values for the optimal strategies tend to increase with δ and decrease with u, with the effect of u being stronger. We can see the standard deviation for the data in the simulations for different values of δ and u in Fig. 5(b). Higher δ values lead to lower dispersion in the data, as the optimal strategies become more common. Higher u values lead to higher dispersion, as mutation tends to make the abundances of all strategies uniform. When both u and δ are high, the effect of u is more pronounced.
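The three mutation schemes mentioned above can be sketched as interchangeable functions. This is an illustration only: the function names, the parameters `width` and `sigma`, and the way the boundaries of [0, 1] are handled (clipping the subinterval, and redrawing Normal samples that fall outside) are assumptions, since the text does not specify them.

```python
import random


def mutate_global(parent, rng):
    """Scheme used in the model: new strategy uniform on [0, 1), parent ignored."""
    return rng.random()


def mutate_local(parent, width, rng):
    """Weaker scheme: uniform on [parent - width, parent + width].

    ASSUMPTION: the subinterval is clipped to [0, 1] near the boundaries.
    """
    lo = max(0.0, parent - width)
    hi = min(1.0, parent + width)
    return rng.uniform(lo, hi)


def mutate_normal(parent, sigma, rng):
    """Normal scheme centred on the parental strategy.

    ASSUMPTION: draws outside [0, 1] are rejected and redrawn.
    """
    while True:
        x = rng.gauss(parent, sigma)
        if 0.0 <= x <= 1.0:
            return x


if __name__ == "__main__":
    rng = random.Random(0)
    print(mutate_global(0.4, rng), mutate_local(0.4, 0.05, rng),
          mutate_normal(0.4, 0.05, rng))
```

In the local scheme, `width` plays the role of i in the text: letting it grow towards 1 recovers the global scheme, so it gives a single knob for interpolating between weak and strong mutation.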

Qualitative analysis
In this section, we develop heuristic arguments to understand the logarithmic relationship between δ and the abundance of the most favored strategy. In order to do so, we refer to the work of Antal and collaborators [6], where they develop the linear approximation for x_k, the expected abundance of strategy k in the equilibrium, for the case with a finite number m of strategies. In their paper, the formula for the expected abundance of strategy k is written in terms of the number of strategies playing the game, the population size N, the mutation rate u, and ⟨Δx_k^sel⟩, the variation in x_k due to the effect of selection. As the expression for ⟨Δx_k^sel⟩ is fairly complicated, it seems reasonable to compute its Taylor series and use it to obtain an expression for ⟨x_k⟩. The expansion for ω_k is given in Eq. (12). Introducing this value of ω_k into the computation of ⟨Δx_k^sel⟩, and following the calculations done in Ref. [6], we arrive at an expression in which m is the number of strategies playing the game and a_ij is the ijth element of the payoff matrix A. Using this value in Eq. 8 yields an expansion in powers of δ. All the terms involving δ represent approximately the series of a logarithm, and if we plot the first powers versus δ and compare them with the numerical fit obtained earlier, y = 1 + ln(1 + 0.349δ) (see Fig. 6), we see that there is a good agreement.

Discussion and conclusions
Mutation and selection are opposite forces. Selection takes the fittest strategies, which increase their frequencies with time. Mutation, on the other hand, is always generating new strategies. As a result, the population stays out of equilibrium, as mutation is always restarting the selection process.
In our model, selection increases the frequencies of the most favored strategies, while mutation tends to make all abundances closer to the average. The effect of mutation is stronger than that of selection, as stated previously.
We have analyzed, through numerical simulations, the effect of selection and mutation on the evolution of a population playing a Hawk-Dove game with time. We have focused on which strategies are selected and what their abundance is when selection and mutation change. Selection and mutation are opposite forces, with selection tending to eliminate diversity and "select" the optimal strategies and mutation generating new diversity continuously.
We have explored different scenarios involving both high and low selection and mutation, and we have estimated the relationship between the coefficients of selection δ and mutation u and the abundances of the optimal strategies. Heuristic arguments are given to explain these nonlinear relationships. Besides, we have explored the (δ, u) space, showing how these two forces interact and affect the evolution of the population.

Figure 1: Plot showing abundance density x in a Hawk-Dove game with b = 2 and c = 5 as a function of the probability p of playing Hawk. The theoretical curve (lines) from Eq. (8) is compared to numerical simulation results (symbols). The parameters in the simulation are N = 10, u = 0.1 and δ = 0.1 (oblique crosses, solid line), 0.3 (crosses, dashed line), 0.5 (circles, dotted line) and 0.7 (asterisks, dotted and dashed line). The tip of the parabola, i.e., the most favored strategy, is at p = 0.395 for all curves, according to Eq. (7).

Figure 2: Plot showing abundance densities x of the most favored strategy for different δ values. The theoretical prediction (solid line) assumes a linear relationship between the value of δ and the abundance, as seen in Eq. (2). The dashed line follows the expression y = 1 + ln(1 + 0.349δ). Numerical simulation values are represented with filled squares. The dotted line represents a heuristic approximation to the curve, given by the following equation: y = 1 + δ − cδ² + c²δ³, with c = 0.7689, computed using n = 41. The other parameters are N = 10 and u = 0.1.

Figure 3: Plot showing abundance density x in a Hawk-Dove game with b = 2 and c = 5 as a function of the probability p of playing Hawk. The theoretical curve (lines) from Eq. (8) is compared to numerical simulation results (symbols). The parameters in the simulation are N = 10, δ = 0.5 and u = 0.01 (crosses, dotted line) or u = 0.1 (oblique crosses, solid line). The tip of the parabola, i.e., the most favored strategy, is at p = 0.395 (a) and p = 0.35 (b), according to Eq. (7).

Figure 4: Plot showing abundance densities x of the most favored strategy for different u values. The fitted curve (solid line) follows the expression y = 1.003 + 0.125 e^(−10.189u). Numerical simulation values are represented with filled squares. The other parameters are N = 10 and δ = 0.1.

Figure 5: Plot showing (a) abundance densities x for optimal strategies for different values of δ and u and (b) standard deviation in the data from numerical simulations for different values of δ and u. For all simulations N = 10.

Figure 6: Plot showing abundance densities x of the most favored strategy for different δ values. The curve obtained from the qualitative analysis (dashed line) has the expression y = 1 + δ − cδ² + c²δ³, with c = 0.7689, computed using m = 41. The fitted curve (solid line) follows the expression y = 1 + ln(1 + 0.349δ). The other parameters are N = 10 and u = 0.1.