Comptes Rendus

Biological modelling
Evolutionary games with variable payoffs
Comptes Rendus. Biologies, Volume 328 (2005) no. 4, pp. 403-412.

Abstract

Matrix games, defined by a set of strategies and a corresponding matrix of payoffs, are commonly used to model animal populations because they are both simple and generate meaningful results. It is generally assumed that payoffs are independent of time. However, the timing of contests in real populations may have a marked effect on the value of rewards. We consider matrix games where the payoffs are functions of time. Rules are found which hold in this more general situation, and the complexity of possible behaviour is underlined by demonstrating other conditions which do not hold and an illustrative game.


Metadata
DOI : 10.1016/j.crvi.2004.12.001
Keywords: Evolutionarily stable strategy, Replicator dynamic, Environmental change, Bimatrix games, Darwinian fitness
Mark Broom 1

1 Centre for Statistics and Stochastic Modelling, Department of Mathematics, University of Sussex, Brighton BN1 9RF, UK
Mark Broom. Evolutionary games with variable payoffs. Comptes Rendus. Biologies, Volume 328 (2005) no. 4, pp. 403-412. doi : 10.1016/j.crvi.2004.12.001. https://comptes-rendus.academie-sciences.fr/biologies/articles/10.1016/j.crvi.2004.12.001/

Original version of the full text

1 Introduction

We start by explaining some of the basic concepts of evolutionary game theory, introduced in the classic paper [1] (see also [2]), which will be of use in the remainder of this article. In particular we discuss matrix games between symmetric players and their equivalent for asymmetric players, bimatrix games. We also use the concept of the replicator dynamic to consider the evolution of strategies as a function of time.

1.1 Matrix games

The following idea is useful for modelling a population of animals which compete in pairwise conflicts for some resource, which could be food or mates, for example. It is assumed that all members of the population are indistinguishable and that each individual is equally likely to face any other individual. There is a finite number of pure strategies available to the players in a particular game. Let $U$ be the set of pure strategies, so that $U = \{1, \ldots, n\}$. Given the strategies played, the outcome is determined; if player 1 plays $i$ against player 2 playing $j$, then player 1 receives reward $a_{ij}$ (player 2 receives $a_{ji}$), representing an adjustment in Darwinian fitness. The value $a_{ij}$ can be thought of as an element of the $n \times n$ matrix $A$, the payoff matrix.

An animal need not play the same pure strategy every time; it can play a mixed strategy, i.e., play $i$ with probability $p_i$ for each $i = 1, \ldots, n$. Thus the strategy played by an animal is represented by a probability vector $\mathbf{p}$. The expected payoff to player 1 playing $\mathbf{p}$ against player 2 playing $\mathbf{q}$, written $E[\mathbf{p},\mathbf{q}]$, is

$$E[\mathbf{p},\mathbf{q}] = \sum_{i,j} a_{ij} p_i q_j = \mathbf{p}^T A \mathbf{q}$$
A strategy $\mathbf{p}$ is a Nash equilibrium if $\mathbf{q}^T A \mathbf{p} \le \mathbf{p}^T A \mathbf{p}$ for all alternative strategies $\mathbf{q}$ (so a strategy is a Nash equilibrium if it is a best reply to itself).

The support of $\mathbf{p}$ is defined as $S(\mathbf{p}) = \{i : p_i > 0\}$.

$\mathbf{p}$ is an internal strategy if $S(\mathbf{p}) = U$.

$\mathbf{p}$ is thus a Nash equilibrium if $(A\mathbf{p})_i = \lambda$ for $i \in S(\mathbf{p})$ and $(A\mathbf{p})_i \le \lambda$ for $i \notin S(\mathbf{p})$, for some constant $\lambda$, and is an internal Nash equilibrium if $S(\mathbf{p}) = U$ and $(A\mathbf{p})_i = \lambda$ for all $i$.

A strategy $\mathbf{p}$ is an Evolutionarily Stable Strategy (ESS) if

  • (i) $\mathbf{q}^T A \mathbf{p} \le \mathbf{p}^T A \mathbf{p}$ and
  • (ii) if $\mathbf{q}^T A \mathbf{p} = \mathbf{p}^T A \mathbf{p}$ then $\mathbf{q}^T A \mathbf{q} < \mathbf{p}^T A \mathbf{q}$

for all alternative strategies $\mathbf{q}$.

A matrix may possess a unique ESS, no ESSs or many ESSs. See [3,4] for a discussion of the possible complexity of the ESS structure of a matrix.
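Since the payoff $\mathbf{q}^T A \mathbf{p}$ is linear in $\mathbf{q}$, the Nash condition above only needs to be checked against the $n$ pure strategies. The following sketch (my own illustration in Python; the Hawk-Dove payoffs and helper names are assumptions, not from the paper) checks the Nash condition and the second ESS condition numerically.

```python
import numpy as np

def is_nash(A, p, tol=1e-9):
    """Nash condition: q^T A p <= p^T A p for all q, which by linearity
    reduces to (Ap)_i <= p^T A p for every pure strategy i."""
    return np.all(A @ p <= p @ A @ p + tol)

def resists_invader(A, p, q, tol=1e-9):
    """Second ESS condition against a given alternative strategy q:
    if q^T A p = p^T A p, then we need q^T A q < p^T A q."""
    if abs(q @ A @ p - p @ A @ p) < tol:
        return q @ A @ q < p @ A @ q - tol
    return q @ A @ p < p @ A @ p  # strict inequality already settles it

# Hawk-Dove game with resource V = 2 and cost C = 4: the mixed strategy
# p = (V/C, 1 - V/C) = (1/2, 1/2) is the unique ESS.
A = np.array([[-1.0, 2.0],
              [0.0, 1.0]])
p = np.array([0.5, 0.5])
print(is_nash(A, p))                                  # True
print(all(resists_invader(A, p, q)
          for q in (np.eye(2)[0], np.eye(2)[1])))     # True
```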

1.2 Bimatrix games

The assumptions underlying the bimatrix game model are the same as for the matrix game model, except that pairwise contests are fought between individuals in asymmetric positions, so that the individual designated player 1 has a different set of pure strategies $U_1$ to the set $U_2$ available to player 2. If player 1 plays its strategy $i$ against player 2 playing its strategy $j$, then player 1 receives reward $a_{ij}$ and player 2 receives reward $b_{ji}$. The payoffs combine to form the payoff matrices $A$ and $B$. In the same way individuals can play mixed strategies, so that if player 1 plays $\mathbf{p}$ and player 2 plays $\mathbf{q}$, the rewards to the players are $\mathbf{p}^T A \mathbf{q}$ and $\mathbf{q}^T B \mathbf{p}$, respectively.

The strategy pair $(\mathbf{p}_1, \mathbf{p}_2)$ is a Nash equilibrium pair if

$$\mathbf{p}_1^T A \mathbf{p}_2 \ge \mathbf{q}_1^T A \mathbf{p}_2 \quad\text{and}\quad \mathbf{p}_2^T B \mathbf{p}_1 \ge \mathbf{q}_2^T B \mathbf{p}_1$$

for any alternative strategies $\mathbf{q}_1$, $\mathbf{q}_2$.

The strategy pair $(\mathbf{p}_1, \mathbf{p}_2)$ is an ESS if it is a Nash equilibrium pair,

$$\text{and whenever } \mathbf{p}_1^T A \mathbf{p}_2 = \mathbf{q}_1^T A \mathbf{p}_2 \text{ then } \mathbf{p}_1^T A \mathbf{q}_2 > \mathbf{q}_1^T A \mathbf{q}_2$$
$$\text{and whenever } \mathbf{p}_2^T B \mathbf{p}_1 = \mathbf{q}_2^T B \mathbf{p}_1 \text{ then } \mathbf{p}_2^T B \mathbf{q}_1 > \mathbf{q}_2^T B \mathbf{q}_1$$

which is only possible if both these strategies are pure [5].

1.3 Replicator dynamics

Let us assume that individuals can only play pure strategies, and let the proportion of players of pure strategy $i$ at a particular time be $p_i$ ($i = 1, \ldots, n$), so that the average population strategy (the population state) is the vector $\mathbf{p} = (p_i)$, with the expected payoff (in terms of Darwinian fitness) of an $i$-player in such a mixture being $(A\mathbf{p})_i$ and the overall expected payoff in the population being $\sum_i p_i (A\mathbf{p})_i = \mathbf{p}^T A \mathbf{p}$. Then the standard (continuous) replicator dynamic for the matrix game with payoff matrix $A$ is defined by the differential equation

$$\frac{dp_i}{dt} = p_i\left[(A\mathbf{p})_i - \mathbf{p}^T A \mathbf{p}\right]$$
Thus the proportion of players which play the better strategies increases with time (what counts as a good strategy depends upon the composition of the population). A point in $n$-dimensional space, represented by the vector $\mathbf{p}$, is locally stable if it is an ESS (this is not necessarily true for the discrete dynamic [6]). $\mathbf{p}$ is called an Evolutionarily Stable State rather than Strategy, since no individual actually plays $\mathbf{p}$; it is rather the average of the strategies played. The replicator equation has been applied in very many situations (see [7,8]).

Assuming evolution under the replicator dynamic (or any other dynamic), the time average of the population state $\mathbf{v}$ (where $\mathbf{v}^T = (v_1, \ldots, v_n)$) at time $T$ is the vector $TA(\mathbf{v})$, whose $i$th element $TA(\mathbf{v})_i$ is defined by

$$TA(\mathbf{v})_i = \frac{1}{T}\int_0^T v_i \, dt$$

Note that:

  • (1) adding a constant to any column of the payoff matrix changes neither the Nash equilibria nor the trajectory taken by the population under the replicator dynamic (including its timing), so it makes no difference to the game at all.
  • (2) If the entire matrix is multiplied by a positive constant, then the Nash equilibria remain constant and the trajectory of the replicator dynamic is unaltered, but speed along the trajectory may be changed, and thus the time average of the population state may also be affected.
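As a concrete illustration of the dynamic and the time average $TA(\mathbf{v})$ defined above, the following Python sketch (my own construction, not code from the paper) integrates the replicator equation with a simple Euler scheme. For a rock-paper-scissors-style matrix the state cycles without converging, yet the time average still settles at the internal equilibrium $(1/3, 1/3, 1/3)$.

```python
import numpy as np

def replicator_time_average(A, p0, T=200.0, dt=0.001):
    """Euler integration of dp_i/dt = p_i[(Ap)_i - p^T A p],
    returning the final state and the time average TA(v)."""
    p = np.array(p0, dtype=float)
    avg = np.zeros_like(p)
    for _ in range(int(T / dt)):
        fitness = A @ p                    # (Ap)_i
        p = p + dt * p * (fitness - p @ fitness)
        p = np.clip(p, 1e-12, None)
        p /= p.sum()                       # guard against numerical drift
        avg += p * dt
    return p, avg / T

# Zero-sum rock-paper-scissors: the interior equilibrium is a centre,
# so v(t) orbits it forever while TA(v) -> (1/3, 1/3, 1/3).
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])
state, average = replicator_time_average(A, [0.5, 0.3, 0.2])
print(state, average)   # state still cycling; average near (1/3, 1/3, 1/3)
```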

1.4 Nonconstant payoffs

The majority of game models deal with a fixed payoff structure, so that if the strategies adopted by all the combatants are known, then the payoffs are given. This corresponds to the payoff matrix of constants, $A$. In the real world, however, the time at which contests occur can be crucial. As the breeding season develops there is a natural variation in the rewards available for any given contest [9,10]. Moreover, variations may occur from year to year due to environmental changes, and even from day to day due to unpredictable effects such as the weather [11].

We consider a matrix game where the payoffs vary with time. In particular, if player 1 plays $i$ against player 2 playing $j$ at time $t$, then player 1 receives payoff $a_{ij}(t)$. Clearly many of the concepts from the static games must now be reconsidered; for instance, $\mathbf{p}$ may be a Nash equilibrium of a particular constant payoff matrix (a snapshot in time), but what happens in the long term as the matrix changes? We consider how the population state changes under the replicator dynamic and ask what rules we can establish.

In general we assume that the entries in the payoff matrix are bounded, so that $a_1 < \max|a_{ij}(t)| < a_2$ for all $i, j, t$, with $0 < a_1 < a_2$. It is also usually assumed that $A$ is continuous and that indeed it cannot change too quickly, so that

$$\left|\frac{d}{dt} a_{ij}(t)\right| < a_3 \quad \forall i, j, t$$
The class of payoff matrices $\Lambda$ is defined as follows:
$$L = (l_{ij}) \in \Lambda \quad\text{if and only if}\quad l_{ij} = l_j \;\text{ for all } i, j$$
Thus $\Lambda$ is the set of matrices where all elements in column $j$ have the same value, for each column (such matrices yield the same reward to any strategy against each opposing strategy). We use the term $\Lambda(t)$ to indicate a matrix which varies in time but is always a member of $\Lambda$.

Finally, we assume that $a_4 < \max|a_{ij}(t) - a_{kj}(t)|$ for some $i, j, k$, with $a_4 > 0$, for all $t$. This ensures that the game does not become completely degenerate, by bounding it away from the set $\Lambda$.

2 Results

Firstly we introduce some general results, and their consequences, before moving on to consider a particular application of the theory.

2.1 General results

In this section the variable payoff matrix with elements $a_{ij}(t)$ is simply written as $A$, and the population state is the vector $\mathbf{v}$.

Theorem 1

That $A$ converges does not imply that $TA(\mathbf{v})$ converges.

This result follows simply from the fact that, even for constant $A$, not only does $\mathbf{v}$ not always converge, but neither does $TA(\mathbf{v})$ (see [8]). Complex population dynamics can thus occur even with constant payoffs. What can we say about the case where payoffs vary?

Theorem 2

If $\mathbf{v} \to \mathbf{v}^*$, where $\mathbf{v}(0)$ has no zero elements, then $A \to g(t)A^* + \Lambda(t)$ for some continuous function $g(t)$ and $A^*$, a matrix of constants, such that $\mathbf{v}^*$ is a Nash equilibrium of $A^*$.

So if the prevalent population state converges, then it must be the case that the payoffs converge either to a constant payoff matrix, or to one which varies in time in such a way that the ratio

$$\frac{a_{i_1 j_1}(t) - a_{i_2 j_2}(t)}{a_{i_3 j_3}(t) - a_{i_4 j_4}(t)}$$

is independent of time, for any combination of strategies $i_1, i_2, i_3, i_4, j_1, j_2, j_3, j_4$ for which $j_1 = j_2$ and $j_3 = j_4$. This is an extremely severe restriction, implying strong environmental stability over time.

Theorem 3

If

  • (a) $a_{ij}(t) = \alpha_{ij} + f_{ij}(t)$ with $\frac{1}{T}\int_0^T |f_{ij}(t) - f_{kj}(t)| \, dt \to 0 \;\; \forall i, j, k$, and
  • (b) any element $v_i(t)$ which approaches the boundary does so sufficiently slowly that $\frac{1}{T}\log(v_i(T)) \to 0$,

then $TA(\mathbf{v})$ converges to $\mathbf{v}^*$, which is a Nash equilibrium of $\aleph = (\alpha_{ij})$.

If the payoffs vary with time but only to a limited extent (in particular, the environmental impact produces no regular displacement from the mean, except possibly one of vanishingly small size), and if this variation is not strong enough to eliminate strategies at a sufficiently fast (exponential) rate, then the time average of the population state converges. We do not have a stronger result for a general constant payoff matrix; thus the assumption of constant payoffs can be relaxed a little without affecting the dominant long-term behaviour.
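To see Theorem 3 in action, here is a small simulation (my own construction; the particular matrices and decay rate are illustrative assumptions, not from the paper). The payoffs are $a_{ij}(t) = \alpha_{ij} + f_{ij}(t)$ with fluctuations whose time-averaged absolute size tends to 0, and the time average of the state approaches the Nash equilibrium $(1/2, 1/2)$ of $\aleph = (\alpha_{ij})$.

```python
import numpy as np

ALPHA = np.array([[-1.0, 2.0],
                  [0.0, 1.0]])   # constant part: unique mixed ESS (1/2, 1/2)

def A_of_t(t):
    # Decaying fluctuation: (1/T) int |f_ij - f_kj| dt -> 0, so condition (a) holds.
    f = np.sin(t) / (1.0 + 0.1 * t)
    return ALPHA + f * np.array([[1.0, 0.0],
                                 [0.0, -1.0]])

p = np.array([0.9, 0.1])
dt, T = 0.001, 400.0
avg = np.zeros_like(p)
for step in range(int(T / dt)):
    A = A_of_t(step * dt)
    fit = A @ p
    p = p + dt * p * (fit - p @ fit)
    p = np.clip(p, 1e-12, None)
    p /= p.sum()       # p stays interior, so condition (b) also holds
    avg += p * dt
print(avg / T)         # close to (1/2, 1/2), the Nash equilibrium of ALPHA
```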

Proposition 1

If $TA(A) \to \aleph$ and if $\mathbf{v}^*$ is an internal Nash equilibrium of $\aleph$, then the distance of $\mathbf{v}^*$ from $TA(\mathbf{v})$, the length of the vector $\mathbf{v}^* - TA(\mathbf{v})$, can converge to a value which is arbitrarily close to 1.

This result demonstrates that considering the long-term mean payoffs is insufficient to find the long-term average state; indeed, it is possible to find situations where considering the long-term mean payoffs gives the worst possible estimate of the mean state.

Proposition 2

That $TA(\mathbf{v})$ converges does not imply that we can find a bounded function $g(t)$, $0 < k_1 < g(t) < k_2$, and a matrix $\Lambda(t)$, defined as above, such that $TA(A')$ converges, where $A' = (A - \Lambda(t))/g(t)$.

(Note that we need this more elaborate statement, rather than just non-convergence of $TA(A)$, as that can be produced by the trivial case of $A = \Lambda(t) + K$ for a constant matrix $K$ and suitable $\Lambda(t)$.)

Even if the mean population state converges this does not imply convergence of the underlying payoffs (or the relative size of payoffs against the same pure strategy), and even time average convergence is not guaranteed. This is especially surprising given that convergence of the time average of v is not guaranteed even for constant payoffs. Thus observation of this behaviour, without convergence of v, could be the result of a wide range of underlying phenomena.

2.2 The two stage game

One application of the idea of variable payoffs is in the context of multiple stage games, even when the real natural payoffs are constant. Suppose that a population of animals compete in pairwise contests, where an animal's strategy is decided by a pair of ‘choices’. The first choice that an animal makes then decides the particular type of contest that the two animals are involved in, after which the individuals both pick a second choice, which decides the payoffs that each receive (for other examples of such contests see [12,13]). The two players make their choices at each stage simultaneously, and both know the result of the first stage before playing the second. For instance, the first ‘choice’ could be horn size, which is apparent to both players before any contest ensues. The second stage is thus a subgame within this contest; the available choices of which may be conditional upon the horn size of the participants (thus there is a payoff matrix for each pair of horn sizes).

Note that when the first-stage choices are different, this induces an asymmetry into the contest. We assume that if player 1 picks $i$ and player 2 picks $j$ then the reward to player 1 is decided by the payoff matrix $B_{ij}$. If $i = j$ then we have the standard matrix game with payoff matrix $B_{ii}$; otherwise we have a bimatrix game with payoff matrices $B_{ij}, B_{ji}$, respectively.

The payoff $a_{ij}$ is thus the expected reward for first-stage strategy $i$ against first-stage strategy $j$, which depends in turn upon the strategies prevalent in such contests, i.e., $a_{ij} = \mathbf{p}_{ij}^T B_{ij} \mathbf{p}_{ji}$, where $\mathbf{p}_{ij} = (p_{ij,k})_k$ is the population state of $i$-players when facing $j$-players in the second stage of the game (and so $\mathbf{p}_{ji}$ is the population state of their potential opponents in this contest). When considering the strategy of $i$ against that of $j$, $i \ne j$, $\mathbf{p}_{ij}$ takes the place of $\mathbf{p}_1$ and $\mathbf{p}_{ji}$ takes the place of $\mathbf{p}_2$ in the bimatrix games described earlier (for individuals of the same first-stage strategy, $\mathbf{p}_{ii}$ takes the place of $\mathbf{p}$ in a matrix game). It is assumed that the strategies involved in this subgame evolve in the same manner as the first-stage choices, and so $\mathbf{p}_{ij}$ varies according to the replicator equation

$$\frac{d}{dt} p_{ij,k} = p_{ij,k}\left((B_{ij}\mathbf{p}_{ji})_k - \mathbf{p}_{ij}^T B_{ij} \mathbf{p}_{ji}\right)$$

Theorem 4

If $\mathbf{p}_{ij} \to \mathbf{p}_{ij}^* \;\; \forall i, j$ (where $\mathbf{p}_{ii}^*$ is an ESS of $B_{ii}$ and $(\mathbf{p}_{ij}^*, \mathbf{p}_{ji}^*)$ is an ESS pair of $B_{ij}, B_{ji}$ for $i \ne j$), then $a_{ij}(t) = \mathbf{p}_{ij}^T B_{ij} \mathbf{p}_{ji}$ satisfies condition (a) of Theorem 3 with $\alpha_{ij} = \mathbf{p}_{ij}^{*T} B_{ij} \mathbf{p}_{ji}^*$.

Corollary

Suppose that

  • (i) $\mathbf{p}_{ii}^*$ is an ESS of $B_{ii}$, and $(\mathbf{p}_{ij}^*, \mathbf{p}_{ji}^*)$ is an ESS pair of $B_{ij}, B_{ji}$ for $i \ne j$, $\forall i, j$.
  • (ii) $\mathbf{p}_{ij} \to \mathbf{p}_{ij}^* \;\; \forall i, j$.
  • (iii) $a_{ij}(t) = \mathbf{p}_{ij}^T B_{ij} \mathbf{p}_{ji}$, with none of the first-stage strategies approaching 0 at an exponential rate (except any not featuring in the support of $\mathbf{v}^*$ below), so that condition (b) of Theorem 3 is satisfied.

Then $TA(\mathbf{v}) \to \mathbf{v}^*$ exists, and $\mathbf{v}^*$ is a Nash equilibrium of $\aleph = (\alpha_{ij})$, where $\alpha_{ij} = \mathbf{p}_{ij}^{*T} B_{ij} \mathbf{p}_{ji}^*$.

Suppose that in a two-stage game there are two strategies in the first stage, and then two strategies in each of the four possible second-stage situations, defined by the payoff matrices

$$B_{ij} = \begin{pmatrix} b_{ij}(11) & b_{ij}(12) \\ b_{ij}(21) & b_{ij}(22) \end{pmatrix}$$
The different stable solutions within each of the matrices are shown in Tables 1 and 2. The solutions for the matrices $B_{ii}$ ($i = 1$ or $2$) are given in Table 1.

Table 1

ESSs and payoffs for $B_{ii}$

  • $b_{ii}(11) > b_{ii}(21)$ and $b_{ii}(12) > b_{ii}(22)$: ESS (1); payoff $b_{ii}(11)$
  • $b_{ii}(11) < b_{ii}(21)$ and $b_{ii}(12) < b_{ii}(22)$: ESS (2); payoff $b_{ii}(22)$
  • $b_{ii}(11) > b_{ii}(21)$ and $b_{ii}(12) < b_{ii}(22)$: ESSs (1) and (2); payoff $b_{ii}(11)$ or $b_{ii}(22)$
  • $b_{ii}(11) < b_{ii}(21)$ and $b_{ii}(12) > b_{ii}(22)$: mixed ESS $\mathbf{p}$; payoff $\mathbf{p}^T B_{ii} \mathbf{p}$
Table 2

ESSs for $B_{12}$ and $B_{21}$

Rows give the conditions on $B_{12}$ and columns the conditions on $B_{21}$; each cell lists the ESS pair(s), written as (strategy of the 1-player, strategy of the 2-player). The column conditions are:

  • C1: $b_{21}(11) > b_{21}(21)$ and $b_{21}(12) > b_{21}(22)$
  • C2: $b_{21}(11) > b_{21}(21)$ and $b_{21}(12) < b_{21}(22)$
  • C3: $b_{21}(11) < b_{21}(21)$ and $b_{21}(12) > b_{21}(22)$
  • C4: $b_{21}(11) < b_{21}(21)$ and $b_{21}(12) < b_{21}(22)$

  • $b_{12}(11) > b_{12}(21)$, $b_{12}(12) > b_{12}(22)$: C1 (1,1); C2 (1,1); C3 (1,2); C4 (1,2)
  • $b_{12}(11) > b_{12}(21)$, $b_{12}(12) < b_{12}(22)$: C1 (1,1); C2 (1,1) or (2,2); C3 $(\mathbf{p},\mathbf{q})$; C4 (2,2)
  • $b_{12}(11) < b_{12}(21)$, $b_{12}(12) > b_{12}(22)$: C1 (2,1); C2 $(\mathbf{p},\mathbf{q})$; C3 (1,2) or (2,1); C4 (1,2)
  • $b_{12}(11) < b_{12}(21)$, $b_{12}(12) < b_{12}(22)$: C1 (2,1); C2 (2,2); C3 (2,1); C4 (2,2)

When one player plays 1 and the other plays 2, we have the bimatrix game defined by $B_{12}, B_{21}$. Table 2 gives the strategies in the ESSs.

When there are no pure ESSs and a single internal Nash equilibrium pair $(\mathbf{p}^*, \mathbf{q}^*)$ for the bimatrix game, it is shown in [14] that this pair is a centre and that the time averages of $\mathbf{p}(t)$ and $\mathbf{q}(t)$ are $\mathbf{p}^*$ and $\mathbf{q}^*$, respectively. The time averages of the payoffs are also shown to be $\mathbf{p}^{*T} B_{12} \mathbf{q}^*$ and $\mathbf{q}^{*T} B_{21} \mathbf{p}^*$, respectively. Note that in the cases where there are two pure ESS pairs, there is an internal equilibrium which is a saddle point, and all trajectories converge to one or other of the ESS pairs.
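A quick numerical check of this behaviour (my own sketch with matching-pennies payoffs; not an example from the paper): the bimatrix replicator dynamic cycles around the single internal Nash pair, while the time averages of $\mathbf{p}(t)$ and $\mathbf{q}(t)$ settle at the pair itself.

```python
import numpy as np

# Matching pennies: no pure ESS pair, single internal Nash pair (1/2, 1/2).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])    # payoffs to player 1
B = np.array([[-1.0, 1.0], [1.0, -1.0]])    # payoffs to player 2

p, q = np.array([0.8, 0.2]), np.array([0.3, 0.7])
dt, T = 1e-4, 100.0
p_avg, q_avg = np.zeros(2), np.zeros(2)
for _ in range(int(T / dt)):
    fp, fq = A @ q, B @ p                   # fitnesses against the opponent
    p = p + dt * p * (fp - p @ fp)
    q = q + dt * q * (fq - q @ fq)
    p /= p.sum(); q /= q.sum()
    p_avg += p * dt; q_avg += q * dt
print(p_avg / T, q_avg / T)   # both near (1/2, 1/2): closed orbits, convergent averages
```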

2.2.1 A numerical example

Suppose that the payoff values from the above example are

$$b_{11}(12) = b_{11}(21) = b_{12}(12) = b_{12}(21) = b_{21}(12) = b_{21}(21) = b_{22}(12) = b_{22}(21) = 1$$
$$b_{11}(11) = 2, \quad b_{11}(22) = 3$$
$$b_{12}(11) = b_{12}(22) = 4$$
$$b_{21}(11) = 5, \quad b_{21}(22) = 10$$
$$b_{22}(11) = 2, \quad b_{22}(22) = 3$$
This yields two possible solutions each for $B_{11}$, $B_{22}$ and the pair $B_{12}, B_{21}$ (the maximum number in each case). Thus there are 8 possible limiting matrices $A$, depending upon initial conditions. Each of these has an internal ESS. The possible matrices, each with the probability of choosing first-stage strategy 1 in brackets, are given below.
$$\begin{pmatrix} 2 & 4 \\ 5 & 2 \end{pmatrix} (2/5), \qquad \begin{pmatrix} 2 & 4 \\ 10 & 2 \end{pmatrix} (1/5)$$
$$\begin{pmatrix} 2 & 4 \\ 5 & 3 \end{pmatrix} (1/4), \qquad \begin{pmatrix} 2 & 4 \\ 10 & 3 \end{pmatrix} (1/9)$$
$$\begin{pmatrix} 3 & 4 \\ 5 & 2 \end{pmatrix} (1/2), \qquad \begin{pmatrix} 3 & 4 \\ 10 & 2 \end{pmatrix} (2/9)$$
$$\begin{pmatrix} 3 & 4 \\ 5 & 3 \end{pmatrix} (1/3), \qquad \begin{pmatrix} 3 & 4 \\ 10 & 3 \end{pmatrix} (1/8)$$
In effect there are eight different ESS solutions from a situation where each individual has four choices, two at each stage; the equivalent maximum number of ESSs for a matrix game with four strategies is four, so that the extra structure has made the game more complex. The reason for this is that individuals are allowed to make choices which depend in part on the choice of the opponent, so that effectively the individual has eight distinct choices.
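These probabilities are easy to cross-check (a sketch of mine, not the paper's code): for a $2 \times 2$ matrix game $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with an internal ESS, the equilibrium probability of playing strategy 1 is $(b-d)/((b-d)+(c-a))$, and the eight limiting matrices combine the diagonal payoffs $\{2,3\}$ from $B_{11}$ and $B_{22}$ with $a_{12} = 4$ and $a_{21} \in \{5, 10\}$.

```python
import numpy as np
from itertools import product

def internal_p1(A):
    """Probability of strategy 1 at the internal equilibrium of a
    2x2 matrix game [[a, b], [c, d]]: solves (Ap)_1 = (Ap)_2."""
    (a, b), (c, d) = A
    return (b - d) / ((b - d) + (c - a))

for a11, a21, a22 in product([2.0, 3.0], [5.0, 10.0], [2.0, 3.0]):
    A = np.array([[a11, 4.0], [a21, a22]])
    print(A.tolist(), internal_p1(A))
# Reproduces the bracketed values: 2/5, 1/4, 1/5, 1/9, 1/2, 1/3, 2/9, 1/8
```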

3 Discussion

Matrix games have been used to model a variety of animal behaviours. They are structurally simple and thus amenable to analysis, while at the same time providing plausible, if simplistic, explanations for certain behaviours. However, the assumption of constant payoffs, independent of time, is not always realistic. Payoffs change throughout the breeding season, and may also vary markedly from year to year. These variations can be critical to any analysis.

In this paper we have introduced the variable payoff matrix $A(t)$ to consider how time-dependent payoffs may affect strategies. Naturally the evolution of strategies can be more complex than under constant payoffs. In particular, taking a simple time average of the rewards, without regard to the particular times at which they are available, can lead not just to the wrong result, but to completely the opposite result to that obtained by considering the behaviour at each time separately. Similarly, if regular behaviour with a convergent time average occurs, this does not guarantee that the underlying payoffs have a convergent time average (or even that the relative sizes of payoffs against the same pure strategy do), so that simple observed behaviour may be concealing very complex variations in the underlying payoffs.

On the other hand, it is demonstrated that, provided the variability of the payoffs is severely restricted, a convergent time average of the population state exists as long as there is a level of persistence of the strategies involved in the time average (they do not tend to 0 exponentially). If the actual state converges with time, then the underlying payoffs must converge in a given manner. Thus we extend our knowledge of how certain populations must behave, as well as demonstrating the complexity of the situation by showing that certain plausible statements are false.

The idea of variable payoffs is finally applied to a particular example, that of a two-stage game, where the basic payoffs are actually constant, but the rewards for the first stage (and thus the whole game) depend upon the strategies used, and so are variable. It is shown that if the population states converge in all of the second-stage games, then the long-term time average of the state of the whole population converges, provided that the first-stage strategies do not tend to 0 at an exponential rate (indeed convergence of the state occurs in a wide variety of cases). A numerical example with two strategies at each stage (and so four different plays in total) is given where each subgame has two ESSs (or ESS pairs), and the population converges to one of eight possible final solutions, depending upon initial conditions. This number of possible solutions is larger than for any matrix game with four strategies, the extra complication resulting from each animal being able to use information about the opponent's strategy when choosing its own.

Appendix A

Proof of Theorem 2

$$\frac{dv_i}{dt} = v_i\left((A\mathbf{v})_i - \mathbf{v}^T A \mathbf{v}\right)$$

If $v_i \to v_i^*$ then $dv_i/dt \to 0$ (since $dA/dt$ is bounded); thus if $v_i^* > 0$ then $(A\mathbf{v})_i - \mathbf{v}^T A \mathbf{v} \to 0$, and hence $(A\mathbf{v})_i - (A\mathbf{v})_j \to 0$ for all $i, j$ such that $v_i^* > 0$, $v_j^* > 0$, i.e.,
$$\sum_k (a_{ik} - a_{jk}) v_k \to 0 \quad\Rightarrow\quad \sum_k (a_{ik} - a_{jk}) v_k^* \to 0$$
If $v_k^* = 0$ then $v_k(t) \to 0$, but due to the boundedness of $A$, $v_k(t) > 0$ for all $t$. Then
$$(A\mathbf{v})_k - \mathbf{v}^T A \mathbf{v} \to (A\mathbf{v})_k - (A\mathbf{v})_j, \quad j \in S(\mathbf{v}^*)$$
and so $(A\mathbf{v})_k - (A\mathbf{v})_j \le 0$ in the limit, since otherwise $v_k(t)$ would not converge to 0.

Thus the matrix $A$ becomes arbitrarily close to a matrix which has $\mathbf{v}^*$ as a Nash equilibrium (we represent the collection of all such matrices as $\mathcal{A}(\mathbf{v}^*)$). There are different such matrices which satisfy this, so that which of these $A$ approaches may change with time. The matrices of $\mathcal{A}(\mathbf{v}^*)$ are split into different families such that if $\mathbf{v}^*$ is a Nash equilibrium of $A'$ it is also a Nash equilibrium of $\mu A' + L$ (where $L \in \Lambda$ and $\mu$ is a nonnegative constant). These families only meet at the set $\Lambda$, so that given $\max|a_{ij}(t) - a_{kj}(t)| > a_4$ there is a given minimum distance between members of any two families. Since $dA/dt$ is bounded, $A$ cannot move between families without moving the population state $\mathbf{v}$ away from $\mathbf{v}^*$. Thus $A$ must converge to one such family, i.e.,

$$A \to g(t) A^* + \Lambda(t)$$

for some $A^*$ which has $\mathbf{v}^*$ as a Nash equilibrium, and some positive continuous function of $t$, $g(t)$. □

Proof of Theorem 3

$$\frac{dv_i}{dt} = v_i\left((A\mathbf{v})_i - \mathbf{v}^T A \mathbf{v}\right) \;\Rightarrow\; \log(v_i(T)) - \log(v_i(0)) = \int_0^T (A\mathbf{v})_i \, dt - \int_0^T \mathbf{v}^T A \mathbf{v} \, dt$$
$$\Rightarrow\; \log(v_i(T)) - \log(v_i(0)) - \log(v_j(T)) + \log(v_j(0)) = \int_0^T \left((A\mathbf{v})_i - (A\mathbf{v})_j\right) dt \;\Rightarrow\; \frac{1}{T}\int_0^T \left((A\mathbf{v})_i - (A\mathbf{v})_j\right) dt \to 0$$

provided that neither $v_i$ nor $v_j$ approaches zero so fast that $\frac{1}{T}\log(v_i(T)) \to 0$ or $\frac{1}{T}\log(v_j(T)) \to 0$ fails. We assume that none of the $v_i$s do (except any elements that are not involved in the support of $\mathbf{v}^*$).

$$\frac{1}{T}\int_0^T \left((A\mathbf{v})_i - (A\mathbf{v})_j\right) dt = \frac{1}{T}\int_0^T \Big(\sum_k a_{ik} v_k - \sum_k a_{jk} v_k\Big) dt = \frac{1}{T}\int_0^T \sum_k (a_{ik} - a_{jk}) v_k \, dt$$
$$= \frac{1}{T}\int_0^T \sum_k (\alpha_{ik} - \alpha_{jk}) v_k \, dt + \frac{1}{T}\int_0^T \sum_k (f_{ik} - f_{jk}) v_k \, dt = \sum_k (\alpha_{ik} - \alpha_{jk}) \frac{1}{T}\int_0^T v_k \, dt + \frac{1}{T}\int_0^T \sum_k (f_{ik} - f_{jk}) v_k \, dt$$

Thus $TA(\mathbf{v})$ converges to a Nash equilibrium of $\aleph$, i.e.,
$$\sum_k (\alpha_{ik} - \alpha_{jk}) \frac{1}{T}\int_0^T v_k \, dt \to 0$$
if and only if
$$\frac{1}{T}\int_0^T \sum_k (f_{ik} - f_{jk}) v_k \, dt \to 0$$
Now
$$\left|\frac{1}{T}\int_0^T \sum_k (f_{ik} - f_{jk}) v_k \, dt\right| \le \sum_k \frac{1}{T}\int_0^T |f_{ik} - f_{jk}| v_k \, dt \le \sum_k \frac{1}{T}\int_0^T |f_{ik} - f_{jk}| \, dt$$
Thus if
$$\frac{1}{T}\int_0^T |f_{ik} - f_{jk}| \, dt \to 0$$
then the time average of $\mathbf{v}$ converges as above. □

Proof of Proposition 1

Consider the payoff matrix

$$\begin{pmatrix} 1 & 2 \\ 2 & r(t) \end{pmatrix}$$
where $r(t)$ is a step function taking the value $2 - \delta$ for a proportion $\lambda$ of the time and $2 - k$ for a proportion $1 - \lambda$, where $\delta > 0$ is small and $k > 1$. We also assume that the length of the time spent in each state is long, so that for effectively all of the time spent in each step the population is at (or arbitrarily close to) the unique internal equilibrium for that level. To satisfy the assumption of continuity and bounded derivative of $a_{ij}$, the steps can of course be 'rounded' (e.g., if the end of the step $r(t) = 2 - \delta$ occurs at $t_1$, take $r(t_1) = 2 - \delta$, $r(t_1 + c) = 2 - k$ and $r(t) = 2 - \delta - (t - t_1)(k - \delta)/c$ for $t_1 < t < t_1 + c$) with no extra consequences.

The mean of $r(t)$ is $2 - \lambda\delta - (1 - \lambda)k$. The time average of the matrix thus gives the equilibrium value of the proportion playing strategy 1 as

$$v_1^* = \frac{k - k\lambda + \lambda\delta}{k + 1 - \lambda k + \lambda\delta}$$

The equilibrium value of $v_1$ when $r(t) = 2 - \delta$ is $\delta/(1 + \delta)$; similarly the equilibrium when $r(t) = 2 - k$ is $k/(1 + k)$. Thus the mean value of $v_1$ using the equilibria from the steps is
$$\bar{v}_1 = \lambda\,\frac{\delta}{1 + \delta} + (1 - \lambda)\,\frac{k}{1 + k} = \frac{k - k\lambda + k\delta + \lambda\delta}{(k + 1)(1 + \delta)}$$
Setting $\lambda = 1 - 1/\sqrt{k}$ gives
$$v_1^* = \frac{\sqrt{k} + \delta(1 - 1/\sqrt{k})}{1 + \sqrt{k} + \delta(1 - 1/\sqrt{k})} \to \frac{\sqrt{k}}{1 + \sqrt{k}} \;\;(\delta \to 0) \to 1 \;\;(k \to \infty)$$
$$\bar{v}_1 = \frac{\sqrt{k} + \delta(k + 1 - 1/\sqrt{k})}{(k + 1)(\delta + 1)} \to \frac{\sqrt{k}}{1 + k} \;\;(\delta \to 0) \to 0 \;\;(k \to \infty)$$
Thus the distance between $v_1^*$ and $\bar{v}_1$ can be made arbitrarily close to 1. □
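A numerical check of these limits (my sketch, written under the reconstruction $\lambda = 1 - 1/\sqrt{k}$ above): $v_1^*$ from the time-averaged matrix approaches 1 while the average $\bar{v}_1$ of the step equilibria approaches 0, so the distance between them approaches 1.

```python
import numpy as np

def v1_star(k, delta):
    """Equilibrium of the time-averaged matrix [[1, 2], [2, r_bar]]."""
    lam = 1.0 - 1.0 / np.sqrt(k)
    r_bar = lam * (2.0 - delta) + (1.0 - lam) * (2.0 - k)
    return (2.0 - r_bar) / (3.0 - r_bar)

def v1_bar(k, delta):
    """Time average of the per-step equilibria delta/(1+delta) and k/(1+k)."""
    lam = 1.0 - 1.0 / np.sqrt(k)
    return lam * delta / (1.0 + delta) + (1.0 - lam) * k / (1.0 + k)

for k in [10.0, 1e3, 1e6]:
    print(k, v1_star(k, 1e-9), v1_bar(k, 1e-9))
# v1_star -> 1 and v1_bar -> 0 as k grows, as Proposition 1 claims.
```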

Proof of Proposition 2

Suppose that the payoff matrix alternates between

$$\begin{pmatrix} 2 & 4 & 1 \\ 1 & 2 & 4 \\ 4 & 1 & 2 \end{pmatrix}$$

occurring in the intervals $(0,T), (3T,7T), \ldots, ((2^{2m}-1)T, (2^{2m+1}-1)T), \ldots$ and

$$\begin{pmatrix} 2 & 1 & 4 \\ 4 & 2 & 1 \\ 1 & 4 & 2 \end{pmatrix}$$

occurring at all other times. It is easy to see that the time average of $a_{ij}$ does not converge for all values of $i, j$, but that $\mathbf{v}^* = (1/3, 1/3, 1/3)$ is the internal ESS for each step, and so globally stable under the replicator dynamic [8], so that the time average of $\mathbf{v}$ and $\mathbf{v}$ itself both converge to $\mathbf{v}^*$ (this is clearly still true if we add any element of $\Lambda$ or multiply by any function $g(t)$ as defined in the proposition). Note that rounding the step function makes no difference to the convergence of $TA(\mathbf{v})$. □
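The construction can also be checked numerically (my sketch; the step length $T = 1$ and the finite horizon are illustrative assumptions): the state hugs $(1/3, 1/3, 1/3)$, so $TA(\mathbf{v})$ converges, while the time average of an entry such as $a_{12}(t)$ depends on where in the doubling pattern the horizon falls (about 2 at $t = 1023$ but about 3 at $t = 2047$), and so does not converge.

```python
import numpy as np

A1 = np.array([[2.0, 4.0, 1.0], [1.0, 2.0, 4.0], [4.0, 1.0, 2.0]])
A2 = np.array([[2.0, 1.0, 4.0], [4.0, 2.0, 1.0], [1.0, 4.0, 2.0]])

def current_matrix(t, T=1.0):
    # Block m covers [(2^m - 1)T, (2^(m+1) - 1)T); even blocks use A1.
    m = int(np.floor(np.log2(t / T + 1.0)))
    return A1 if m % 2 == 0 else A2

v = np.array([0.5, 0.3, 0.2])
dt, horizon = 0.001, 1023.0          # rerun with 2047.0 to see TA(a_12) move
v_sum, a12_sum = np.zeros(3), 0.0
for step in range(int(horizon / dt)):
    A = current_matrix(step * dt)
    fit = A @ v
    v = v + dt * v * (fit - v @ fit)
    v = np.clip(v, 1e-12, None)
    v /= v.sum()
    v_sum += v * dt
    a12_sum += A[0, 1] * dt
print(v_sum / horizon, a12_sum / horizon)  # TA(v) near (1/3, 1/3, 1/3)
```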

Proof of Theorem 4

There are two cases to consider: $i \ne j$ and $i = j$.

(a) $i \ne j$. The only possible solution here is a pure pair, $(k_1, k_2)$ say. Supposing that the payoff matrices to the two players are $B$ and $C$, respectively, then for this pair to be an ESS we require

$$b_{k_1 k_2} > b_{i k_2} \;\; \forall i \ne k_1, \qquad c_{k_2 k_1} > c_{i k_1} \;\; \forall i \ne k_2$$
Let $b = b_{k_1 k_2} - \max_{i \ne k_1}(b_{i k_2})$ and $c = c_{k_2 k_1} - \max_{i \ne k_2}(c_{i k_1})$. Without loss of generality we can assume that $b_{ij} > 0$, $c_{ij} > 0$ for all $i, j$. We shall denote the mean population strategies $\mathbf{p}_{ij}, \mathbf{p}_{ji}$ by $\mathbf{p}_{ij}^T = (x_1, \ldots, x_n)$, $\mathbf{p}_{ji}^T = (y_1, \ldots, y_m)$.

Now suppose that $x_{k_1} > 1 - \varepsilon_1$ and $y_{k_2} > 1 - \varepsilon_2$ at time $T_1$ (this must be true for some $T_1$ since convergence occurs).

$$\begin{aligned}
\frac{dx_{k_1}}{dt} &= x_{k_1}\left((B\mathbf{y})_{k_1} - \mathbf{x}^T B \mathbf{y}\right) = x_{k_1}\Big(\sum_j b_{k_1 j} y_j - \sum_{i,j} x_i y_j b_{ij}\Big) \\
&= x_{k_1}\Big(b_{k_1 k_2} y_{k_2} + \sum_{j \ne k_2} b_{k_1 j} y_j - x_{k_1} y_{k_2} b_{k_1 k_2} - x_{k_1} \sum_{j \ne k_2} y_j b_{k_1 j} - \sum_{i \ne k_1} x_i y_{k_2} b_{i k_2} - \sum_{i \ne k_1,\, j \ne k_2} x_i y_j b_{ij}\Big) \\
&= x_{k_1}(1 - x_{k_1})\Big[y_{k_2} b_{k_1 k_2} + \sum_{j \ne k_2} b_{k_1 j} y_j\Big] - x_{k_1} y_{k_2} \sum_{i \ne k_1} x_i b_{i k_2} - x_{k_1} \sum_{i \ne k_1,\, j \ne k_2} x_i y_j b_{ij} \\
&\ge x_{k_1}(1 - x_{k_1}) y_{k_2} b - x_{k_1} M (1 - x_{k_1})(1 - y_{k_2}) = x_{k_1}(1 - x_{k_1})\big[y_{k_2} b - M(1 - y_{k_2})\big] > (1 - x_{k_1})\,b/2
\end{aligned}$$
where $M = \max_{i,j} b_{ij}$ (the last inequality holding for $x_{k_1}$, $y_{k_2}$ sufficiently close to 1).

If $v = 1 - x_{k_1}$ then

$$\frac{dv}{dt} < -\frac{b}{2}\,v \;\Rightarrow\; \log v(t) - \log v(T_1) < -\frac{b}{2}(t - T_1) \;\Rightarrow\; v(t) < B\mathrm{e}^{-bt/2}$$
for some constant $B$. Similarly, setting $w = 1 - y_{k_2}$, we can show that
$$w(t) < C\mathrm{e}^{-ct/2}$$

Thus

$$(1 - B\mathrm{e}^{-bt/2})(1 - C\mathrm{e}^{-ct/2})\,b_{k_1 k_2} < \mathbf{x}^T B \mathbf{y} < (1 - B\mathrm{e}^{-bt/2})(1 - C\mathrm{e}^{-ct/2})\,b_{k_1 k_2} + \left\{1 - (1 - B\mathrm{e}^{-bt/2})(1 - C\mathrm{e}^{-ct/2})\right\} M$$
$$\Rightarrow\; \left((1 - B\mathrm{e}^{-bt/2})(1 - C\mathrm{e}^{-ct/2}) - 1\right) b_{k_1 k_2} < f_{ij}(t) = \mathbf{x}^T B \mathbf{y} - b_{k_1 k_2} < \left\{1 - (1 - B\mathrm{e}^{-bt/2})(1 - C\mathrm{e}^{-ct/2})\right\} M$$
$$\Rightarrow\; \frac{1}{T}\int_0^T |f_{ij}(t)| \, dt < \frac{1}{T}\int_0^T M\left|B\mathrm{e}^{-bt/2} + C\mathrm{e}^{-ct/2} - BC\mathrm{e}^{-(b+c)t/2}\right| dt \to 0$$
as $T \to \infty$, i.e., all off-diagonal elements satisfy the conditions of Theorem 3.

(b) $i = j$. Suppose that $\mathbf{p}$ converges to $\mathbf{p}^*$ in the matrix game defined by the payoff matrix $B$. Suppose further, without loss of generality, that $\mathbf{p}^*$ is an internal ESS. Define $P$ by

$$P = \prod_i p_i^{p_i^*} \qquad \Big(P^* = \prod_i (p_i^*)^{p_i^*}\Big)$$
Thus
$$\frac{1}{P}\frac{dP}{dt} = \frac{d}{dt}\log(P) = \frac{d}{dt}\Big(\sum_i p_i^* \log p_i\Big) = \sum_i \frac{p_i^*}{p_i}\frac{dp_i}{dt} = \sum_i p_i^*\left((B\mathbf{p})_i - \mathbf{p}^T B \mathbf{p}\right)$$
Let $p_i = p_i^* + \varepsilon_i$, so that for sufficiently large $t$, $\varepsilon_i$ will be as small as we like for all $i$. Thus
$$P = \prod_i (p_i^*)^{p_i^*}\Big(1 + \frac{\varepsilon_i}{p_i^*}\Big)^{p_i^*} \Rightarrow \log P = \log(P^*) + \sum_i p_i^* \log(1 + \varepsilon_i/p_i^*) = \log P^* + \sum_i p_i^*\Big(\frac{\varepsilon_i}{p_i^*} - \frac{\varepsilon_i^2}{2 p_i^{*2}} + \cdots\Big) \approx \log P^* - \sum_i \frac{\varepsilon_i^2}{2 p_i^*} \quad \Big(\text{since } \sum_i \varepsilon_i = 0\Big)$$
Setting $\boldsymbol{\varepsilon}^T = (\varepsilon_1, \ldots, \varepsilon_n)$,
$$\frac{d}{dt}\log P = \sum_i p_i^*\left(\big(B(\mathbf{p}^* + \boldsymbol{\varepsilon})\big)_i - (\mathbf{p}^* + \boldsymbol{\varepsilon})^T B (\mathbf{p}^* + \boldsymbol{\varepsilon})\right) = \mathbf{p}^{*T} B \mathbf{p}^* + \mathbf{p}^{*T} B \boldsymbol{\varepsilon} - \mathbf{p}^{*T} B \mathbf{p}^* - \mathbf{p}^{*T} B \boldsymbol{\varepsilon} - \boldsymbol{\varepsilon}^T B \mathbf{p}^* - \boldsymbol{\varepsilon}^T B \boldsymbol{\varepsilon} = -\boldsymbol{\varepsilon}^T B \boldsymbol{\varepsilon} = -\sum_{i,j} b_{ij} \varepsilon_i \varepsilon_j$$
(since $(B\mathbf{p}^*)_i$ is constant in $i$ and $\sum_i \varepsilon_i = 0$, so that $\boldsymbol{\varepsilon}^T B \mathbf{p}^* = 0$).

$\mathbf{p}^*$ being an internal ESS of $B$ implies that it satisfies the negative definiteness condition of [15], so that $\boldsymbol{\varepsilon}^T B \boldsymbol{\varepsilon} \le -\lambda \sum_i \varepsilon_i^2$ for some positive $\lambda$, for any such $\boldsymbol{\varepsilon}$.

Thus we have

$$\frac{\log(P^*) - \log(P)}{\frac{d}{dt}\log(P)} \approx \frac{\sum_i \varepsilon_i^2 / 2 p_i^*}{-\boldsymbol{\varepsilon}^T B \boldsymbol{\varepsilon}} < \frac{\big(\sum_i \varepsilon_i^2\big) / 2\min_i(p_i^*)}{\lambda \sum_i \varepsilon_i^2} = \frac{1}{2\lambda \min_i p_i^*} = g$$
Thus
$$\log(P^*) - \log(P) < g\Big(\frac{d}{dt}\log(P)\Big) \;\Rightarrow\; \frac{d}{dt}\Big(\log(P)\,\mathrm{e}^{t/g}\Big) > \mathrm{e}^{t/g}\,\frac{\log(P^*)}{g} \;\Rightarrow\; \log(P^*) - \log(P) \le k\mathrm{e}^{-t/g}$$
for some positive constant $k$. Thus
$$\sum_i \frac{\varepsilon_i^2}{2 p_i^*} \le k\mathrm{e}^{-t/g} \;\Rightarrow\; \sum_i \varepsilon_i^2 \le 2k\mathrm{e}^{-t/g}$$
Now
$$a_{ij}(t) = \mathbf{p}^T B \mathbf{p} = \mathbf{p}^{*T} B \mathbf{p}^* + \sum_{u,w} \varepsilon_u b_{uw} p_w^* + \sum_{u,w} p_u^* b_{uw} \varepsilon_w$$
so that
$$f_{ij}(t) = \sum_{u,w} \varepsilon_u b_{uw} p_w^* + \sum_{u,w} p_u^* b_{uw} \varepsilon_w \;\Rightarrow\; |f_{ij}(t)| \le 2M \sum_u |\varepsilon_u| \le nM\sqrt{8k}\,\mathrm{e}^{-t/2g}$$
where $n$ is the larger dimension of $B$. Thus
$$\int_{T_1}^{\infty} |f_{ij}(t)| \, dt$$
is finite and so
$$\frac{1}{T}\int_0^T |f_{ij}(t)| \, dt \to 0$$
thus satisfying the conditions of Theorem 3. □


Bibliographie

[1] J. Maynard Smith; G.R. Price The logic of animal conflict, Nature, Volume 246 (1973), pp. 15-18

[2] J. Maynard Smith Evolution and the Theory of Games, Cambridge University Press, 1982

[3] M. Broom Bounds on the number of ESSs of a matrix game, Math. Biosci., Volume 167 (2000), pp. 163-175

[4] M. Broom; C. Cannings; G.T. Vickers On the number of local maxima of a constrained quadratic form, Proc. R. Soc. Lond. A, Volume 443 (1993), pp. 573-584

[5] R. Selten A note on the evolutionarily stable strategies in asymmetric animal conflicts, J. Theor. Biol., Volume 84 (1980), pp. 93-101

[6] E.C. Zeeman Population dynamics from game theory (A. Nitecki; C. Robinson, eds.), Global Theory of Dynamical Systems, Springer, New York, 1980

[7] J. Hofbauer; K. Sigmund The Theory of Evolution and Dynamical Systems, Cambridge University Press, 1988

[8] J. Hofbauer; K. Sigmund Evolutionary Games and Population Dynamics, Cambridge University Press, 1998

[9] J.M. McNamara; T. Szekely; J.N. Webb; A.I. Houston A dynamic game-theoretic model of parental care, J. Theor. Biol., Volume 205 (2000), pp. 605-623

[10] L.W. Oring Avian polyandry (R.F. Johnston, ed.), Current Ornithology, vol. 3, Plenum Press, New York, 1986, pp. 309-351

[11] A.I. Houston; J.M. McNamara Models of Adaptive Behaviour, Cambridge University Press, 1999

[12] C. Cannings; J.C. Whittaker A two-trial two-strategy game, J. Theor. Biol., Volume 149 (1991), pp. 281-286

[13] R. Cressman Evolutionary Dynamics and Extensive Form Games, Massachusetts Institute of Technology Press, 2003

[14] P. Schuster; K. Sigmund; J. Hofbauer; R. Wolff Self-regulation of behaviour in animal societies, Biol. Cybern., Volume 40 (1981), pp. 9-15

[15] J. Haigh Game theory and evolution, Adv. Appl. Prob., Volume 7 (1975), pp. 8-11

