Cooperation

We start by introducing a basic distinction:

· An agent who tries to maximize his expected payoff with no consideration at all for those of others is self-regarding.

· An agent who cares not only about his own payoff but also takes that of others into consideration is other-regarding.

A self-regarding individual will do whatever is necessary to maximize his own payoff, no matter the consequences for others. For example, he will free ride when possible and generally shirk when advantageous to himself; we all know the type. An other-regarding individual will try to maximize his own payoff but with an eye to the payoffs of others. For example, he may not shirk or free ride, and may punish those who do even if this has a cost to himself; we all know this type as well. (Note that someone who punishes defectors, free riders, or shirkers at a cost is other-regarding, as is someone who helps them at a cost. Consequently, being moral is not coextensive with being other-regarding: someone who helps everybody in sight, and therefore helps someone else commit a crime, may be other-regarding.)

There are different types of cooperation in relation to the payoffs of the individuals involved in it. If cooperation is the expression of a genotype subject to natural selection, the emergence and persistence of some types of cooperation are unproblematic. For example, if by cooperating X increases its own fitness more than that of others, natural selection will favor cooperation.

However, the emergence and persistence of other types of cooperation are difficult to explain. Altruistic cooperation, altruism in brief, is a type of other-regarding behavior that occurs when X increases the average fitness of its group but would increase its own more if it did not cooperate. In other words, by cooperating donor X

· confers a total benefit b on the other members of its group of n, so that every other member gets a benefit of b/(n-1);
· in so doing pays a cost c itself;
· where b > c.

The situation can be described by the Prisoners’ Dilemma. Imagine two players, Joe and Jim. If Joe is other-regarding and Jim is not, Jim gets a benefit b while Joe gets nothing and pays a cost c, so that his payoff is -c (the negative sign indicates that c is a cost). The same applies to Jim if the situation is reversed. If both are self-regarding, nobody gets any benefit or pays any cost. If both are other-regarding, then each gets b and pays c. The following matrix represents the interaction.

		Jim
		O	S
Joe	O	b-c; b-c	-c; b
	S	b; -c	0; 0

The matrix represents what happens when two players, a row player (Joe) and a column player (Jim), interact. In the rows and columns of the matrix, S stands for being self-regarding and O for being other-regarding, and in the intersection squares the payoffs on the left of the semicolon are Joe’s and the ones on the right of the semicolon Jim’s. For example, if both Joe and Jim are self-regarding then both get payoff 0. If Joe is self-regarding and Jim other-regarding, then Joe’s payoff is b (he gets the benefit of Jim’s altruism without paying any cost) and Jim’s is -c. Note that S dominates O in that, no matter what Jim does, it is always advantageous for Joe to be self-regarding, as b > b-c and 0 > -c. In other words, if payoffs are ultimately proportional to Darwinian fitness (the number of viable offspring), in the long run free riding beats cooperating, and therefore most evolutionary dynamics will favor shirking (free riding, not cooperating). In short, free riders will have more babies and eventually altruists will disappear.
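The dominance argument can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function names are invented for illustration, and the values b = 3 and c = 1 are assumed only because they satisfy b > c > 0.

```python
def donation_game(b, c):
    """Payoff pairs (Joe, Jim) for the matrix in the text.

    'O' = other-regarding, 'S' = self-regarding.
    """
    return {
        ("O", "O"): (b - c, b - c),
        ("O", "S"): (-c, b),
        ("S", "O"): (b, -c),
        ("S", "S"): (0, 0),
    }


def s_strictly_dominates_o(b, c):
    """True if S gives Joe a strictly higher payoff than O whatever Jim does."""
    payoffs = donation_game(b, c)
    return all(
        payoffs[("S", jim)][0] > payoffs[("O", jim)][0] for jim in ("O", "S")
    )


# Illustrative values only (assumed, not taken from the text).
print(s_strictly_dominates_o(3, 1))
```

With any c > 0 the check succeeds, which is the sense in which free riding is always individually advantageous, even though mutual cooperation (b-c each) beats mutual defection (0 each).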

However, altruism is widespread, from cells to complex organisms, to societal structures. But why would natural selection favor cooperation (altruism) over defection (self-interestedness) if the payoffs for the former are smaller than those for the latter? The answer, of course, is that under certain circumstances the ultimate payoff for cooperation is in fact greater than that for defection. In what follows we look at some possible mechanisms for human cooperation.

Before we proceed, we must distinguish two types of mechanism, a proximate and a remote one. The proximate mechanism for cooperation is to be found in the human brain: in some situations, certain parts of the brain are activated, leading or contributing to certain outcomes. For example, brain scans have shown that when most humans punish transgressors, the areas of the brain responsible for the pleasurable sensation associated with the performance of a satisfactory action are activated; in short, most of us enjoy punishing transgressors. The ultimate proximate explanation of cooperation would be in terms of the biochemistry of the brain and associated beliefs and emotions. The remote mechanism for cooperation is given by our evolutionary history, which made the tendency to punish transgressors an advantageous trait. Note that the two mechanisms are logically independent, although empirically related: the same proximate mechanism is compatible with different evolutionary histories or even with divine design, and the same evolutionary history is compatible with different proximate mechanisms, as different proteins can perform the same biological function.

In what follows we shall only consider remote mechanisms, that is, a remote explanation of human cooperation, more specifically the fact that most humans are prosocial, meaning that they engage in behavior that benefits their group.

Human cooperation is obviously linked to human morality, and here one must be careful. Morality can be thought of as involving two different but related aspects:

(1) Prosocial behavior, namely, behavior that benefits the group.
(2) Thinking of behavior in moral terms and passing moral judgments that are motivating or at least reinforce motivation.

Note that (1) does not entail (2): for example, bees are very prosocial, but they don't engage in moral thinking. Whether some apes engage in moral thinking is a complex issue. What is very likely is that other members of the genus Homo (e.g., Neanderthals or Homo floresiensis) did.

How did prosocial behavior come about?

A reasonable starting point is to consider what our basic psychological features are when it comes to interacting with others. Behavioral game theory tries to determine how people actually behave in strategic situations by using experimental settings. It turns out that when modeling market processes (e.g., supply and demand situations) with clearly defined contracts the assumption that agents are self-regarding by and large leads to correct predictions. However, in social dilemmas (all gain if most cooperate while each has a personal incentive to defect), or in situations involving fairness issues, modeling with self-regarding players is not successful. Instead, experimental evidence shows that most players

· value character virtues (e.g., honesty, promise-keeping, fairness);

· engage in strong reciprocity, that is, in

    · altruistic cooperation (initially cooperating even at one’s net cost but eventually defecting if others defect first)
    · altruistic punishment (punishing those who do not display character virtues even at one’s net cost as long as such costs are not too high);

· display some inequality aversion even at a net cost (as long as it isn’t too high), especially if they are victims of inequality.

To understand the evidence, we need to consider some games.

The Ultimatum game: under anonymity, A and B are shown, say, $10. A is told to offer any amount out of the $10 to B. B may accept or refuse. If B accepts, then the money is distributed accordingly. If B refuses, then nobody gets anything. This is a one-shot game, with the assurance that A and B will not meet again.
The Dictator Game: this is like the Ultimatum game with the proviso that B cannot refuse, so that A may keep all the money, if he so chooses.
The Public Goods Game: Groups of two or more subjects are formed. Each member is given a private account P with points (say 10), redeemable at the end for real money. The game has several rounds, each of which consists in the following. Each player can place points in a common account C or keep them in his private account. The players are told that the experimenter will give each player 40% of the points in C. So, with 4 players, if everyone puts all 10 points in C, each will end up with 16 points, with a net gain of 6. However, if Joe puts nothing in C while the others put in 10 points each, then Joe will end up with 22 points, with a net gain of 12. Hence, if Joe is self-regarding, he’ll contribute nothing. We can see this by using a matrix describing the situation, where:

· in the rows C stands for Joe cooperating (putting his points in the common pot) and D for Joe defecting (not contributing to the common pot).

· in the columns C stands for the other 3 cooperating and D for them defecting.

· the numbers in the boxes are Joe’s payoffs.

For example, if both Joe and the others cooperate, Joe ends up with a net gain of 6, which is why the box at the intersection of C and C contains 6.

	C	D
C	6	-6
D	12	0

One can easily see that Joe’s best self-regarding policy is to defect every time: if the others cooperate, by defecting he’ll get more than if he cooperates, and if the others defect he would be a sucker if he cooperated. This is a version of the Prisoners’ Dilemma. Hence, if everybody is self-regarding, nobody will cooperate.
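The entries of Joe's matrix follow directly from the rules of the game. Here is a small sketch that recomputes them for one round, using the figures given in the text (10 points each, 4 players, a 40% return on the common account); the function and parameter names are invented for illustration.

```python
def net_gain(own, others, endowment=10, return_percent=40):
    """Net gain for one player in a single round of the public goods game.

    own:    points this player puts in the common account C
    others: list of points each other player puts in C
    """
    common = own + sum(others)
    payout = common * return_percent / 100  # what the experimenter pays each player
    final = (endowment - own) + payout
    return final - endowment


# Joe's payoff matrix: first key = Joe's move, second key = what the other 3 do.
matrix = {
    (joe, rest): net_gain(10 if joe == "C" else 0, [10 if rest == "C" else 0] * 3)
    for joe in "CD"
    for rest in "CD"
}
print(matrix)
```

Whatever the others do, Joe's net gain in the D row exceeds his gain in the C row, which is the defect-every-time conclusion drawn in the text.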

Here is what the experimental evidence shows.

· One-shot anonymous ultimatum games provide evidence for altruistic punishment. Although what’s considered fair varies from culture to culture, responders punish proposers who have made an unfair offer by angrily rejecting it. This holds even for very high stakes, comparable to a month’s salary. It turns out that in industrialized societies the mean offer hovers around 44% of the amount of money, and in small-scale societies between 30% and 50%. Significantly, people accept any offer if it is put forth by a non-person, e.g., a computer program or some random device.

· Third parties witnessing the dictator game engage in altruistic punishment if the proposer is judged unfair, as long as the cost is not too high.

· In public goods games, most (but not all!) players cooperate initially but end up defecting when others fail to cooperate and free riders cannot be punished (altruistic cooperation); moreover, if given the chance to punish defectors successfully, they do so (altruistic punishment), ultimately achieving very high levels of cooperation.

Finally, anthropological evidence from present hunter-gatherer groups strongly suggests that early Homo sapiens groups were based on rather strict equality (no big chief, strict monogamy, swift punishment of defectors).

Most human beings, then, have very developed prosocial tendencies, so much so that most of us tend to engage in behavior favoring the group even at a cost to ourselves. In a way this is not too surprising, as some such tendencies are also present, but to a significantly lesser degree, in other primates which also display empathy, reciprocity, a sense of fairness, and a tendency to harmonize relationships. Whether such displays are associated with the same emotions we feel is a controversial issue; however, evolutionary parsimony, given the close genealogical links between all primates, suggests homology rather than analogy, and therefore similar emotions.

The fact that prosocial behavior is not unique to us does not explain why we exhibit it. To look for a remote explanation, we must turn to the theory of evolution and to its most abstract version, evolutionary game theory. However, before we look at that, we need to get some idea of our evolutionary lineage to have some sense of the time involved.

Human Evolution

The human lineage is only partially worked out, with many points remaining unclear. For example, we don’t know whether A. afarensis is a true ancestor or simply related to our true ancestor, or how Homo floresiensis, still alive about 10,000 years ago, is related to us. In fact, many species belonging to our genus Homo existed. Still, here is what is likely to be the basic story:

6 million years ago: split from the ancestral lines leading to gorillas and chimps.

4.4 million years ago: Australopithecus anamensis

4 million years ago: Australopithecus afarensis (probably not an ancestor of ours)

About 3 or 4 feet tall, bipedal (though not as proficient as we are), with an ape-sized brain of about 500 cc (Lucy).

3 million years ago: Australopithecus africanus

2 million years ago: Homo habilis, an early sophisticated stone tool maker. Homo is our genus.

1.5 million years ago: split between our line and Homo erectus

1 million to 500,000 years ago: split between Homo neanderthalensis and the lineage leading to Homo sapiens (that’s us). The earliest definite anatomically modern human fossils are from Africa, about 100,000 years old.

More evidence is constantly appearing, e.g., Homo floresiensis. What is certain is that, whatever the vicissitudes of the genus Homo, we are its only non-extinct species. We are genetically more alike than most species, have existed for a relatively short time (200,000 years or so), and for most of it we lived in small bands of hunter-gatherers. Complex societies arose only with the domestication of animals and the introduction of agriculture, about 10,000 years ago. Cultural artifacts we take for granted today were invented a very short time ago; for example, writing is only about 5,000 years old, and today’s major religions are even younger.

Evolutionary dynamics

The formal requirements for natural selection (descent with modifications, to use Darwin’s phrase) are simple:

(1) Variation: variations (mutations) arise with sufficient frequency.
(2) Replication: individuals must be able to make reasonable copies of themselves, transmitting variations to their progeny.
(3) Differential fitness: variations must produce different levels of fitness (ability to produce progeny).

Darwin’s basic idea is that differential fitness will favor some varieties within a species so that they will multiply more successfully than the rest. As a sufficient frequency of variation will keep the process going, the eventual result will be speciation, the splitting of the original species into two, or more, new species.

Note that requirements (1)-(3) are substrate neutral: they apply to RNA-based organisms, DNA-based organisms, words (Darwin’s example), ideas, rituals, behaviors, artifacts, and so on. Things to which (1)-(3) apply are replicators. When replicators are cultural items, human or animal, they are often called “memes”; for example, birdsongs, tunes, religions, rituals, theories, and behaviors are memes.

Among the desiderata of a model of the evolution of human cooperation is that it mirror the basic features of the hunter-gatherer societies within which we spent 95% of our existence and within which our social evolution took place. A good case can be made that such features are similar to those present in most hunter-gatherer societies for which we have anthropological records. Hence, following Gintis and Bowles (2008), the model should take into account crucial features of late Pleistocene (80,000 BCE to about 10,000 BCE) Homo sapiens societies, including

Our tendency to make some errors in the implementation of strategies and in the perception of the situation
A moderately high discount factor, a quantity measuring the propensity to value future goods as much as present ones
The fact that groups are small enough (30 or so individuals) to allow direct observation and yet large enough that free riding may be a problem.
The fact that information may not be public and accurate
The fact that interaction is among both kin and non-kin
The lack of a central authority capable of enforcing social norms.
The fact that status differences are limited when compared to agricultural societies.
The fact that resources are not stored in significant amounts.
The fact that group membership is fluid in that it typically involves exogamy and high levels of intergroup migration
The fact that different groups display behavioral heterogeneity.

Finally, the model must describe as outcome an equilibrium that is

accessible, in the sense that altruism must have a good chance of becoming widespread even if it starts as initially rare
stable, in the sense that once achieved the equilibrium must be robust.

With this in mind, let us look at some proposed models of the emergence and persistence of cooperation.

THE RATIONAL MODEL

This model is based on the strategic interaction among players who are not merely self-regarding but rational as well, as classical game theory assumes. The basic idea is to appeal to repeated games and to Folk Theorems, to which we now turn.

Suppose we repeat a game G, the Prisoners’ Dilemma, for example, an indefinite number of times. Each round of G is called a “stage” of the repeated game. It turns out that the repeated game has different properties than G; for example, there are Nash equilibria of the repeated game that are not Nash equilibria of G. (A Nash equilibrium obtains when each player’s strategy is a best reply to the strategies of the others. For example, consider driving on the right side or on the left side of the road. If you drive on the right side, my best strategy is to do the same instead of driving on the left. The same applies to you. So, when all players drive on the right side we have a Nash equilibrium.) To understand the import of this we need the notions of discount factor and of signaling.

The discount factor δ is the equivalent in present units of one unit of value to be received one time unit from now. So, in general, to you $1 received one year from now is worth δ of a present dollar; when δ=1 one values goods to be received one time unit from now exactly as much as present goods. Hence, when life is uncertain or the future looks grim the discount factor is low, and in general patient agents act on a basis of a high discount factor while impatient ones on the basis of a low discount factor. In monetary terms, when inflation is high, the discount rate is high and the discount factor small.
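The link between discount rate and discount factor can be written out explicitly. The relation below is a standard convention rather than something stated in these notes; r (the per-period discount rate), x (a payoff), and t (the number of time units) are symbols introduced here for illustration.

```latex
\delta = \frac{1}{1+r},
\qquad
\text{present value of } x \text{ received } t \text{ time units from now} = \delta^{t}\,x
```

For instance, under this convention a discount rate of r = 0.25 corresponds to δ = 0.8, and a rate of r = 1 to δ = 0.5: the higher the rate, the smaller the factor.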

In determining the expected payoff of a strategy in a repeated game in which defection matters it is important to have reliable signals telling one whether other players have defected or not. A signal is public if all the players receive it (otherwise it’s private), and it is perfect if it correctly reports whether a player has defected or not (otherwise it’s imperfect).

Consider now the following Prisoners’ Dilemma

		Player 2
		S	T
Player 1	S	+5; +5	-10; +10
	T	+10; -10	-5; -5

If the players can use mixed strategies (for example, Player 1 might use S 30% of the time and T 70% of the time), it turns out that any point in the quadrilateral below, where the abscissa represents player 1’s payoff and the ordinate player 2’s, refers to a possible payoff outcome. However, since by defecting each player can guarantee himself a payoff of at least -5 (that is, a loss of at most 5), only the points in the quadrilateral ABCD represent strategies with payoffs greater than those resulting from universal defection. As the players are rational, they’ll never settle for any payoff smaller than -5, which entails that only the points in ABCD represent feasible outcomes.

Suppose now that each player is given a list of moves that will result, if both follow it accurately, in an average payoff of a for player 1 and b for player 2, corresponding to point P in the graph. Then, player 1 could set up the following strategy:

“As long as player 2 follows the list, then follow the list as well; however, if 2 deviates then maximally punish him (in this specific case, defect) forever after”.

(This type of strategy is called a “trigger strategy”)

Imagine now that player 2 follows the equivalent strategy. Clearly, the mixed strategies leading to P constitute a Nash equilibrium (they are best replies to each other), as any deviation from them will result in the application of the trigger strategy, which will ensure that the deviator (and everybody else once the deviator retaliates) will get -5 ever after. Importantly, there are three assumptions at work:

(1) The discount factor is sufficiently high (the players care enough about their future payoffs for the trigger strategy to work).
(2) Defection signals are public.
(3) Defection signals are perfect.

Two things are worth noting. First, in any game a player can always play a strategy that inflicts maximum losses on his opponents; similarly, any player can follow a minimax strategy, namely, one that minimizes his losses. (Here -5 is the minimax to which the original deviator may be pushed.) Hence, the above argument is general. Second, all of this applies to many-player games as well. Since P can be any point whatever in ABCD, we have the most basic version of the Folk Theorem:

If (1)-(3) obtain, then any point in ABCD (any payoff outcome in ABCD) can be reached by a Nash equilibrium in the repeated game.
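To get a feel for what "sufficiently high" means for the discount factor, consider the simplest special case of the trigger-strategy argument: the list of moves is "always play S", so that P = (+5, +5) in the matrix above. The sketch below compares the discounted value of following the list forever with that of deviating once and being punished ever after. The names reward, temptation, and punishment are labels introduced here for the payoffs +5, +10, and -5; they do not come from the notes.

```python
from fractions import Fraction


def follow_value(reward, delta):
    """Discounted value of receiving `reward` in every round, forever."""
    return reward / (1 - delta)


def deviate_value(temptation, punishment, delta):
    """Value of deviating now, then receiving `punishment` in every later round."""
    return temptation + delta * punishment / (1 - delta)


def critical_delta(reward, temptation, punishment):
    """Smallest discount factor at which following is at least as good as deviating."""
    return Fraction(temptation - reward, temptation - punishment)


# Stage payoffs taken from the matrix in the text.
threshold = critical_delta(reward=5, temptation=10, punishment=-5)
print(threshold)
```

Under these assumptions the threshold is 1/3: a player with δ = 1/2 does better by following the list, while one with δ = 1/4 does better by deviating.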

So, when the Folk Theorem applies, saying that a certain outcome is a Nash equilibrium is not saying much, as just about any outcome can be a Nash equilibrium. In short, when (1)-(3) apply, Nash equilibria are, as it were, cheap. The theorem can be extended to many cases of public but imperfect information, and even to cases of private, but almost public, information.

In providing a genesis of cooperation one must show that the relevant equilibrium is both attainable and stable. At first inspection, Folk Theorems seem to do this straightforwardly as any feasible payoff above the minimax is a Nash equilibrium and a Nash equilibrium is, in a way, self-fulfilling in the sense that if others stick to it, then it’s to one’s advantage to do the same. However, there are serious problems.

Let us start with the attainability requirement. Since there is an infinity of Nash equilibria, how do separate individuals coordinate to settle on one? The easiest way is to assume that coordination rules are already present as social rules. In our example, the players were given a list of moves. But obviously this will not do, as it posits what needs to be explained. Moreover, social rules are often broken when contrary to immediate self-interest, and therefore they are discretionary unless enforceable, which already presupposes social coordination. Of course, one might argue that the enforcement (through punishment) is carried out by individual members without any need to coordinate because it is advantageous to the enforcer, for example by withdrawing cooperation (a form of punishment) and therefore reducing the cost associated with cooperation. However, this solution is problematic because if punishment is advantageous to the punisher, then why punish only rule breakers? And if it is disadvantageous, why punish at all? It seems we are back in the swamps of the Prisoners’ Dilemma, where defection and free riding dominate. A solution might be the introduction of bargaining, but this seems already to presuppose coordination rules.

Another issue is whether the cognitive requirements of the Folk Theorems can be realistically satisfied. Public information can be achieved either in very small groups where everyone sees what everyone else does, or with an information distribution system in larger groups. But the presence of such a system already presupposes considerable levels of coordination.

Finally, evidence from behavioral game theory shows that the assumption that humans are self-interested is mistaken; this, in turn, eliminates the need to use self-interested individuals in the models for the development of altruism.

POSITIVE ASSORTMENT MODELS

In standard replicator dynamics, interactions are random. For example, if 20% of the group members are self-regarding, then any given individual will interact with a self-regarding member 20% of the time. However, when it comes to human interaction, such a requirement seems implausible. Hence, many models try to explain the emergence of other-regarding behavior by appealing to positive assortment: those with a tendency to be other-regarding interact with each other more frequently than by mere chance. Here are some interesting cases.

Kin altruism

If I increase my identical twin’s fitness at a cost to me, my altruistic genes will be transmitted, through my twin, to the next generation. So, if I behave altruistically towards my kin, we have a case of positive assortment. The key equation here is Hamilton’s rule, which states that kin cooperation is favored by natural selection if the genetic relatedness r (1/2 for siblings, 1/4 for nieces and nephews, and so on) between donor and beneficiary exceeds the cost-benefit ratio of the altruistic act:

r > c/b.

Note that the value of r is unlikely to be high in a population that is not highly inbred. In short, although there is no question that kin altruism is a force in evolution, it cannot explain the simple fact that among primates, and especially humans, altruistic cooperation extends well beyond kin, at times even trumping it.
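Hamilton's rule is easy to test on concrete numbers. The sketch below simply encodes r > c/b; the function name is invented, and the values b = 3, b = 5, and c = 1 are assumed purely for illustration.

```python
from fractions import Fraction


def hamilton_favors(r, b, c):
    """True if relatedness r exceeds the cost-benefit ratio c/b."""
    return r > Fraction(c) / Fraction(b)


# Relatedness values from the text; benefit and cost are assumed.
print(hamilton_favors(Fraction(1, 2), b=3, c=1))  # siblings
print(hamilton_favors(Fraction(1, 4), b=3, c=1))  # nieces and nephews
```

With b = 3 and c = 1, helping a sibling is favored (1/2 > 1/3) while helping a niece or nephew is not (1/4 < 1/3); the latter becomes favored only once the benefit exceeds four times the cost, for example with b = 5.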

Reciprocal Altruism

If two individuals are randomly paired to play the Prisoners’ Dilemma for many rounds, then cooperation becomes probable if a strategy of reciprocal altruism is introduced. The idea here is that X cooperates with Y if Y has cooperated with X, and vice versa. This is an example of positive assortment in that cooperators tend to cooperate with other cooperators more than with the generic player. The simplest of these strategies is Tit-For-Tat (TFT), which says to cooperate in the first round and thereafter to cooperate if the other player cooperated in the previous round and defect if the other player defected in the previous round: TFT has a short memory. Although the evidence for reciprocal altruism outside humans is limited to other primates, there is no doubt that it played an important role in human interaction, as food sharing in our ancestral past was probably network-based rather than common-pot-based; in other words, one primarily shared with the individual who had previously shared with one. It turns out that if a few TFT players are present, TFT beats “Always Defect”, and therefore produces an accessible equilibrium. There is evidence, however, that this equilibrium is not stable, as TFT is itself supplanted by “Generous TFT” (GTFT), a strategy involving some degree of forgiveness towards shirkers, which becomes stable if it is not too generous. (Of course, what counts as too generous depends on the payoff matrix.) Moreover, computer models show that TFT is not accessible and/or not stable when interactions involve more than two players (typically, six players are enough to reduce cooperation drastically) unless there are no mistakes and information is public and accurate, each of which is an unrealistic requirement.
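A toy simulation shows why TFT does well once a few TFT players are around. This is a minimal sketch under stated assumptions, not one of the computer models mentioned above: two players only, no mistakes, perfect information, a fixed number of rounds, and payoffs of the benefit/cost form used earlier with assumed values benefit = 3 and cost = 1. All function names are invented for illustration.

```python
def tit_for_tat(opponent_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_moves else opponent_moves[-1]


def always_defect(opponent_moves):
    """Defect regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds, benefit=3, cost=1):
    """Total payoffs of two strategies over a repeated Prisoners' Dilemma."""
    seen_by_a, seen_by_b = [], []  # each player's record of the opponent's moves
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        # A cooperator pays `cost` and hands `benefit` to the partner.
        total_a += (benefit if move_b == "C" else 0) - (cost if move_a == "C" else 0)
        total_b += (benefit if move_a == "C" else 0) - (cost if move_b == "C" else 0)
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
print(play(always_defect, always_defect, 10))
```

Over 10 rounds with these assumed payoffs, two TFT players earn 20 each, whereas Always Defect earns only 3 against TFT (exploiting it once) and 0 against itself, so TFT players who meet each other often enough come out ahead.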

Indirect Reciprocity

Indirect reciprocity is based on reputation. If X benefits Y, then X has a greater chance of being benefited by Z than if he benefits nobody. Positive assortment comes about because those with a good reputation will cooperate with each other more than those without it. A strategy embodying indirect reciprocity is the good standing strategy. Players who cooperated with others in the past are in good standing; otherwise they are in bad standing. The strategy is to cooperate only with those who are in good standing if one is in good standing, and to cooperate unconditionally if one is in bad standing due to a previous mistake, so as to reacquire good standing. Under incarnations of this model involving random interaction, indirect reciprocity will succeed if the probability p of knowing the score of another player exceeds the ratio between cost and benefit:

p> c/b.

This information requirement is high, and exceedingly difficult to satisfy if interactions involve more than a few players. Although the use of language may facilitate the attainment of the relevant information, the strategy-based equilibrium is unstable if errors are allowed and information is imperfect, as it is bound to be given the incentive to convey false information. There is, however, a way to overcome the information problem in a large group if one engages in costly signaling, namely signaling that cannot be faked. For example, an act of bravery or a public sharing of food or some ritual scarring may increase one’s reputation. So, indirect reciprocity can produce a stable cooperative equilibrium in small groups where everyone knows all the relevant information about everyone else (a fact that can be plausibly modeled with various types of spatial games), or when costly signaling is present.

Standard Group Selection Models

The idea here is that although cooperation lowers one’s fitness within the group, it sufficiently increases the group’s average fitness with respect to the population to render the group successful. Groups are reproductively isolated but individual reproduction is related to payoff, so that individuals with low payoffs are eliminated while those with high payoffs replicate. Obviously, the more internally homogeneous groups are, the more cooperators will increase; for example, individuals in a group made of all cooperators will have much above average payoffs. The destiny of cooperators depends on whether

Pr(C|C) – Pr(C|~C) > c/b,

Pr(C|C) – Pr(C|~C) = c/b,

Pr(C|C) – Pr(C|~C) < c/b,

where Pr(C|C) stands for the probability that one interacts with a cooperator given that one is a cooperator, and Pr(C|~C) is the probability that one interacts with a cooperator given that one is not a cooperator. In the first case, cooperation will increase, in the second it will remain stationary, and in the third it will decrease. If cooperation is genetically based, the quantity

F=Pr(C|C) – Pr(C|~C)

is Wright’s inbreeding coefficient, which measures the level of genetic differentiation among the groups and also the degree of positive assortment. It turns out that the evidence from foraging populations indicates that F is quite low, on the order of 1/12, which would require benefit b to be more than 12 times cost c, a condition too stringent to make the emergence of cooperation likely. Differently put, high levels of migration among groups impede group selection mechanisms.
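The three cases, and the effect of a low F, can be illustrated numerically. In the sketch below the function names are invented; F = 1/12 is the order of magnitude cited in the text, while the conditional probabilities and the benefit and cost values are assumed for the example.

```python
from fractions import Fraction


def assortment(pr_c_given_c, pr_c_given_not_c):
    """F = Pr(C|C) - Pr(C|~C), the degree of positive assortment."""
    return pr_c_given_c - pr_c_given_not_c


def cooperation_trend(f, b, c):
    """Classify the fate of cooperators by comparing F with c/b."""
    ratio = Fraction(c) / Fraction(b)
    if f > ratio:
        return "increases"
    if f == ratio:
        return "stationary"
    return "decreases"


# Assumed conditional probabilities giving F = 1/12.
f = assortment(Fraction(1, 2), Fraction(5, 12))
for b in (6, 12, 13):
    print(b, cooperation_trend(f, b=b, c=1))
```

With F = 1/12 and c = 1, cooperation declines at b = 6, is stationary at b = 12, and grows only once b exceeds 12, which is the stringent condition described above.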

Networks

In human societies, individuals are part of networks so that they interact only with members to whom they are directly linked. In other words, each individual interacts only with a group of neighbors. After each stage of the game, individuals update their strategy and adopt the strategy C of neighboring cooperators with probability equal to the sum of the payoffs of all the neighboring cooperators divided by the sum of the payoffs of all the neighbors. Nowak and others have shown that cooperation will increase if

1/k > c/b,

where k is the average number of neighbors an individual has. This obtains because the smaller the neighborhoods, the more likely that they are different from each other, thus effecting positive assortment. This means that the smaller the average neighborhood, the greater the chance of an expansion of cooperators. The problem for this model is that in foraging societies the whole group, typically about 30 or so individuals, often constitutes the neighborhood, which requires c/b to be too small to be realistic.
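To make the last point concrete, the condition can be rearranged, assuming b, c, and k are all positive and taking k to be roughly the size of a foraging group as described above:

```latex
\frac{1}{k} > \frac{c}{b}
\;\Longleftrightarrow\;
b > k\,c,
\qquad
k \approx 30 \;\Rightarrow\; b > 30\,c
```

In other words, on this reading each act of cooperation would have to yield a benefit more than about thirty times its cost.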

The Bowles and Gintis group selection model

Public goods games, and common experience, indicate that if free riders are allowed to proliferate, cooperation collapses, eventually degenerating into equivalents of the tragedy of the commons. The solution to this state of affairs has traditionally been to posit some central authority that sanctions defectors, thus ensuring cooperation. The reason is simple: since defectors do better than cooperators, how could cooperation evolve unless an enforcer was present? Still, history is replete with societies in which cooperation was achieved and maintained without any central authority because of the propensity of most of their members to engage in altruistic punishment. So, altruistic punishment seems the answer to our question. However, since altruistic punishers do worse than non-punishers, how did they evolve? Gintis and Bowles (2008) have constructed a model that satisfies the conditions set above concerning late Pleistocene human societies and has as an outcome the emergence and persistence of altruistic punishment. Here is its rough outline.

Every period, a cooperating individual produces a benefit b, shared by all, at a personal cost c, so that if all cooperate each will obtain b-c>0. Individuals are haploid (an individual has one copy of each gene) but reproduction is diploid (an individual inherits each gene from one of its parents). Individuals have two relevant loci: one determines whether the individual is selfish or cooperative and the other whether the individual is a punisher or not. Here is how they behave:

Cooperators always cooperate even if taken advantage of.
Selfish types do not cooperate (they shirk) unless sp>c where s is the punishment imposed by punishers for shirking and p is the probability of being punished. p is proportional to the frequency of punishers. In short, selfish types maximize their own expected payoffs.
Punishers punish shirkers unless the ratio of selfish types to punishers is larger than n_max. Crucially, the cost of punishing a shirker, c_p, is shared by all punishers, so that the more punishers there are, the lower the cost of punishing to each individual punisher. Note that this cost sharing presupposes some degree of coordination.
Non-punishers never punish.
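The behavioral rules just listed can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, p is passed in directly rather than derived from the frequency of punishers, and the cost c_p is assumed to be split evenly among punishers.

```python
def selfish_cooperates(s: float, p: float, c: float) -> bool:
    """A selfish type cooperates only if expected punishment s*p exceeds cost c."""
    return s * p > c


def punishers_punish(n_selfish: int, n_punishers: int, n_max: float) -> bool:
    """Punishers act unless selfish types per punisher exceed n_max."""
    if n_punishers == 0:
        return False  # nobody is available to punish
    return n_selfish / n_punishers <= n_max


def cost_per_punisher(c_p: float, n_shirkers: int, n_punishers: int) -> float:
    """Cost borne by each punisher, assuming an even split (an assumption)."""
    if n_punishers <= 0:
        raise ValueError("at least one punisher is required")
    return c_p * n_shirkers / n_punishers


if __name__ == "__main__":
    print(selfish_cooperates(s=2.0, p=0.6, c=1.0))   # expected penalty 1.2 vs cost 1.0
    print(punishers_punish(n_selfish=6, n_punishers=3, n_max=1.7))
    print(cost_per_punisher(c_p=1.0, n_shirkers=2, n_punishers=4))
```

The last function makes the cost-sharing point concrete: doubling the number of punishers halves what each one pays for the same number of shirkers.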

Punishers punish only those who fail to cooperate; non-punishers who cooperate are left alone, which means that second-order shirking is not punished. Because each of the two loci can take two values and offspring inherit each locus from either parent, four types are possible:

cooperator-punishers
selfish-punishers
cooperator-nonpunishers
selfish-nonpunishers.

Both selfish types and punishers, and therefore every individual, make mistakes with probability ε; the mistakes arise from bad information about who is shirking, miscalculation of expected payoffs, and execution mishaps. Mutation occurs from selfish to cooperator and vice versa, and from punisher to non-punisher and vice versa, with probability μ/2. Groups are arranged in a Moore neighborhood with no boundary effects (in the model, groups are on a torus). Individuals migrate at rate m to a neighboring group; mates are always from a neighboring group. When a group becomes smaller than n_min, it recruits members from neighboring groups. At the end of each period, after punishment has possibly occurred, a fraction r of each group reproduces as follows. An individual A is randomly chosen and reproduces with probability proportional to its payoff by mating with B, randomly chosen from a neighboring group. The single offspring is randomly assigned to A’s or B’s group, and one individual in the population is randomly killed, so that the size of the population is constant. Since there are 25 periods per generation, one period roughly corresponds to one year.
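The inheritance and mutation step can be illustrated with a short Python sketch. It is not the authors' code: the genotype encoding, the function name, and the assumption that each locus comes from either parent with equal probability are all illustrative choices; only the per-locus flip probability μ/2 is taken from the description above.

```python
import random
from typing import Tuple

# (is_cooperator, is_punisher): an assumed encoding of the two loci.
Genotype = Tuple[bool, bool]


def make_offspring(a: Genotype, b: Genotype, mu: float, rng: random.Random) -> Genotype:
    """Build one offspring from parents a and b.

    Each locus is copied from a or b with equal probability (assumption),
    then flipped with probability mu/2 to model mutation.
    """
    child = []
    for locus in range(2):
        allele = a[locus] if rng.random() < 0.5 else b[locus]
        if rng.random() < mu / 2:
            allele = not allele
        child.append(allele)
    return (child[0], child[1])


if __name__ == "__main__":
    rng = random.Random(0)
    # A cooperator-nonpunisher mating with a selfish-punisher.
    kids = {make_offspring((True, False), (False, True), 0.0, rng) for _ in range(200)}
    print(sorted(kids))
```

Even without mutation, mating a cooperator-nonpunisher with a selfish-punisher can yield any of the four types, which is how recombination spreads the punisher allele across both behavioral types.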

The simulation has the following features:

In the initial total population, 50% are selfish, 50% cooperators, and 100% non-punishers.
In addition, b=2, c=1, c_p =1, s=2, ε=0.015, n_max=1.7, n_min=4.
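A small Python sketch can show what these parameter values imply for selfish types. The text says only that p is proportional to the frequency of punishers; the sketch assumes p = kappa * f with a hypothetical proportionality constant kappa = 1, so the resulting threshold is illustrative rather than a reported result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Baseline values listed in the text."""
    b: float = 2.0      # benefit produced by a cooperator
    c: float = 1.0      # personal cost of cooperating
    c_p: float = 1.0    # cost of punishing one shirker
    s: float = 2.0      # punishment imposed on a shirker
    eps: float = 0.015  # error probability
    n_max: float = 1.7  # max ratio of selfish types to punishers
    n_min: int = 4      # minimum group size before recruitment


def punisher_threshold(params: Params, kappa: float = 1.0) -> float:
    """Punisher frequency f above which s*p > c, assuming p = kappa * f."""
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    return params.c / (params.s * kappa)


BASELINE = Params()

if __name__ == "__main__":
    print("net payoff if all cooperate:", BASELINE.b - BASELINE.c)
    print("punisher frequency needed to deter shirking:", punisher_threshold(BASELINE))
```

Under the kappa = 1 assumption, selfish types start cooperating once punishers exceed half of the group, which fits the account of drift having to push punishers to a sufficiently high frequency in some group before cooperation takes hold.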

What happens is this. Through mutation, punishers appear and eventually, by random drift, become sufficiently frequent in one group to compel selfish shirkers to cooperate. This produces two outcomes. First, punishers do not need to punish much, with the result that they do only marginally worse than non-punishers. Second, the average fitness in the group increases, so that the group grows in size and eventually seeds other groups by migration and repopulation of undersized groups. In simulations with 1000 groups, at about 15,000 periods there is an explosion of punishers; by about 20,000 periods over 90% of the population consists of punishers, about 85% of selfish types, and about 15% of cooperators, and the proportion of shirkers is about 15%. These outcomes are stable even if they are not asymptotically stable, as the above-mentioned values show small oscillations. Interestingly, some shirkers in the population are necessary for the long-term stability of the outcome; otherwise non-punishers would do better than punishers, thus taking over only to be replaced in turn by selfish shirkers. The key to the success of punishers is the fact that the cost c_p of punishing an individual is shared among the punishers, a fact that requires the capacity

· to harm easily

· to transmit information

· to coordinate complex action,

capacities especially developed in humans, given our ability to handle and throw projectiles and to communicate through language. These requirements may also explain why altruistic punishment is rare in other species. Note that miscommunication, false information, execution errors and similar mishaps occur 1.5% of the time in the model.

The above results are robust as long as c<b/2 and s>b. Mutation rates do not affect the outcomes much; surprisingly, migration rates up to 50% per generation have little impact. Since the expansion of altruistic punishers is tied to the expansion of their groups, if absolute resource constraints on group size are implemented (no group can increase in size and intra-group resource competition occurs), no evolution of altruistic punishment occurs. However, if 1 indicates absolute resource constraint and 0 none, about 50% of the population is still constituted by punishers even with resource constraints of 0.6, which seems to indicate that under realistic resource constraints altruistic punishment would evolve.

Some remarks on group selection models

Group selection models for the evolution of altruistic cooperation are based on three ideas:

cooperators make their group successful against other groups
cooperators are less successful than non-cooperators within their own group
the increase in fitness arising from belonging to a successful group (call such increase I) is greater than the decrease in fitness arising from cooperating within the group (call such decrease D).

Obviously, anything that makes I larger and D smaller contributes to the success of the model. Two things are worth noting.

· The leveling of intra-group fitness differences can come about through various conventional cultural institutions such as monogamy and food sharing, both observed in foraging societies. These are conventions, that is, Nash equilibria, and therefore best replies to themselves. Further, some type of selective assortment may occur, for example, by cooperators expelling or even killing non-cooperators.

· Scarcity of resources and subsequent conflict among groups tend to make I larger. There is evidence from prehistoric burial sites that intergroup warfare was common, a state of affairs possibly caused by high climate variability during the last 100,000 years, which led to high levels of population displacement. Bowles and Gintis have constructed models for the emergence and persistence of altruistic parochialism, the tendency to help members of one’s own group at a cost and to display hostility to others, a feature commonly present in humans.

With that in mind, can an account (rather speculative, to be sure) of morality and religion be provided? As with the origin of prosociality, one must keep in mind ultimate and proximate explanations. The former involve an evolutionary account; the latter a neurological and cultural account. As before, here we totally neglect proximate explanations.

When thinking about an evolutionary account of morality, several possibilities come to mind. Morality could be

1. An adaptation beneficial to

a) Individuals, primarily

b) Groups primarily, and secondarily to their members vs. members of other groups

2. A neutral trait, like hair color.

3. A byproduct of an adaptive trait, a spandrel, in Gould’s terminology.

4. A cultural parasite, a bad meme gone wild, like a bad tune everybody is whistling.

There is some evidence for the view that (1b) is a good bet. Morality

· Reinforces pro-social tendencies in the subject, and therefore cooperation in society.

· Reinforces strong reciprocity in society by providing the conceptual tool for punishment because morality makes intersubjective claims.

· Makes punishment almost cost-free through the use of moral language, which makes moral communication rapid and efficient. Hence, it extends the size of societies in which pro-social behavior is fitness enhancing.

So, morality seems to increase intra-group leveling, thus diminishing the relative unfitness of strong reciprocators. Note that whether moral judgments like “murder is wrong” are true or not need not affect the functional aspects of morality.

Providing an evolutionary account of religion is an enormously complex task, involving, among other things, little understood areas like late Paleolithic (alleged) religion, and new fields like cultural evolution and the neurology of religion. These problems are magnified by the fact that the discipline has had a revival only in the last 20 years. Hence, what follows is speculative, although somewhat (how much is the issue) justified.

If the evolutionary story given above is correct, the idea that religion primarily benefits groups and secondarily their members is the most likely one. If group selection has been at work in the rise of altruistic punishment, then practices and institutions that favor in-group leveling have likely been selected. One can then speculate that the evolutionary cause for the existence of religion is tied to the functional role of religion in the reinforcement of morality, whose role, in turn, is to reinforce prosocial behavior. In short, human religion exists because of human prosocial and moral behavior, not the other way around. This presupposes that successful religions, and therefore theistically based religions (although religion need not be theistic, presumably most theistic beliefs have been associated with religions), share a moral core conducive to a type of in-group morality favorable to prosociality. Typically, theistically based religions preach helpful attitudes towards coreligionists and some sort of supernatural punishment for transgressors, both factors favoring leveling by providing prosocial inducements to self-interested members of the group. In addition, parochialism and group-supporting practices such as meetings and rituals favor group cohesion by providing psychological support and costly signaling (e.g., scarring, time-consuming rituals, resource-intensive religious practices).

Presumably, the evolutionary advantage of religion became very significant once humans started living in large groups such as large bands or tribes (numbering in the high hundreds) containing significant inequalities together with a significant increase in the production of offspring. In particular, the Neolithic domestication of crops and animals about 10,000 BCE coincided with the introduction of even larger sedentary communities (numbering in the thousands) characterized by theocratic government that justified political authority, unequal distribution of wealth, and parochialism.

(For a specific evolutionary analysis of religions as fostering group cohesion, see D. S. Wilson’s Darwin’s Cathedral.)