11

Reinforcement and selection

The emphasis on choice as the basic building block of explanation in the social sciences goes together with an emphasis on the intended consequences of action.1 In this chapter, I discuss explanations of actions in terms of their objective consequences. This might seem like an unpromising idea. All explanation is causal explanation. We explain an event by citing its cause. Causes precede their effects in time. It follows that we cannot explain an event, e.g. an action, by its consequences.

If, however, the explanandum is a pattern of recurrent behavior, the consequences of that behavior on one occasion can enter into the causes that make its occurrence on a later occasion more likely. There are two main ways in which this can happen: by reinforcement and by selection. I shall focus on the second, which is the more important for my purposes, but begin with some words about the first.

Reinforcement

If the consequences of a given behavior are pleasant or rewarding, we tend to engage in it more often; if they are unpleasant or punishing, we tend to engage in it less often. The underlying mechanism could simply be conscious rational choice, if we notice the pleasant or unpleasant consequences and decide to act in the future so as to repeat or avoid repeating the experience.2 Often, however, the reinforcement can happen without intentional choice. When infants learn to cry because the parents reward them by picking them up when they do, there is no reason to think that they first consciously note the benefits from crying and then later cry at will to get them. When older children throw a tantrum to get their way, parents can usually tell that it is not a genuine one.

Reinforcement learning has been extensively studied in laboratory experiments on animals. One typically offers the animal the opportunity to press a lever, or one among several levers, and rewards the presses either as a function of the number of lever presses since the last reward or as a function of the time passed since the last reward. In either case, the function can be deterministic or probabilistic. In fixed-ratio schedules, the animal receives a reward after it has pressed a lever a fixed number of times, whereas in variable-ratio schedules the number of presses needed to produce a reward varies randomly. In either case, each press produces a “reward point” that is added to previous points. In fixed-interval schedules a press will produce a reward a given time after the last reward was offered, whereas in variable-interval schedules the time before a new reward is made available varies randomly. In either case, the timing of the reward is independent of the number of presses. Each schedule of reinforcement produces, after some learning time, a specific and stable pattern of behavior, which, moreover, will be extinguished in a specific pattern once the reinforcer (the reward) is removed. For instance, responses that are learned by rewarding every lever press (a special case of a fixed-ratio schedule known as continuous reinforcement) are extinguished more quickly than those learned on a random variable-ratio schedule. Intuition might suggest the opposite, since continuous reinforcement would seem to produce a stronger habit, but, as sometimes happens, intuition is wrong.
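The four schedules can be made concrete in a short sketch. It is only an illustration under assumptions of my own: the function names are hypothetical, the variable-ratio schedule is modeled by rewarding each press with a fixed probability, and the variable-interval schedule draws its waiting times from an exponential distribution. None of these details is taken from the experimental literature.

```python
import random

def fixed_ratio(n_presses, ratio):
    """Reward every `ratio`-th press; return the numbers of the rewarded presses."""
    return [i for i in range(1, n_presses + 1) if i % ratio == 0]

def variable_ratio(n_presses, mean_ratio, rng):
    """Assumption: each press is rewarded with probability 1/mean_ratio,
    so the number of presses per reward varies randomly around mean_ratio."""
    return [i for i in range(1, n_presses + 1) if rng.random() < 1.0 / mean_ratio]

def fixed_interval(press_times, interval):
    """A press is rewarded only if at least `interval` time units have
    passed since the last reward; return the times of rewarded presses."""
    rewarded, last_reward = [], 0.0
    for t in press_times:
        if t - last_reward >= interval:
            rewarded.append(t)
            last_reward = t
    return rewarded

def variable_interval(press_times, mean_interval, rng):
    """Assumption: the wait before a new reward becomes available is drawn
    from an exponential distribution with mean `mean_interval`."""
    rewarded = []
    available_at = rng.expovariate(1.0 / mean_interval)
    for t in press_times:
        if t >= available_at:
            rewarded.append(t)
            available_at = t + rng.expovariate(1.0 / mean_interval)
    return rewarded

# On a ratio schedule the reward depends on the number of presses.
print(fixed_ratio(10, 5))                         # [5, 10]

# On an interval schedule, pressing twice as often earns nothing extra.
sparse = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
dense = [0.5 * k for k in range(1, 21)]
print(len(fixed_interval(sparse, 3)), len(fixed_interval(dense, 3)))  # 3 3
```

The last two lines show the contrast drawn above: on an interval schedule the timing of rewards is independent of how often the lever is pressed.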

The relevance of these findings outside the laboratory depends on the purpose. If the aim is to shape action, for example, in a classroom situation, in a gambling casino, or in the workplace, a designer may (more or less freely) impose a reward schedule to generate desired behavior. For instance, variable-interval schedules are often used to shape behavior, as when a teacher announces a policy of random quizzes. On the variable-ratio schedule that operates in many gambles, it is easier to establish the behavior if the first reward occurs early on.3 As casino and race track managers lack the technology for sucking in novices by offering them big wins, they have to rely on the fact that by the laws of chance some gamblers will have beginner's luck.4 Con-man operations, however, often rely on the deliberate inducement of early wins by the mark. In the classroom and the casino, the reward schedules operate “behind the back” of the students or gamblers, in the sense that they do not shape the behavior by explicit incentives but rather, as in the case of the crying infant, by an unconscious process. By contrast, when managers pay employees once they achieve a set target (a fixed-ratio schedule) or on a monthly basis (a fixed-interval schedule), they are simply setting up an incentive system (Chapter 25). Since the behavior of the employees can be adequately explained by the expected reward, there is no need to appeal to actual reward.

If the aim is to explain behavioral patterns by their actual consequences, the reward schedules are relevant only if they occur naturally and, moreover, are so opaque that they do not create explicit incentives. This does not often seem to happen with the two fixed schedules. In everyday life, the sheer number of responses is rarely decisive for reward. It is not the number of friendly smiles I give my friends that shapes their behavior toward me, but the consistency and the appropriateness of my smiling. In natural settings, rewards that arrive every so often, such as my paycheck, are rare. The two variable schedules are more important. A person who plays “hot and cold” (a variable-ratio schedule) with a member of the opposite (or the same) sex may induce a stronger attraction than someone who invariably displays friendly behavior. A variable-interval schedule arises when you try to reach someone on the phone and the line is busy. You know that sooner or later you will get through by redialing, but you do not know when. This situation induces a pattern of steady redialing that would not be the unique prediction of rational-choice theory. That theory could predict any number of patterns, depending on the caller's beliefs about how long the conversation of the other person is likely to last. It seems unlikely, however, that people have stable beliefs about such matters.

The response pattern generated by reinforcement is not, in general, the one that would be produced by conscious, rational choice. Suppose for instance that an animal has the choice between pressing either of two levers, one that rewards on a variable-ratio schedule and one that rewards on a variable-interval schedule. The rational pattern, which will maximize overall reward, is to press the variable-ratio lever most of the time, to accumulate reward points, while visiting the variable-interval lever from time to time to see whether a new reward has become available. This is not, however, the pattern produced by reinforcement learning. Instead, the animals press the variable-interval lever much more often than is optimal. In doing so, they equalize the average rewards to pressing the one or the other lever rather than, as rationality would dictate, equalizing the marginal rewards. For other schedule combinations, reinforcement learning sometimes mimics rational choice, but not in any consistent manner. If there is any non-intentional mechanism capable of reliably simulating rationality, we shall have to look for it elsewhere.

Natural selection

In most of this book I discuss how we can explain behavior by assuming that agents adapt to their environment, in a more or less rational manner. In a radically different perspective, we may try to explain behavior by assuming that agents are selected by the environment. Although selection can be the work of an intentional agent, as when domestic dogs are bred to be docile or laboratory rats to become more intelligent, many selection mechanisms rest on causal processes that involve no intentional agent. In particular, differential survival of organisms based on their behavioral patterns may lead to optimal behavior (optimal for reproduction) in the population even in the absence of any optimizing choices or intentions. Suppose that 10 percent of the organisms in a population of 100 organisms forage so efficiently that they leave on average 10 offspring that survive to adulthood, whereas the remaining 90 percent leave only 5. If the behavior of the parents is (by whatever mechanism) transmitted to the offspring, the next generation of adult organisms will include a fraction of 100/550 = 2/11 ~ 18 percent that displays the more efficient behavior. Over the course of a few more generations, virtually all organisms will display it. If we ask why it is universally displayed, the answer is that it has better consequences. This mechanism works across generations. Unlike reinforcement learning, it does not modify the behavior of any given individual, only the typical behavior of successive generations of individuals.
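The arithmetic of the foraging example can be carried forward over several generations. The sketch below rests on simplifying assumptions that go beyond the text: generations do not overlap, the behavior is transmitted perfectly from parent to offspring, and the offspring numbers stay fixed at 10 and 5. The function name `next_share` is merely illustrative.

```python
from fractions import Fraction

def next_share(share, offspring_efficient=10, offspring_other=5):
    """Share of efficient foragers among the adults of the next generation."""
    efficient = share * offspring_efficient
    other = (1 - share) * offspring_other
    return efficient / (efficient + other)

share = Fraction(1, 10)          # 10 percent efficient foragers initially
history = [share]
for _ in range(10):
    share = next_share(share)
    history.append(share)

print(history[1])                # 2/11, the 18 percent of the example
print(float(history[10]))        # about 0.99 after ten generations
```

Because the efficient foragers leave twice as many offspring, their odds relative to the others double in each generation, which is why the behavior comes close to universal within about ten generations under these assumptions.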

This story does not say why some organisms forage more efficiently than others. The generally accepted answer is that changes in the structure or (as in this example) the behavior of organisms are due to mutations in the genetic material that occur when the gene is copied from one generation to the next. The mutations have four important properties. First, they arise more or less at random. Although mutagens can increase the frequency of mutations, just as a poor-sighted typesetter will make more mistakes, there is no mechanism that would favor beneficial or useful mutations, any more than the mistakes of an ignorant typesetter could systematically improve the text he is setting. Second, most mutations are in fact deleterious, that is, the protein for which the mutated gene codes reduces reproductive fitness. Most mistakes by a typesetter also create confusion. Third, mutations are typically small, involving a substitution of one “letter” (one of four nucleotide molecules) in the genetic code for another. A mistake by a typesetter, too, often involves the replacement of one letter by another, if for instance she substitutes “366 days” for “365 days.” Fourth, some mutations do improve fitness. In a leap-year, the typesetter's mistake might yield an improvement. Natural selection is based on happy accidents of this kind.

Because mutations are small, natural selection is constrained to take one step at a time. Each step has to be viable, since otherwise the organism in which the mutation occurred would not leave any descendants. Natural selection (in this classical picture) is constrained to small incremental improvements. The organism climbs along a fitness gradient until it reaches a local maximum, defined as a state in which all further one-step changes would reduce fitness. Although there may be higher peaks in “the adaptive landscape,” these will not be attainable by one-step changes.
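The idea of a local maximum can be illustrated by a deliberately crude sketch. The landscape below is an invented list of fitness values, a "one-step change" is a move to an adjacent position, and the climb is deterministic rather than driven by random mutation; all of this is assumed for illustration only.

```python
def climb(fitness, start):
    """Follow strictly improving one-step moves until none remains;
    return the position of the local maximum that is reached."""
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(fitness)]
        if not neighbors:
            return pos
        best = max(neighbors, key=lambda p: fitness[p])
        if fitness[best] <= fitness[pos]:
            return pos           # every one-step change would reduce fitness
        pos = best

# Hypothetical landscape: a low peak at position 2, a higher one at position 6.
landscape = [1, 3, 5, 4, 2, 6, 9, 7]

print(climb(landscape, 0))       # 2: stuck on the lower peak
print(climb(landscape, 4))       # 6: reaches the higher peak
```

A climber starting at position 0 stops at the lower peak, since reaching the higher one would require passing through positions of lower fitness.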

This process differs from intentional choice in three respects. Recall from Chapter 6 that by virtue of their intentionality, human beings are capable of (1) using indirect strategies, (2) waiting, and (3) aiming ahead of a moving target.

Concerning (1), natural selection cannot reculer pour mieux sauter, since an organism that took one step backward (a deleterious mutation) would not leave offspring in which a favorable mutation (two steps forward) could occur. Figure 11.1 illustrates this case. A direct move from A to C is impossible, because it would require the simultaneous mutation of two nucleotides. While a one-step move from A to B is possible, the further one-step move from B to C is blocked by the fact that an organism in state B would not leave any descendants to take the second step.

Figure 11.1

Concerning (2), natural selection cannot wait, that is, refuse a favorable mutation from A to B if it would preclude a move to a global maximum C that can be reached from A, but not from B. Figure 11.2 illustrates this case.

Figure 11.2

Concerning (3), populations are adapting to an environment that is constantly changing. If the changes are regular, for instance, seasonal or diurnal, they adapt to the changes. If a one-shot event occurs, such as a sudden climate change, behavior that was at a local fitness maximum prior to the change may become suboptimal, so that mutations that previously would have been deleterious are favorable. If the change is protracted, as when the climate cools or warms over a long period of time, this process may never reach a new local maximum. The population will track the changes in the environment with an efficacy that depends on the relative speed of the two processes. The amazingly fine-tuned adaptations observed in animals and plants suggest that animals adjust to the environment much faster than the latter itself changes. Yet the organisms will always lag somewhat behind, since they cannot anticipate changes in the environment. By contrast, human beings may become aware of future changes such as global warming and take precautions against them before they actually occur or, if they result from human behavior, prevent them from occurring.

The environment of a population is made up, among other things, of populations from other species to which it may stand in a prey-predator relation. As prey, it may evolve better evasive strategies; as predator, better hunting strategies. Just as the individual fox and hare are chasing each other across the fields, so do the species Fox and Hare chase each other over generations. But whereas the logic of natural selection precludes the Fox from anticipating where the Hare is going to be a few millennia hence, some predators are able to intersect the flight path of the prey (Figure 6.1). Similarly, the locally maximizing process of natural selection has produced the capacity for global maximization found in human beings.

This classical picture of natural selection is simplified in a number of ways. First, some behavioral patterns are transmitted over generations by culture, not by genes. Although the theory of “memes” as a cultural analogue of genes is too poorly specified to be useful, there exist persuasive models that explain, for instance, why women in some countries have fewer children than the biological maximum that would be favored by selection operating on genes alone. In contemporary Italy, the average number of children per woman is about 1.4. While the biological maximum is hard to define, it is certainly much higher. Second, large mutations do occur, and some of them may be responsible for developments that would not have been possible through small point mutations. Also, inferior forms are not eliminated instantly by competition. In Figure 11.1, the mutation to B does not necessarily produce an organism that is unviable in the strict sense of being unable to survive or to reproduce. Some organisms in state B might survive to produce organisms in state C. In Figure 11.2, some organisms in state A might survive the competition from the more efficient organisms in state B long enough for a mutation to state C to occur. Whether the global maximum occurs is a matter of the relative speed of two processes: the extinction of inferior varieties and the rate at which favorable mutations occur. There is no mechanism, however, that could mimic, in a systematic way, the capacity of intentional beings to preempt, to wait, or to use indirect strategies.

The units of selection

Natural selection is not only opportunistic and myopic, but also, with two exceptions I shall discuss shortly, fiercely individualistic. It does not favor the species or the population, but the individual organism. If a property arising from a mutation increases the relative fitness of the organism, it will be fixed in the population even if it also causes a decrease in absolute fitness. Imagine a population of fish, exposed to predators, initially swimming in scattered schools. If a mutation causes the fish in which it occurs to move to the center of the school, it will be less vulnerable to predation and as a result will tend to leave more offspring. As this behavior spreads in the population, the school will become more compact and thus an easier target for predators. At each step in the process it is better to seek the middle than to be at the outskirts of the school. Yet in terms of absolute fitness the outcome is worse for all than the initial situation, and in terms of relative fitness it is unchanged. Similarly, runaway sexual selection is a plausible explanation of the large and dysfunctional antlers found in some species of deer.

One exception to individualism is kin selection (a form of “subindividualism”), in which the gene rather than the individual organism is the unit of selection. The choice of unit does not matter when the effect of a gene simultaneously and in the same proportion increases the presence of the gene in the population and the number of offspring of the organism displaying that behavior. This is the case, for instance, in the evolution of more efficient foraging. But in some cases the gene can benefit even if the organism in which it triggers the behavior does not, namely, when an organism “sacrifices” itself for the sake of close relatives who are likely to have the same gene. When an animal observes a predator and emits an alarm signal, its chances of survival often go down while those of close relatives in the vicinity may go up. Since these relatives or some of them will also have the “warning gene,” their higher survival chances may cause the gene to spread in the population if they more than offset the lower chances of the animal that emitted the signal.

Another exception is group selection (a form of “supraindividualism”). Consider two populations of fish, one in which the center-seeking mutation has occurred and one in which it has not. Over time, the former will leave fewer offspring than the latter and might ultimately be crowded out. Selection would seem to operate at the level of the group, not of the individual. Yet if the two populations coexist, the second is vulnerable to invaders from the first. Whether the center-seeking behavior is caused by mutation or by in-migration, the outcome is the same, namely, the crowding out of those who do not behave in that way. Similarly, if organisms in a population have a gene that prevents them from overgrazing, thus avoiding “the tragedy of the commons,” they might be out-reproduced by less inhibited organisms that lack the gene. For this reason, group selection has not been seen as a plausible mechanism for generating cooperation or self-restraint. In the light of the theory of altruistic punishment set out in Chapter 5, however, this objection can be met. If the organisms in a population have a gene that makes them punish non-cooperators, the latter will not gain any reproductive advantage from their free riding.5

Kin and group selection provide two mechanisms for the emergence of cooperative behavior, the former based on shared genes and the latter on altruistic punishment. A third mechanism is that of reciprocal altruism or “tit-for-tat” in repeated interactions, such as “I scratch your back; you scratch mine” (among some animals quite literally) or “I offer you food when I have a surplus; you offer it to me when you have.” The other side of the coin is punishment, or at least abstention from cooperation, when the other party fails to reciprocate. For this mechanism to operate the individuals must interact often enough to make self-restraint worthwhile, remember what others did on earlier occasions, and recognize them when they meet again.
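The logic of “tit-for-tat” can be sketched in a repeated two-person game. The payoff numbers below are an assumption of mine, chosen only so that mutual cooperation beats mutual defection while unilateral defection pays best in a single round; the strategy and function names are likewise illustrative.

```python
# Hypothetical payoffs: (my move, other's move) -> (my payoff, other's payoff),
# with "C" for cooperating and "D" for defecting.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then repeat whatever the other party did last time."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Never reciprocate."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Let two strategies meet `rounds` times; return their total payoffs."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)   # each side remembers the other's moves
        move_b = strategy_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30)
print(play(tit_for_tat, always_defect, 10))    # (9, 14)
print(play(always_defect, always_defect, 10))  # (10, 10)
```

With these numbers the defector gains from the first encounter only; over ten rounds two reciprocators do far better together than a defector does against anyone, which is why frequent interaction, memory, and recognition matter for the mechanism.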

Natural selection and human behavior

Natural selection has obviously shaped the physical structure of human beings, which offers them opportunities for action as well as constraints on action. Those who try to explain human behavior in terms of natural selection sometimes make stronger claims. They want to explain the behavioral patterns themselves, not merely the structures that make them possible.

The most plausible mechanism is that evolution has produced emotions with their characteristic action tendencies. Since a male can never be sure whether he is the father of his offspring whereas the female is not in doubt that she is the mother, we would expect natural selection to produce a stronger tendency to feel sexual jealousy in men than in women. This is confirmed by many homicide statistics. Thus among 1,060 spousal homicides in Canada between 1974 and 1983, 812 were committed by men and 248 by women, but among those motivated by jealousy 195 were by men and only 19 by women. The theory of natural selection also predicts that parents would be more emotionally committed to their biological children, who are carriers of their genes, than to stepchildren. This prediction, too, is confirmed by data. Thus an American child living with one or more substitute parents in 1976 was about 100 times as likely to be abused as a child living with two natural parents. Natural selection may also favor lack of emotion. The dangers of inbreeding are kept in check, in humans and in other primate species, by lack of sexual attraction among young who grow up together, whether or not they are related to each other.6

Natural selection, operating on groups rather than on individuals, may also have favored emotions of anger and indignation toward those who violate norms of cooperation, motivating punishment even at some cost to the punisher. A more puzzling question is whether and why selection might have favored the emotion of contempt, which is directed toward those who violate social rather than moral norms. Since many social norms are arbitrary and even dysfunctional, it is difficult to see how they could be sustained by group selection. Given a tendency for others to ostracize those who violate social norms, reproductive fitness would be better served by respecting the norms if the cost of ostracism exceeds the benefit derived from the norm violation. The puzzle is why this tendency would arise in the first place. Why, for instance, would people disapprove of adultery? Social norms against adultery involve third-party reactions that differ from the second-party reaction of sexual jealousy. Although A might benefit from C's disapproval of B's advances to A's spouse, that benefit does not create a selection pressure on C to behave in this way. Whereas group selection might favor genes that induce “third-party punishment” of free riders, the benefit to the group of third-party punishment of adultery is less obvious. Although the tendency for norms against female adultery to be stronger than those against male adultery suggests an evolutionary explanation, it is hard to see what the mechanism would be.

Other claims are more speculative, such as the idea that self-deception in humans evolved because of its evolutionary benefits. The argument goes as follows. It is often useful to deceive others. However, deliberate or hypocritical deception is hard to carry off. Therefore, self-deception evolved to enable people to deceive others successfully. The weakness of the argument is that if self-deception causes one to hold false beliefs, these might have disastrous consequences if used as a premise for behavior. Nobody has made a convincing argument that the net effect of these opposing effects tends to be positive, as it would have to be for self-deception to enhance evolutionary fitness.

Even more speculative is the claim that unipolar depression may have evolved as a bargaining tool, somewhat similar to a labor strike. For instance, an alleged function of postpartum depression is to induce others to share in raising the child, just as workers go on strike to make employers share the profits. Suicides induced by depression are, on this view, the cost of making a credible threat of suicide. They are, as it were, suicide attempts that failed to fail. Insomnia is explained as an allocation of cognitive resources to solve the crisis to which the depression is a response, whereas hypersomnia (sleeping more than normal) is explained as a way of reducing productivity and thus enhancing the bargaining efficacy of the depression. The argument, while consistent with some known facts about depression, ignores a host of others, such as that depression as well as suicide run in families, that divorced individuals (with no bargaining partner) are more depression prone than the married or never married, and that stressful life events are neither necessary nor sufficient for depression.

Explaining depression as a bargaining tool is another example of a pervasive search for a meaning or function of all apparently pointless or dysfunctional behaviors (see Chapter 9). Up to a point, the search for meaning is a good research strategy; beyond this point, it becomes contrived and, as in some of the examples cited, ultimately absurd. There are so many ways in which harmful traits may be preserved in a population that one cannot take for granted that frequently occurring behavior confers reproductive fitness on the agent.7 Natural selection has certainly favored the propensity to feel physical pain, and there is no a priori reason why it could not favor the tendency to experience mental pain. But to establish the function of depression it is not enough to offer a just-so story that accounts for some of the known features of the illness. Crucially, the hypothesis must also explain facts over and above those it was constructed to explain (Chapter 1), and preferably “novel facts” that were unknown until predicted by the hypothesis.

Variation and selection

Up to this point I have assumed the standard biological case of random variation and blind deterministic selection. Selection models can take other forms, however, involving intentional variation, intentional selection, or both (see Figure 11.3).

Figure 11.3

Intentional variation, intentional selection

In The Origin of Species, Darwin wrote that “nature gives successive variations; man adds them up in certain directions useful to him.” But it is not merely a case of “Nature proposes; man disposes,” since, as he also observed, the input can be modified by human action:

A high degree of variability is obviously favourable, as freely giving the materials for selection to work on; not that mere individual differences are not amply sufficient, with extreme care, to allow of the accumulation of a large amount of modification in almost any desired direction. But as variations manifestly useful or pleasing to man appear only occasionally, the chance of their appearance will be much increased by a large number of individuals being kept; and hence this comes to be of the highest importance to success. On this principle Marshall has remarked, with respect to the sheep of parts of Yorkshire, that “as they generally belong to poor people, and are mostly in small lots, they never can be improved.” On the other hand, nurserymen, from raising large stocks of the same plants, are generally far more successful than amateurs in getting new and valuable varieties.

Today, we can add that artificial selection can also be enhanced by inducing mutations. In addition, the maintenance of “genetic libraries” can prevent the reduction of genetic variation that is otherwise the inevitable result of selection for particular traits.

With regard to the selection process itself, Darwin distinguished between two levels of intentionality:

At the present time, eminent breeders try by methodical selection, with a distinct object in view, to make a new strain or sub-breed, superior to anything existing in the country. But, for our purpose, a kind of Selection, which may be called Unconscious, and which results from every one trying to possess and breed from the best individual animals, is more important. Thus, a man who intends keeping pointers naturally tries to get as good dogs as he can, and afterwards breeds from his own best dogs, but he has no wish or expectation of permanently altering the breed.

Non-intentional variation, intentional selection

There are many cases in which a new organism or a new form arises by accident and is then either accepted or rejected on the basis of intentional choice. Whereas natural selection tends to produce a roughly equal number of male and female organisms, gender-biased infanticide and more recently gender-biased abortion can create a serious sex imbalance in the population. In India and China alone, around 80 million women are “missing” for this reason. Eugenic policies have been widely used to prevent the mentally ill and mentally retarded from reproducing. In Nazi Germany, around three hundred thousand to four hundred thousand individuals were forcibly sterilized on these grounds. As prenatal screening techniques improve, selective abortion may become an important determinant of the makeup of human populations. If further advances make it possible to determine the sex of the child at conception, selection will have been replaced by intentional choice.

Random variation combined with intentional selection may also shape the development of artifacts. When the Norwegian minister and sociologist Eilert Sundt visited England in 1862, he learned about Darwin's theory of natural selection (published in 1859) and set about applying a variant of it to boat construction:

A boat constructor may be very skilled, and yet he will never get two boats exactly alike, even if he exerts himself to this end. The variations arising in this way may be called accidental. But even a very small variation usually is noticeable during the navigation, and it is then not accidental that the seamen come to notice that boat that has become improved or more convenient for their purpose, and that they should recommend this to be chosen as the one to imitate … One may believe that each of these boats is perfect in its way, since it has reached perfection by one-sided development in one particular direction. Each kind of improvement has progressed to the point where further developments would entail defects that would more than offset the advantage … And I conceive of the process in the following way: when the idea of new and improved forms had first been aroused, then a long series of prudent experiments, each involving extremely small changes, could lead to the happy result that from the boat constructor's shed there emerged a boat whose like all would desire.

In this text, Sundt improved on Darwin in a crucial respect.8 Whereas Darwin confessed to ignorance about the origin of variation, Sundt hit on the idea of locating its source in errors of replication, similar to typographical errors and to (what we know now to be) mutations in the genetic material. The imperfection of the boat builder – his inability to make perfect copies – is a condition for the perfection of the end result. Sundt carefully notes that the outcome of the process is a local maximum, from which no further improvements can occur by incremental changes. In the very last sentence, he also suggests that the process may turn into artificial selection, when people engage in deliberate experiments rather than letting variations arise by chance. As did Darwin, he suggested that intelligence or intentionality may occur at two levels: first when people notice that one model is more seaworthy than a previous one, and then when they understand that improvements could be accelerated if chance variation were replaced by systematic experiments.9

Intentional variation, non-intentional selection

The working of economic markets has some features in common with natural selection. The analogy has two versions, one relatively close to natural selection and one more remote. They share the premise that given the multiple limitations of human rationality, firms or managers are inefficient in the sense that they are unable to calculate the production and marketing decisions that will maximize their profit. Nevertheless the market mechanism will weed out inefficient firms, so that at any given time mainly efficient firms will be observed. Everything happens “as if” managers were efficient.

In the first and simplest version, all firms are constantly trying to increase their profits by processes of imitation and innovation. Although imitation by itself does not generate new inputs for selection to operate on, imperfect imitation may, as noted, have this result. Innovation is also, by definition, a source of new inputs. When – through sheer luck – innovation or imperfect imitation enables a firm to produce at lower cost, it can undersell its rivals and drive them out of business unless they, too, adopt the more efficient ways. By the mechanisms of bankruptcy, takeover, and imitation, these efficient techniques will spread in the population of firms. If we assume that both imitation and innovation occur predominantly in small steps and that competition takes place in an otherwise constant environment, it will bring about a local maximum of equilibrium profits.

The second version denies that firms are always trying to maximize profits. Instead, they use routines or rules of thumb that are maintained as long as profits are at a “satisfactory” level. In a neologism, they “satisfice” rather than maximize. What this means may depend on many factors, but we may assume for simplicity that a firm whose profits are consistently below the satisfactory level will either go bankrupt or face the threat of a hostile takeover. The simplest routine is to do everything as before as long as profits are at a “satisfactory” level. More complicated routines could include setting prices by a constant markup on costs or investing a certain percentage of profits in new production. The idea of satisficing is reflected in such sayings as “Never change a winning team” or “If it ain't broke, don't fix it.” In one perspective, satisficing could even be optimal. In a phrase I quoted earlier, “The greatest of all monopoly profits is a quiet life.”

Suppose now that profits fall below the satisfactory level. A firm that has been doing the same thing year in year out may be the victim of an organizational analogue of rust or sclerosis. External shocks such as a rise in oil prices or a change in an important exchange rate may increase costs or reduce revenue. Consumer demand may change; rivals may come up with better methods or new products; or workers might impose a costly strike on the firm. Whatever the cause (which may even be unknown to the firm), unsatisfactory profits will induce a search for new routines by some combination of innovation and imitation. Either procedure is likely to be predominantly local, in the sense of being limited to alternatives close to the existing routines. Large changes of any kind may be too costly for a firm that is in financial trouble (Chapter 10), and non-incremental innovations are also conceptually more demanding.

The process of imitation is obviously biased toward the behavior of successful rivals. Whether innovation is random or directed depends on the perceived causes of the crisis that triggered it. If the fall in profits below the acceptable level resulted from a rise in oil prices, the firm may bias its search in the direction of methods that will economize on oil.10 If it resulted from a change in the exchange rate between the dollar and the euro, the firm is more likely to search randomly. In all cases, however, there is a strong intentional component in the firm's behavior. The decision to change the current routines is intentional, as is the decision about how much to invest in innovation or in imitation. The choice of models to imitate is deliberate, and as just noted, the firm may intentionally bias the search for new routines in a particular direction.

The new routines that result from this process are then exposed to the blind forces of market competition. If they enable the firm to attain a satisfactory level of profit, it will switch off the search until a new crisis arises. If they do not, the firm may try again or have to declare bankruptcy. Sooner or later, non-satisficing firms are eliminated. In itself, this process does not tend to produce profit-maximizing firms. To see how that could happen, we need to bring competition more explicitly into the picture. If we assume that as one of their routines firms invest a fixed percentage of profits in new production, those that by sheer luck have hit upon a better routine than their competitors will expand so that over time their routines become more heavily represented in the population of firms.11
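The last step of this argument, differential growth through reinvestment, can be made concrete with a minimal sketch. The two routines, their profit rates, the reinvestment share, and the number of periods below are purely illustrative assumptions, not estimates of anything. Each firm is assumed to plough back a fixed fraction of its profits, so that its capital grows in proportion to the profitability of its routine.

```python
def routine_shares(profit_rates, reinvest=0.5, periods=50):
    """Return each routine's share of total capital after repeated reinvestment.

    profit_rates maps a (hypothetical) routine name to its profit per unit of
    capital; reinvest is the fixed fraction of profits put into new production.
    """
    capital = {name: 1.0 for name in profit_rates}
    for _ in range(periods):
        for name, rate in profit_rates.items():
            capital[name] *= 1 + reinvest * rate
    total = sum(capital.values())
    return {name: amount / total for name, amount in capital.items()}


# Two firms start at equal size; one has stumbled on a better routine.
shares = routine_shares({"old_routine": 0.05, "lucky_routine": 0.10})
print(shares)
```

Neither firm in this sketch tries to maximize anything or to drive the other out; the luckier routine simply comes to account for most of the capital in the population. Note that the environment is held constant throughout, which is exactly the assumption questioned in the next section.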

Selection and as-if rationality

The usefulness of these models depends on a simple empirical question: what is the rate at which inefficient firms are eliminated compared to the rate of change of the environment? In the previous chapter I raised the same question with regard to natural selection and argued that the highly fine-tuned adaptation of organisms to their environment suggests that the latter must have changed relatively slowly. In the case of the economic environment, we can make a more direct assessment. In the modern world, firms are exposed to unprecedented rates of change. If they were reduced to incremental tracking of the environment, firms would be chronically unfit. Successful firms are more likely to be those that are capable of anticipating change, by aiming ahead of the target. This strategy, too, will fail much of the time, but at least not all the time. Moreover, because of their political clout large corporations may also be able to shape the environment in which they operate. In an earlier age of cutthroat capitalism among small firms, selection mechanisms of the kind I have described may or may not have been important – we do not know. Today, they are unlikely to explain much of what we observe.

There is also a more general issue at stake. When attacked for the lack of realism of their assumptions, rational-choice theorists routinely assert that they only claim to explain behavior on the assumption that people act “as if” they maximize utility (or profit, or any other aim). Milton Friedman offered two seductive and influential analogies to persuade his readers of the reality of maximizing behavior that does not rely on maximizing calculations. First, “leaves [on a tree] are positioned as if each leaf deliberately sought to maximize the amount of sunlight it receives, given the position of its neighbors, as if it knew the physical laws determining the amount of sunlight that would be received in various positions and could move rapidly or instantaneously from any one position to any other desired and unoccupied position.” Second, “excellent predictions would be yielded by the hypothesis that the [expert] billiard player made his shots as if he knew the complicated mathematical formulas that would give the optimum directions of travel, could estimate accurately by eye the angles, etc., describing the location of the balls, could make lightning calculations from the formulas, and could then make the balls travel in the direction indicated by the formulas.”

While seductive, the analogies are unpersuasive. The leaves simulate maximization because natural selection eliminated trees that did not. To assume that a similar mechanism exists for economic behavior is to beg the question. Expert billiard players are experts because ten thousand hours of practicing enable them somehow (we do not know how) to make the right shots on an intuitive basis. Chess grandmasters can instantly recognize about 50,000 constellations on the board. These are, of course, tightly constrained situations. To extrapolate the argument to business decisions in a fluid and opaque environment is unwarranted.

The most general way of stating my objection is perhaps that even if it could be shown that market competition does improve efficiency through elimination of inefficient firms, there is a vast step from “improving efficiency” to the ultrasophisticated as-if maximization imputed to firms in economic models.12

In the political sphere, electoral competition is supposed to ensure that the only politicians we observe are those who are elected or reelected; hence one can assume that all politicians act “as if” they are concerned only with their election prospects. The leap from a concern with election to an exclusive concern is not justified, however. A methodologically unprejudiced look at politics suggests that there are three kinds of political actors: opportunists (who care only about getting elected and reelected), reformers (who care about their policies being implemented), and activists or militants (who care more about “making a statement”).13 The view of politics as based on the interaction among these three groups in each party – and among different parties – is clearly more realistic than the “ice cream stall” model of politics (Chapter 18) according to which vote-maximizing parties would all converge to the center. For a striking refutation of the claim that politicians are motivated only by reelection concerns, consider the line of French politicians originating in Jean Jaurès and passing through Léon Blum, Pierre Mendès-France, and Michel Rocard, all of whom were transparently motivated by a desire to promote the impartial values of social justice and economic efficiency. It has to be said, though, that in Rocard's case his distaste for electoral politics did detract from his political efficacy.

Outside the arenas of competition, “as-if” rationality has even less justification. Consumer choices, voting behavior, church attendance, choice of career, and most other behaviors one could name are not subject to selection mechanisms that mimic rationality. They are, to be sure, subject to constraints that can reduce the importance of choice in general and of rational choice in particular (Chapter 10). Constraints operate before the fact, to make certain choices unfeasible. Selection operates after the fact, to eliminate those who have made certain choices. Although both mechanisms contribute to the explanation of behavior, they cannot, jointly or singly, account for all of it. Choice remains the core concept in the social sciences.

Bibliographical note

In “Selection by consequences,” Science 213 (1981), 501–4, B. F. Skinner argued for the importance of three ways in which behavior can be explained by its consequences: by natural selection operating on individuals, by reinforcement, and (although he does not use that term) by group selection. A useful introduction to reinforcement theory is J. E. R. Staddon, Adaptive Behavior and Learning (Cambridge University Press, 1983). A study of how reinforcement theory can be used to shape (rather than explain) behavior is D. Lee and P. Belfiore, “Enhancing classroom performance: a review of reinforcement schedules,” Journal of Behavioral Education 7 (1997), 205–17. A classic exposition of the theory of natural selection, notable for the insistence on the individualistic nature of selection, is G. Williams, Adaptation and Natural Selection (Princeton University Press, 1966). For a discussion of gradient climbing and “the metaphor of fitness landscapes,” see Chapter 2.4 of S. Gavrilets, Fitness Landscapes and the Origin of Species (Princeton University Press, 2004). An exposition emphasizing the gene as the unit of selection is R. Dawkins, The Selfish Gene, 2nd edn (Oxford University Press, 1990). An excellent introduction to animal signaling is W. A. Searcy and S. Nowicki, The Evolution of Animal Communication (Princeton University Press, 2005). For a discussion of how group selection might be made possible by altruistic punishment, see E. Fehr and U. Fischbacher, “Social norms and human cooperation,” Trends in Cognitive Sciences 8 (2004), 185–90. A seminal study of “tit-for-tat” cooperation between unrelated animals is R. Axelrod and W. Hamilton, “The evolution of cooperation,” Science 211 (1981), 1390–6. The data on homicide statistics and child abuse are from M. Daly and M. Wilson, Homicide (New York: Aldine de Gruyter, 1988). For objections to their explanation, see Chapter 7 of D. Buller, Adapting Minds (Cambridge, MA: MIT Press, 2005).
For two sides of the self-deception argument, see R. Trivers, Social Evolution (Menlo Park, CA: Benjamin-Cummings, 1985) (favoring an evolutionary explanation), and V. S. Ramachandran and S. Blakeslee, Phantoms in the Brain (New York: Quill, 1998) (opposing it). For two sides of the adaptive nature of depression, see E. H. Hagen, “The bargaining model of depression,” in P. Hammerstein (ed.), Genetic and Cultural Evolution of Cooperation (Cambridge, MA: MIT Press, 2003) (favoring an evolutionary explanation), and P. Kramer, Against Depression (New York: Viking, 2005) (opposing it). The analysis of markets in terms of natural selection originates in A. Alchian, “Uncertainty, evolution, and economic theory,” Journal of Political Economy 58 (1950), 211–21. Its most sophisticated version (which does not support “as-if” maximization) is R. Nelson and S. Winter, An Evolutionary Theory of Economic Change (Cambridge, MA: Harvard University Press, 1982). The theory of “satisficing” derives from H. Simon, “A behavioral model of rational choice,” Quarterly Journal of Economics 69 (1955), 99–118. The economics of team sports is the subject of D. Berri, M. Schmidt, and S. Brook, The Wages of Wins (Stanford University Press, 2006). The distinction among opportunists, reformers, and activists is taken from J. Roemer, Political Competition (Cambridge, MA: Harvard University Press, 2001).

1 In Chapter 4, I distinguished between consequentialist and non-consequentialist actions. For present purposes, the action itself may be included among its consequences. This is a purely terminological issue.

2 Recall, however, that we are not always very good at noticing which of two experiences was the more painful (Chapter 6).

3 It is also easier to establish when the gambling technology allows for the possibility of near-wins. Although each of the near-wins is less reinforcing than an actual win, there are more of them.

4 Their good luck, in this case, is their bad luck, and the casino's good luck.

5 One might ask, however, whether a population of cooperating punishers might not risk being invaded by cooperating non-punishers, since punishing is costly for the punisher. The cooperating non-punishers might in turn be invaded by non-cooperators, and a new cycle might start. To my knowledge, this conundrum is not fully resolved.

6 The incest taboo may, therefore, address a temptation that exists more rarely than has been thought. Freud, by contrast, thought the incest taboo had arisen to counteract an unconscious desire to have sex with close relatives.

7 A given gene may code for several behaviors (pleiotropy) and be maintained even if one of them by itself is suboptimal. Suboptimal features may also be maintained by a variety of other genetic mechanisms related to the fact that sexually reproducing organisms have two different variants (alleles) of each gene.

8 The improvement was possible, of course, only because he addressed a different problem, since in 1862 nobody had the conceptual wherewithal to imagine that the source of variation in organisms could be random replication mistakes. This leap became possible only after Mendel had shown the discrete nature of the units of inheritance (genes) and Watson and Crick demonstrated that replication was involved in the process of inheritance. I wonder what Darwin would have answered had Sundt asked him whether the source of biological variation might not be imperfection in the reproductive machinery.

9 In this way, one could also prevent the unfortunate situation that would arise if boat builders became so good that they never made mistakes.

10 Unless further increases in the price of oil are to be expected, rationality does not require the firm to look for oil-saving innovations, since no one can know what the set of feasible innovations looks like. Yet the increase in the price of oil will tend to make these innovations more salient.

11 In this version they will not deliberately try to drive their rivals out of the market, for instance, by using the high profits to sell below cost until the others give up, since they have no concern for more-than-satisfactory profits.

12 As a small wrinkle to the argument, the economics of team sports offers a possible objection to the idea that profit maximization is brought about by selection. If profit-maximizing baseball or football teams used their profits to buy up all the best players in their league, their superiority would become so overwhelming that the games would lose much of their uncertainty, and hence of their fun, and hence of their profit-generating ability.

13 The three groups can be more formally distinguished as follows. Opportunists prefer to propose policy A to policy B when the probability of winning at A is greater than the probability of winning at B, given that the opposition party is proposing some fixed C. Activists (militants) prefer to propose A to B when the average party member would derive higher utility at A than at B (independently of what C is). Reformers prefer to propose A to B, given that the opposition is proposing C, when the expected utility of the average party member is higher at A than at B. Thus opportunists are concerned only with probabilities, activists only with utilities, and reformers with both.
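The three decision rules in note 13 can be written out as a short sketch. The expected-utility formula used for reformers is an assumption about how the verbal definition is to be read: the average party member is taken to receive the utility of the party's own policy if it wins and the utility of the opposition's policy C if it loses. All function names, parameter names, and numbers are hypothetical.

```python
def expected_utility(p_win, u_own, u_opposition):
    """Assumed reading: own policy if the party wins, opposition's policy C if not."""
    return p_win * u_own + (1 - p_win) * u_opposition


def prefers_a(kind, p_a, p_b, u_a, u_b, u_c):
    """Does a faction of the given kind prefer proposing A to proposing B?

    p_a, p_b: probability of winning with A or B against a fixed opposition policy C.
    u_a, u_b, u_c: utility of the average party member under A, B, or C.
    """
    if kind == "opportunist":  # probabilities only
        return p_a > p_b
    if kind == "activist":     # utilities only, whatever C is
        return u_a > u_b
    if kind == "reformer":     # probabilities and utilities together
        return expected_utility(p_a, u_a, u_c) > expected_utility(p_b, u_b, u_c)
    raise ValueError(f"unknown kind: {kind}")


# A is the safer platform, B the one the membership likes better.
kinds = ("opportunist", "activist", "reformer")
choices = {k: prefers_a(k, p_a=0.6, p_b=0.4, u_a=4, u_b=10, u_c=0) for k in kinds}
print(choices)
```

With these made-up numbers the opportunists favor A while the activists and reformers favor B; lowering `u_b` to 5 brings the reformers over to A while leaving the other two groups unchanged, which illustrates why reformers are said to care about both probabilities and utilities.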
