2
Understanding consists in imagining the fact.
—LUDWIG WITTGENSTEIN, THE BIG TYPESCRIPT
Cognitive processes are a product of natural selection, but they are not its target. Indeed, natural selection cannot even “see” cognition; it can only “see” the effects of cognition in organizing and regulating overt actions (Piaget, 1971). In evolution, being smart counts for nothing if it does not lead to acting smart.
The two classic theories of animal behavior, behaviorism and ethology, both focused on overt actions, but they somehow forgot the cognition. Classical ethology had little or no interest in animal cognition, and classical behaviorism was downright hostile to the idea. Although contemporary instantiations of ethology and behaviorism take some account of cognitive processes, they provide no systematic theoretical accounts. Nor are any other modern approaches to the evolution of cognition sufficient for current purposes.
And so to begin this account of the evolutionary emergence of uniquely human thinking, we must first formulate, in broad outline, a theory of the evolution of cognition more generally. We may then begin our natural history proper by using this theoretical framework to characterize processes of cognition and thinking in modern-day great apes, as representative of humans’ evolutionary starting point before they separated from other primates some six million years ago.
Evolution of Cognition
All organisms possess some reflexive reactions that are organized linearly as stimulus-response linkages. Behaviorists think that all behavior is organized in this way, though in complex organisms the linkages may be learned and become associated with others in various ways. The alternative is to recognize that complex organisms also possess some adaptive specializations that are organized circularly, as feedback control systems, with built-in goal states and action possibilities. Starting from this foundation, cognition evolves not from a complexifying of stimulus-response linkages but, rather, from the individual organism gaining (1) powers of flexible decision-making and behavioral control in its various adaptive specializations, and (2) capacities for cognitively representing and making inferences from the causal and intentional relations structuring relevant events.
Adaptive specializations are organized as self-regulating systems, as are many physiological processes such as the homeostatic regulation of blood sugar and body temperature in mammals. These specializations go beyond reflexes in their capacity to produce adaptive behavior in a much wider range of circumstances, and indeed, they may be quite complex, for example, spiders spinning webs. There is no way that a spider can spin a web using only stimulus-response linkages. The process is too dynamic and dependent on local context. Instead, the spider must have goal states that it is motivated to bring about, and the ability to perceive and act so as to bring them about in a self-regulated manner. But adaptive specializations are still not cognitive (or only weakly cognitive) because they are unknowing and inflexible by definition: perceived situations and behavioral possibilities for goal attainment are mostly connected in an inflexible manner. The individual organism does not have the kind of causal or intentional understanding of the situation that would enable it to deal flexibly with “novel” situations. Natural selection has designed these adaptive specializations to work invariantly in “the same” situations as those encountered in the past, and so cleverness from the individual is not needed.
Cognition and thinking enter the picture when organisms live in less predictable worlds and natural selection crafts cognitive and decision-making processes that empower the individual to recognize novel situations and to deal flexibly, on its own, with unpredictable exigencies. What enables effective handling of a novel situation is some understanding of the causal and/or intentional relations involved, which then suggests an appropriate and potentially novel behavioral response. For example, a chimpanzee might recognize that the only tool available to her in a given situation demands, based on the physical causality involved, manipulations she has never before performed toward this goal. A cognitively competent organism, then, operates as a control system with reference values or goals, capacities for attending to situations causally or intentionally “relevant” to these reference values or goals, and capacities for choosing actions that lead to the fulfillment of these reference values or goals (given the causal and/or intentional structure of the situation). This description in control system terms is basically identical to the classic belief-desire model of rational action in philosophy: a goal or desire coupled with an epistemic connection to the world (e.g., a belief based on an understanding of the causal or intentional structure of the situation) creates an intention to act in a particular way.1
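The control-system description above can be made concrete with a minimal sketch. It is an illustration only, not a model proposed in the text: the function name `choose_action`, the numeric states, and the three example actions are all assumptions introduced here. The agent holds a reference value (goal), attends to a perceived situation, and chooses the action whose predicted outcome best fulfills the goal.

```python
def choose_action(goal, perceived, actions):
    """Pick the action whose predicted outcome lies closest to the goal.

    goal:      the reference value (the desired situation), here a number
    perceived: the currently attended-to situation, in the same format
    actions:   hypothetical mapping from action name to a function that
               predicts the situation that action would produce
    """
    return min(actions, key=lambda name: abs(actions[name](perceived) - goal))


# Illustrative action repertoire (assumed values, not data from the text).
actions = {
    "warm_up": lambda temp: temp + 2.0,
    "cool_down": lambda temp: temp - 2.0,
    "stay_put": lambda temp: temp,
}

decision = choose_action(goal=37.0, perceived=35.0, actions=actions)
```

Because the goal and the perceived situation share one format, they can be compared directly, which is the point the belief-desire parallel relies on.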
We will refer to this flexible, individually self-regulated, cognitive way of doing things as individual intentionality. Within this self-regulation model of individual intentionality, we may then say that thinking occurs when an organism attempts, on some particular occasion, to solve a problem, and so to meet its goal not by behaving overtly but, rather, by imagining what would happen if it tried different actions in a situation—or if different external forces entered the situation—before actually acting. This imagining is nothing more or less than the “off-line” simulation of potential perceptual experiences. To be able to think before acting in this way, then, the organism must possess the three prerequisites outlined above: (1) the ability to cognitively represent experiences to oneself “off-line,” (2) the ability to simulate or make inferences transforming these representations causally, intentionally, and/or logically, and (3) the ability to self-monitor and evaluate how these simulated experiences might lead to specific behavioral outcomes—and so to make a thoughtful behavioral decision. The success or failure of a particular behavioral decision exposes the underlying processes of representation, simulation, and self-monitoring—indirectly, as it were—to the unrelenting sieve of natural selection.
Cognitive Representation
Cognitive representation in a self-regulating, intentional system may be characterized both in terms of its content and in terms of its format. In terms of content, the claim here is that both the organism’s internal goals and its externally directed attention (NB: not just perception but attention) have as content not punctate stimuli or sense data, but rather whole situations. Goals, values, and other reference values (pro-attitudes) are cognitive representations of situations that the organism is motivated to bring about or maintain. Although we sometimes speak of an object or location as someone’s goal, this is really only a shorthand way of speaking; the goal is the situation of having the object or reaching the location. The philosopher Davidson (2001) writes, “Wants and desires are directed to propositional contents. What one wants is … that one has the apple in hand.… Similarly … someone who intends to go to the opera intends to make it the case that he is at the opera” (p. 126). In this same manner, modern decision theory often speaks of the desire or preference that a particular state of affairs be realized.
If goals and values are represented as desired situations, then what the organism must attend to in its perceived environment is situations relevant to those goals and values. Desired situations and attended-to environmental situations are thus perforce in the same perceptually based, fact-like representational format, which enables their cognitive comparison. Of course, complex organisms also perceive less complex things, such as objects, properties, and events—and can attend to them for specific purposes—but in the current analysis they always do so as components of situations relevant to behavioral decision making.
To illustrate the point, let us suppose that the image in Figure 2.1 is what a chimpanzee sees as she approaches a tree while foraging.
FIGURE 2.1 What a chimpanzee sees
The chimpanzee perceives the scene in the same basic way that we would; our visual systems are similar enough that we see the same basic objects and their spatial relationships. But what situations does the chimpanzee attend to? Although she could potentially focus her attention on any of the potentially infinite situations that this image presents, at the current moment she must make a foraging decision, and so she attends to the situations or “facts” relevant to this behavioral decision, to wit (as described in English):
• that many bananas are in the tree
• that the bananas are ripe
• that no competitor chimpanzees are already in the tree
• that the bananas are reachable by climbing
• that no predators are nearby
• that escaping quickly from this tree will be difficult
• etc., etc.
For a foraging chimpanzee with the goal of obtaining food, given all of its perceptual and behavioral capacities and its knowledge of the local ecology, all of these are relevant situations for deciding what to do—all present in a single visual image and, of course, nonverbally. (NB: Even the absence of something expected, such as food not in its usual location, may be a relevant situation.)
Relevance is one of those occasion-sensitive judgments that cannot be given a general definition. But in broad strokes, organisms attend to situations as either (1) opportunities or (2) obstacles to the pursuit and maintenance of their goals and values (or as information relevant to predicting possible future opportunities or obstacles). Different species have different ways of life, of course, which means that they perceive or attend to different situations (and components of situations). Thus, for a leopard, the situation of bananas in a tree would not represent an opportunity to eat, but the presence of a chimpanzee would. For the chimpanzee, in contrast, the leopard’s presence now presents an obstacle to its value of avoiding predators, and so it should look for a situation providing opportunities for escape, such as a tree to climb without low-hanging limbs—given its knowledge that leopards cannot climb such trees and its familiarity with its own tree-climbing prowess. If we now throw into the mix a worm resting on the banana’s surface, the relevant situations for the three different species—the obstacles and opportunities for their respective goals—would overlap even less, if at all. Relevant situations are thus determined jointly by the organism’s goals and values, its perceptual abilities and knowledge, and its behavioral capacities, that is to say, by its overall functioning as a self-regulating system. Identifying situations relevant for a behavioral decision thus involves an organism’s whole way of life (von Uexküll, 1921).2
In terms of representational format, the key is that to make creative inferences that go beyond particular experiences, the organism must represent its experiences as types, that is to say, in some generalized, schematized, or abstract form. One plausible hypothesis is a kind of exemplar model in which the individual in some sense “saves” the particular situations and components to which it has attended (many models of knowledge representation have attention as the gateway). There is then generalization or abstraction across these in a process that we might call schematization. (Langacker’s [1987] metaphor is of a stack of transparencies, each depicting a single situation or entity, and schematization is the process of looking down through them for overlap.) We might think of the result of this process of schematization as cognitive models of various types of situations and entities, for example, categories of objects, schemas of events, and models of situations. Recognizing a situation or entity as a token of a known type—as an exemplar of a cognitive category, schema, or model—enables novel inferences about the token appropriate to the type.
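One rough way to picture schematization is as the overlap among saved exemplars, in the spirit of the stack-of-transparencies metaphor. The sketch below assumes, purely for illustration, that each attended-to exemplar can be written as a set of feature labels; the feature names and both function names are hypothetical. Recognizing a token as an instance of the type then licenses attributing to it the schema's remaining, not yet perceived, properties.

```python
from functools import reduce


def schematize(exemplars):
    """Return the features common to every saved exemplar (the overlap).

    Expects at least one exemplar; each exemplar is an iterable of labels.
    """
    return reduce(lambda shared, nxt: shared & nxt, (set(e) for e in exemplars))


def infer_from_type(perceived, schema, cues):
    """If the token shows the identifying cues, attribute the rest of the
    schema to it; otherwise infer nothing."""
    if cues <= perceived:
        return schema - perceived
    return set()


# Hypothetical remembered encounters with figs.
figs = [
    {"round", "sweet", "seed_inside", "purple"},
    {"round", "sweet", "seed_inside", "green"},
    {"round", "sweet", "seed_inside", "purple", "bruised"},
]
fig_schema = schematize(figs)

# A new fig is seen but not yet tasted or opened.
expected = infer_from_type({"round", "purple"}, fig_schema, cues={"round"})
```

Real schematization is presumably graded and imagistic rather than a strict set intersection; the intersection is used here only because it is the simplest operation that captures "looking down through the stack for overlap."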
Categories, schemas, and models as cognitive types are nothing more or less than imagistic or iconic schematizations of the organism’s (or, in some cases, its species’) previous experience (Barsalou, 1999, 2008). As such, they do not suffer from the indeterminacy of interpretation that some theorists attribute to iconic representations considered as mental pictures, that is, the indeterminacy of whether this image is of a banana, a fruit, an object, and so forth (Crane, 2003). They do not because they are composed of individual experiences in which the organism was attending to a relevant (already “interpreted”) situation. Thus, the organism “interprets,” or understands, particular situations and entities in the context of its goals as it assimilates them to known (cognitively represented) types: “This is another one of those.”
Simulation and Inference
Thinking in an organism with individual intentionality involves simulations or inferences that connect cognitive representations of situations and their components in various ways. First are those instrumental inferences that occur in behavioral decision making of the “what would happen if …” variety. For example, in a concrete problem-solving situation—such as a rock preventing the movement of a stick beneath it—some organisms might go through the kind of inferential simulation that Piaget (1952) called “mental trial and error”: the organism imagines a potential action and its consequences. Thus, a chimpanzee might simulate imaginatively what would happen if she forcefully tugged at the stick, without actually doing it. If she judged that this would be futile given the size and weight of the rock, she might decide to push the rock aside before pulling the stick.
Also possible are inferences about causal and intentional relations created by outside forces and how these might affect the attainment of goals and values. For example, a chimpanzee might see a monkey feeding in the banana tree and infer that there are no leopards nearby (because if there were, the monkey would be fleeing). Or, upon finding a fig on the ground, a bonobo might infer that it will have a sweet taste and that there is a seed inside—based on a categorization of the encountered fig as “another one of those” and the natural inference that this one will have the same properties as others in the category. Or an orangutan might recognize a conspecific climbing a tree as an intentional event of a particular type, and then infer something about goals and attention as intentional causes and so predict the climber’s impending actions. Schematizing over such experiences (again aided, perhaps, by the experience of the species), individuals may potentially build cognitive models of general patterns of causality and intentionality.
The best way to conceptualize such processes is in terms of off-line, image-based simulations, including novel combinations of represented events and entities that the individual herself has never before directly experienced as such, for example, an ape imagining what the monkey would do if a leopard entered the scene (see Barsalou, 1999, 2008, for relevant data on humans; Barsalou, 2005, extends the analysis to nonhuman primates). Importantly, the combinatorial processes themselves will include causal and intentional relations that connect different real and imagined situations, and also “logical” operations such as the conditional, “negation,” exclusion, and the like. These logical operations are not themselves imagistic cognitive representations but, rather, cognitive procedures (enactive, in Bruner’s terms, or operative, in Piaget’s terms) that the organism accesses only through actual use. Concrete examples of how this works will be given when we look more closely at great ape thinking in the section that follows.
Behavioral Self-Monitoring
To think effectively, an organism with individual intentionality must be able to observe the outcome of its actions in a given situation and evaluate whether they match the desired goal state or outcome. Engaging in some such processes of behavioral self-monitoring and evaluation is what enables learning from experience over time.
A cognitive version of such self-monitoring enables the agent, as noted above, to inferentially simulate a potential action-outcome sequence ahead of time and observe it—as if it were an actual action-outcome sequence—and then evaluate the imagined outcome. This process creates more thoughtful decision making through the precorrection of errors. (Dennett [1995] calls it Popperian learning because failure means that my hypothesis “dies,” not me.) For example, consider a squirrel on one tree branch gearing up to jump to another. One can see the muscles preparing, but in some cases the squirrel decides the leap is too far and so, after feigning some jumps, climbs down the trunk and then back up the other branch. The most straightforward description of this event is that the squirrel is observing and evaluating a simulation of what it would experience if it leaped; for example, it would experience missing the branch and falling—a decidedly negative outcome. The squirrel must then use that simulation to make a decision about whether to actually leap. Okrent (2007) holds that imagining the possible outcomes of different behavioral choices ahead of time, and then evaluating and deciding for the one with the best imagined outcome, is the essence of instrumental rationality.
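The squirrel example can be sketched as a decision made entirely over imagined outcomes. Everything specific below is an assumption for illustration: the option names, the leap distances, and the outcome values are invented, and `choose_by_simulation` is a hypothetical helper. What the sketch preserves is the structure of the account: simulate each option, evaluate the imagined outcome, and act only on the best one.

```python
def choose_by_simulation(options, simulate, evaluate):
    """Return the option whose imagined outcome is evaluated best.

    No option is actually executed; `simulate` only predicts an outcome.
    """
    return max(options, key=lambda option: evaluate(simulate(option)))


def make_simulator(gap_m, max_leap_m):
    """Build a simulator for a squirrel facing a gap of `gap_m` meters."""
    def simulate(option):
        if option == "leap":
            return "land" if gap_m <= max_leap_m else "fall"
        return "arrive_late"  # climbing down and back up always works, slowly
    return simulate


# Assumed evaluations of each imagined outcome.
outcome_value = {"land": 5, "arrive_late": 2, "fall": -10}
options = ["leap", "climb_down_and_up"]

far = choose_by_simulation(options, make_simulator(3.0, 2.0), outcome_value.get)
near = choose_by_simulation(options, make_simulator(1.5, 2.0), outcome_value.get)
```

The failed hypothesis (leaping across the wide gap) is discarded inside the simulation, which is the sense in which the error is corrected before it is made.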
This kind of self-monitoring, requiring what some call executive functioning, is cognitive because the individual, in some sense, observes not just its actions and their results in the environment but also its own internal simulations. It is also possible for the organism to assess things like the information it has available for making a decision in order to predict the likelihood that it will make a successful choice (before it actually chooses). Humans even use the imagined evaluations of other persons—or the imagined comprehension of others in the case of communication—to evaluate potential behavioral decisions. Whatever its specific form, internal self-monitoring of some kind is critical to anything we would want to call thinking, as it constitutes, in some sense, the individual knowing what it is doing.
Thinking like an Ape
We begin our natural history of the evolutionary emergence of uniquely human thinking with a focus on the last common ancestor of humans and other extant primates. Our best living models for this creature are humans’ closest primate relatives, the nonhuman great apes (hereafter, great apes), comprising chimpanzees, bonobos, gorillas, and orangutans—especially chimpanzees and bonobos, who diverged from humans most recently, around 6 million years ago. When cognitive abilities are similar among the four species of great ape but different in humans, we presume that the apes have conserved their skills from the last common ancestor (or before) whereas humans have evolved something new.
Our characterizations of the cognitive skills of this last common ancestor will derive from empirical research with great apes, cast in the theoretical framework of individual intentionality just elaborated: behavioral self-regulation involving cognitive models and instrumental inferences, along with some form of behavioral self-monitoring. Because humans share with other apes such a recent evolutionary history (along with the same basic bodies, sense organs, emotions, and brain organization), in the absence of evidence to the contrary our default assumption will be evolutionary continuity (de Waal, 1999). That is to say, when great apes behave identically to humans, especially in carefully controlled experiments, we will assume continuity in the underlying cognitive processes involved. The onus of explanation is thus on those who posit evolutionary discontinuities, a challenge we embrace in later chapters.
Great Apes Think about the Physical World
Processes of great ape cognition and thinking may be usefully divided into those concerning the physical world, structured by an understanding of physical causality, and those concerning the social world, structured by an understanding of agentive causality, or intentionality. Primate cognition of the physical world evolved mainly in the context of foraging for food (see Tomasello and Call, 1997, for this theoretical claim and supporting evidence); this is thus its “proper function” (in Millikan’s [1987] sense). In order to procure their daily sustenance, primates (as mammals in general) evolved the proximate goals, representations, and inferences for (1) finding food (requiring skills of spatial navigation and object tracking), (2) recognizing and categorizing food (requiring skills of feature recognition and categorization), (3) quantifying food (requiring skills of quantification), and (4) procuring or extracting food (requiring skills of causal understanding). In these most basic skills of physical cognition, all nonhuman primates would seem to be generally similar (Tomasello and Call, 1997; Schmitt et al., 2012).
What great apes are especially skillful at, compared with other primates, is tool use—which one might characterize as not just understanding causes but actually manipulating them. Other primates are mostly not skilled tool users at all, and when they are it is typically in only one fairly narrow context (e.g., Fragaszy et al., 2004). In contrast, all four species of great ape are highly skilled at using a variety of tools quite flexibly, including using two tools in succession in a task, using one tool to rake in another (which is then needed to procure food), and so forth (Herrmann et al., 2008). Classically, tool use is thought to require the individual to assess the causal effect of its tool manipulations on the goal object or event (Piaget, 1952), and so the flexibility and alacrity with which great apes succeed in using novel tools suggest that they have one or more general cognitive models of causality guiding their use of these novel tools.
Great apes’ skills with manipulating causal relations via tools may be combined in interesting ways with processes of cognitive representation and inference. For example, Marín Manrique et al. (2010) presented chimpanzees with a food extraction problem that they had never before seen. Its solution required a tool with particular properties (e.g., it had to be rigid and of a certain length). The trick was that the potential tools they could use were in a different room, out of sight of the problem. To solve this task, individuals had to first comprehend the causal structure of the novel problem, and then keep that structure cognitively represented while approaching and choosing a tool in the other room. Many individuals did this, often from the first trial onward, suggesting that they assimilated the novel problem to a known cognitive model having a certain causal structure, which they then kept with them as they entered the adjoining room. They then simulated the use of at least some of the available tools and the likely outcome in each case through the medium of this cognitive model—before actually choosing a tool overtly. In the study of Mulcahy and Call (2006), bonobos even saved a tool for future use, presumably imagining the future situation in which they would need it.
The simulations or inferences involved here have logical structure. This is not the structure of formal logic but, rather, a structure based on causal inferences. The idea is that causal inferences have a basic if-then logic and so lead to “necessary” conclusions: if A happens, then B happens (because A caused B). Bermudez (2003) calls inferences of this type protoconditional because the necessity is not a formal one but a causal one. In the experiment of Marín Manrique et al. (2010), as an ape simulates using the different tools, she infers “if a tool with property A is used, then B must happen.” One thus gets a kind of proto-modus ponens by then actually using the tool with property A in the expectation that B will indeed happen as a causal result (if A happens, then B happens; A happens; therefore B will happen). This is basically a forward-facing inference, from premise or cause to conclusion or effect.
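The forward-facing inference can be sketched as a simulated test of each available tool against the remembered causal structure of the problem. The `Tool` type, the tool list, and the length threshold are hypothetical; only the two required properties (rigidity and sufficient length) come from the description of the task.

```python
from typing import NamedTuple, Optional, Sequence


class Tool(NamedTuple):
    name: str
    rigid: bool
    length_cm: float


def predicts_success(tool: Tool, min_length_cm: float) -> bool:
    # Protoconditional: IF a rigid tool of sufficient length is used,
    # THEN the food comes out.
    return tool.rigid and tool.length_cm >= min_length_cm


def choose_tool(tools: Sequence[Tool], min_length_cm: float) -> Optional[Tool]:
    """Simulate each tool against the remembered problem and return the
    first one predicted to work, or None if none is."""
    for tool in tools:
        if predicts_success(tool, min_length_cm):
            return tool
    return None


# Invented tools lying in the other room.
available = [
    Tool("rope", rigid=False, length_cm=40.0),
    Tool("short stick", rigid=True, length_cm=10.0),
    Tool("long stick", rigid=True, length_cm=35.0),
]
chosen = choose_tool(available, min_length_cm=30.0)
```

Actually using the chosen tool, in the expectation that the food will come out, completes the proto-modus ponens: the antecedent is made true and the consequent is expected to follow.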
In another set of recent experiments, we can see backward-facing inferences, that is, from effect to cause. Call (2004) showed chimpanzees a piece of food, which was then hidden in one of two cups (they did not know which). Then, depending on condition, the experimenter shook one of the cups. The relevant background knowledge for success in this experiment is as follows: (1) the food is in one of the two cups (learned in pretraining), and (2) shaking the cup with food will result in noise, whereas shaking the cup without food will result in silence (causal knowledge brought to the experiment). The two conditions are shown in Figure 2.2, using iconic representations to depict something of the way the apes understand the situation.
(The iconic diagrams modeling great ape cognitive representations in Figure 2.2 are not uninterpreted pictures but symbols in a theoretical metalanguage that mean what we agree that they mean. So they are meant to depict the ape’s interpreted experience when she has seen the cup as a cup and the noise as coming from the cup, and so on. Importantly, these diagrams are created within the confines of a restrictive theory of the possibilities of great ape cognition. Following Tomasello’s (1992) depictions for one-year-old human children, we make the diagrams out of concrete spatial-temporal-causal elements that may be posited to be a part of the apes’ cognitive abilities based on empirical research. Then, the logical structure—based on the protoconditional and protonegation—is posited to be necessary to explain apes’ actions in specific experimental situations. The logical operations are depicted in English words, since the ape does not have perception-based representations of them, but only procedural competence with them.)
In condition 1, an experimenter shook the cup with food. In this case the chimpanzee observed a noise being made and had to infer backward in the causal chain to what might have caused it, specifically, the food hitting the inside of the cup. This is a kind of abduction (not logically valid, but an “inference to the best explanation”). That is, (1) the shaking cup is making noise; (2) if the food were inside the shaking cup, then it would make noise; (3) therefore, the food is inside the cup. In condition 2, the experimenter shook the empty cup. In this case the chimpanzee observed only silence and had to infer backward in the causal chain to why that might be, specifically, that there was no food in the cup. This is a kind of proto-modus tollens: (1) the shaking cup is silent; (2) if the food were inside the shaking cup, then it would make noise; (3) therefore, the food must not be in the cup (the shaken cup must be empty). The chimpanzees made this inference, but they also made an additional one. They combined their understanding of the causality of noise making in this context with their preexisting knowledge that the food was in one of the two cups to locate the food in the other, nonshaken cup (if the food is not in this one, then it must be in that one; see bottom row in Figure 2.2). This inferential paradigm thus involves the kind of exclusion inference characteristic of a disjunctive syllogism.
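The two conditions can be summarized in a short sketch that makes each inferential step explicit. The function name and the cup labels are assumptions for illustration; the background knowledge encoded (food is in exactly one of two cups, and a shaken cup makes noise only if it holds food) is the knowledge the account attributes to the apes.

```python
def locate_food(shaken_cup, heard_noise, cups=("left", "right")):
    """Infer which of two cups holds the food after one cup is shaken."""
    other_cup = next(cup for cup in cups if cup != shaken_cup)
    if heard_noise:
        # Condition 1, abduction: food inside the shaken cup is the best
        # explanation of the noise.
        return shaken_cup
    # Condition 2, proto-modus tollens: food inside would make noise;
    # there is silence; so the food is not in the shaken cup.
    # Disjunctive syllogism: the food is in one of the two cups and not
    # in this one, so it is in the other.
    return other_cup


condition_1 = locate_food("left", heard_noise=True)
condition_2 = locate_food("left", heard_noise=False)
```

Note that "silence" is treated here as the polar opposite of "noise" rather than as a formal negation, which matches the protonegation reading of the task.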
FIGURE 2.2 Ape inferences in finding hidden food (Call, 2004)
Negation is a very complex cognitive operation, and one could easily object to the use of negation in these proposed accounts of great ape logical inferences. But Bermudez (2003) makes a novel theoretical proposal about some possible evolutionary precursors to formal negation that make these accounts much more plausible. The proposal is to think of a kind of protonegation as simply comprising exclusionary opposites on a scale (contraries), such as presence-absence, noise-silence, safety-danger, success-failure, and available-not available. If we assume that great apes understand polar opposites such as these as indeed mutually exclusive—for example, if something is absent, it cannot be present, or if it makes noise it cannot be silent—then this could be a much simpler basis for the negation operation. All of the current descriptions assume protonegation of this type.
When taken together, the conditional (if-then) and negation operations structure all of the most basic paradigms of human logical reasoning. The claim is thus that great apes can solve complex and novel physical problems by assimilating key aspects of the problem situation to already known cognitive models with causal structure and then use those models to simulate or make inferences about what has happened previously or what might happen next—employing both a kind of protoconditional and a kind of protonegation in both forward-facing and backward-facing paradigms. Our general conclusion is thus that since the great apes in these studies are using cognitive models containing general principles of causality, and they are also simulating or making inferences in various kinds of protological paradigms, with various kinds of self-monitoring along the way, what the great apes are doing in these studies is thinking.
Great Apes Think about the Social World
Primate cognition of the social world evolved mainly in the context of competition within the social group for food, mates, and other valued resources (see Tomasello and Call, 1997); competitive social interactions are thus its “proper function.” In order to outcompete groupmates, individual primates evolved the proximate goals, representations, and inferences for (1) recognizing individuals in their social group and forming dominance and affiliative relationships with them and (2) recognizing third parties’ social relationships with one another, such as parent or dominant or friend, and taking these into account. These abilities enable individuals to better predict the behavior of others in a complex “social field” (Kummer, 1972). Despite important species differences of social structure and interaction, in these most basic of skills of social cognition all primates would appear to be generally similar (e.g., see Tomasello and Call, 1997; see also the chapters in Mitani et al., 2012).
Beyond recognizing social relationships based on observed social interactions, great apes also understand that other individuals have goal-situations that they are pursuing and perceived-situations in the environment that they are attending to—so that together the individual’s goals and perceptions (and her assessment of any relevant obstacles and opportunities for goal achievement in the environment) determine her behavior. This means that nonhuman great apes not only are intentional agents themselves but also understand others as intentional agents (i.e., as possessing individual intentionality; Call and Tomasello, 2008).
Consider the following experiment. Hare et al. (2000) had a dominant and subordinate chimpanzee compete for food in a novel situation in which one piece of food was out in the open and one piece of food was on the subordinate’s side of a barrier where only she could see it. In this situation the subordinate knew that the dominant could see the piece out in the open and so he would go for it as soon as he could, whereas he could not see the other piece (i.e., he saw the barrier only) and so would not go for that piece (i.e., he would stay on his side). When her door was opened (slightly before the dominant’s), the subordinate chose to pursue the food on her side of the barrier; she knew what the dominant could and could not see. In an important variation, subordinate chimpanzees avoided going for food that a dominant could not see now but had seen hidden in one of the locations some moments before; they knew that he knew where the hidden food was located (Hare et al., 2001; Kaminski et al., 2008). In still another variation, in a back-and-forth foraging game, chimpanzees knew that if their competitor chose first, he would choose a board that was lying slanted on the table (as if something were underneath) rather than a flat board (under which there could be nothing); they knew what kind of inference he would make in the situation (Schmelz et al., 2011). Chimpanzees thus know that others see things, know things, and make inferences about things.
But beyond exploiting their understanding of what others do and do not experience and how this affects their behavior, great apes sometimes even attempt to manipulate what others experience. In a series of experiments, Hare et al. (2006) and Melis et al. (2006a) had chimpanzees compete with a human (sitting in a booth-like apparatus) for two pieces of food. In some conditions, the human could see the ape equally well if it approached either piece of food; in these cases, the apes had no preference for either piece. But in the key condition, a barrier was in place so that the apes could approach one piece of food without being seen—which is exactly what they did. They even did this when they themselves could not see the human in either case. (They had to choose to reach for food from behind a barrier in both cases, but through a clear tunnel in one case and an opaque one in the other.) Perhaps most impressive, the same individuals also preferentially chose to pursue food that they could approach silently—so that the distracted human competitor did not know they were doing so—as opposed to food that required them to make noise en route. This generalization to a completely different perceptual modality speaks to the power and flexibility of the cognitive models and inferences involved.
Importantly analogous to the domain of physical cognition, the chimpanzees in these studies not only made productive inferences based on a general understanding of intentionality but also connected their inferences into paradigms to both predict and even manipulate what others would do (see Figure 2.3). The background knowledge required in all of these food competition experiments is that a competitor will go for a piece of food if and only if (1) he has the goal of having it and (2) he perceives its location (e.g., at location A). The protoconditional inferences in the Hare et al. (2000) experiment follow straightforwardly from this: if the dominant wants the banana and sees it at location A, then he will go to location A. Also analogous to the domain of physical cognition, in these food competition experiments chimpanzees make use of protonegation. Thus, conceptualizing protonegation in terms of polar opposites, in the Hare et al. experiments chimpanzees know that if the competitor sees only the barrier, then he will stay in place (i.e., if he does not see the food he will not go for it; see Figure 2.3, condition C). In the Melis et al. (2006a) concealment experiments, chimpanzees also understand that if the human sees only the barrier, or hears only silence, she will remain sitting peacefully (i.e., if she does not see or hear my approach, she will not reach for the food), and so they approach such that the other sees only barrier or hears only silence.3
And so, just as in the domain of physical cognition, in the domain of social cognition what great apes are especially good at is manipulation. This special facility for social manipulation also comes out clearly in their gestural communication. (Their vocal communication is mostly hardwired, and similar to that of monkeys, so not of great interest for the question of thinking.) All four species of great apes communicate with others gesturally in ways that nonape primates mostly do not. They ritualize from their social interactions with others certain intention-movements—such as raising the arm to begin play-hitting as an instigation to play—that they then use flexibly to manipulate the behavior of others. Perhaps even more important, they also use a number of attention-getting gestures—for example, slapping the ground to get others to look at them—in order to manipulate the attention of others. And they even adopt with humans something like a ritualized reaching or pointing gesture, one that is not in their natural repertoire with conspecifics, demonstrating especially clearly the flexibility of great apes’ skills for manipulating the behavior and attention of others (see Call and Tomasello, 2007, for a review). Great apes’ gestural communication thus shows again their special skills at manipulating causes.
A final experiment demonstrates something like backward-facing inferences in the social domain. Buttelmann et al. (2007) tested six human-raised chimpanzees in the so-called rational imitation paradigm of Gergely et al. (2002). Individuals saw a human perform an unusual action on an apparatus to produce an interesting result. The idea was that in one condition the physical constraints of the situation forced the human to use an unusual action; for example, he had to turn on a light with his head because his hands were occupied holding a blanket, or he had to activate a music box with his foot because his hands were occupied with a stack of books. When given their turn with the apparatus and no constraints in effect, the chimpanzees discounted the unusual action and used their hands as they normally would. However, when they saw the human use the unusual action when there was no physical constraint dictating this—he just turned on the light with his head for no discernable reason—they quite often copied the unusual behavior themselves. The most natural interpretation of this differentiated pattern of response would be that the apes employed a kind of proto-modus tollens, from effect to cause with protonegation, similar to that in the Call (2004) shaking cups study: (1) he is not using his hands; (2) if he had a free choice, he would be using his hands; (3) therefore he must not have a free choice (in one case for obvious reasons; in the other not).
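The three-step inference just described can be written out schematically. This is only an illustrative rendering: the letters F ("he has a free choice") and H ("he uses his hands") are labels assumed here for convenience, not notation from the studies cited, and nothing is implied about the format in which apes represent the inference.

```latex
% F: the demonstrator has a free choice (assumed label)
% H: the demonstrator uses his hands (assumed label)
\[
\frac{F \rightarrow H \qquad \neg H}{\therefore\ \neg F}
\]
```

Read from top to bottom: if he had a free choice he would use his hands; he is not using his hands; therefore he does not have a free choice.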
FIGURE 2.3 Ape inferences in competing for food (Hare et al., 2000)
These studies demonstrate that great apes can solve complex social problems, just as they solve complex physical problems, by assimilating key aspects of the problem situation to a cognitive model—which in this case embodies a general understanding of intentionality—and then using that model to simulate or make inferences about what has happened or what might happen next. Great apes employ both a kind of protoconditional and a kind of protonegation—in both forward-facing and backward-facing modes—in the context of protological paradigms of social inferring. Our conclusion is thus that in the social domain, as well as the physical domain, what the great apes in these studies are doing is thinking.
Cognitive Self-Monitoring
Great apes in these studies are clearly not just automatically flipping through behavioral alternatives and reacting to a goal match; they monitor, and so in some sense know, what they are doing in order to make more effective decisions. On the level of action (recall the hesitant squirrel), recent studies of great apes have shown that they can (1) delay taking a smaller reward so as to get a larger reward later, (2) inhibit a previously successful response in favor of a new one demanded by a changed situation, (3) make themselves do something unpleasant for a desired reward at the end, (4) persist through failures, and (5) concentrate through distractions. Specifically, in a comprehensive comparative study, chimpanzees’ ability to do these things was roughly comparable to that of three-year-old human children (Herrmann et al., submitted). These are all skills referred to variously as impulse control, attentional control, emotion regulation, and executive function—though we prefer to use the terms behavioral self-monitoring, for more action-based self-regulation, and cognitive self-monitoring (and in some cases, self-reflection), for more cognitive versions of the process.
Evidence that apes go beyond just behavioral self-monitoring and engage in cognitive self-monitoring comes from several experimental paradigms used with nonhuman primates (sometimes referred to as studies of metacognition). In the best-known paradigm, employed mostly with rhesus monkeys, individuals must make a discrimination (or remember something) to get a highly desirable reward. But if they fail to make the discrimination or remember correctly, they get nothing and must take a time-out before the next trial. The trick is that on each trial individuals have the possibility of opting out of the problem and going for a lesser reward with 100% certainty, which also means no time-out before the next trial. Many individuals thus develop a strategy of opting out of only those discrimination or memory tasks that they are especially likely to fail (Hampton, 2001). They seem to know that they do not know or that they do not remember.
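The logic of the opt-out strategy can be made concrete with a small sketch. It is an illustration under stated assumptions, not a model proposed in the studies cited: the function name, the reward values, and the optional time-out cost are all hypothetical, and the sketch simply compares the sure lesser reward with the expected payoff of attempting the test.

```python
def should_opt_out(p_correct: float,
                   big_reward: float,
                   small_reward: float,
                   timeout_cost: float = 0.0) -> bool:
    """Return True when the sure small reward beats attempting the test.

    All parameters are hypothetical illustration values:
    p_correct    -- the subject's own estimate of answering correctly (0..1)
    big_reward   -- payoff for a correct discrimination or memory response
    small_reward -- payoff obtained with certainty by opting out
    timeout_cost -- cost of the time-out that follows an error
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between 0 and 1")
    # Expected payoff of attempting: win big with probability p_correct,
    # otherwise get nothing and pay the time-out cost.
    expected_attempt = p_correct * big_reward - (1.0 - p_correct) * timeout_cost
    return small_reward > expected_attempt


# An easy trial (high estimated accuracy) versus a hard one (low accuracy).
easy_trial = should_opt_out(p_correct=0.9, big_reward=4.0, small_reward=1.0)
hard_trial = should_opt_out(p_correct=0.2, big_reward=4.0, small_reward=1.0)
```

The point of the sketch is that the rule only works if the subject has some estimate of its own likelihood of success, which is what selective opting out on difficult trials suggests.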
In another paradigm involving chimpanzees, individuals either do or do not see the process of food being hidden in one of several tubes. When they see the hiding process, they choose a tube directly. When they do not see the hiding process, they go to some trouble to look into the tubes and discover where the food is located before choosing. Again, the apes seem to know when they do not know, or at least when they are uncertain, and seek to do something about it. Interesting for interpretation, variables that affect this process in apes are the same as those that affect it in humans: they are more likely to seek extra information if the reward is highly valuable or if it has been longer since they acquired the information (Call, 2010). Thus, as they are assessing situations and deciding what to do, if apes self-monitor and find that they have insufficient information to make an effective decision, this prompts them to gather information as prerequisite to their choice.
The interpretation of these experiments is not totally straightforward, but the apes clearly are doing some kind of self-monitoring and evaluation, just as they do in all of their intelligent decision making. What is new here is that they seem to be monitoring not just imagined actions and their imagined results, or imagined causes and their imagined outcomes, but also their own knowledge or memory—which they then use to make inferences about their likelihood of behavioral success. Great apes and other primates thus have some kind of access, at least in instrumental contexts, to their own psychological states. And even if this is not the fully human version of self-reflection (as we will argue later that it is not—because it lacks a social/perspectival dimension), this ability adds further to the conclusion that great apes are skillful with all three of the key components—abstract cognitive representations (models), protological inferential paradigms, and psychological self-monitoring and evaluation—that constitute what can only be called thinking.
Cognition for Competition
Many theorists continue to maintain a kind of Cartesian picture of the difference between humans and other animals: humans have rational thinking, whereas other animals, including great apes, are simply stimulus-response machines with no ratiocination whatever. This view is held not only by behaviorist psychologists but also by many otherwise very thoughtful philosophers and cognitive scientists. But it is a factually incorrect view, in our opinion, and one that is grounded in an erroneous theory of the evolution of cognition (Darwin, 1859, 1871). Cognitive evolution does not proceed from simple associations to complex cognition, but rather from inflexible adaptive specializations of varying complexities to flexible, individually self-regulated intentional actions underlain by cognitive representations, inferences, and self-monitoring. The empirical research reviewed here (there is much more) clearly demonstrates, in our view, that great apes operate in this flexible, intelligent, self-regulated way—and they do so without language, culture, or any other forms of human-like sociality.
This is not to say, of course, that the interpretations of the studies cited here are the only possible ones. Thus, some theorists would contest the conclusion that great apes understand causal and intentional relations (e.g., Povinelli, 2000; Penn et al., 2008), proposing instead that they operate with some kind of noncognitive “behavioral rules.” Others would propose that instead of causal, intentional, and logical inferences, great apes (just like rats and pigeons) operate only with associations (e.g., Heyes, 2005). And skepticism about cognitive self-monitoring in apes and other animals abounds (e.g., Carruthers and Ritchie, 2012). But behavioral rules (whose nature and origins have never been specified) cannot account for the flexibility with which great apes solve novel physical and social problems (Tomasello and Call, 2006). And association learning takes many dozens of trials to be effective, which does not accord with the speed and flexibility with which great apes solve such problems in experiments (Call, 2006). Although the empirical data are less clear-cut in the case of cognitive self-monitoring, Call’s (2010) finding that the same factors affect the process in humans and great apes strongly suggests that, in concrete situations at least, the apes are genuinely self-monitoring the decision-making process.
In any case, our natural history of human thinking begins with this possibly somewhat generous account of great ape thinking. To summarize, thinking comprises three key components, and great apes operate in cognitively sophisticated ways with each of them.
Schematic Cognitive Representations
First is the ability to work with some kind of abstract cognitive representations, to which the individual assimilates particular experiences. According to the best available evidence, great ape abstract cognitive representations—categories, schemas, and models—have three main features.
IMAGISTIC. Great ape cognitive representations are iconic or imagistic in format, based on processes of perceptual and motor experience (for the proposal that human infant cognitive representations are also iconic in format, see Carey, 2009; Mandler, 2012). It is difficult to imagine what else they could be.
SCHEMATIC. Great apes’ imagistic representations are generalized or abstract: schematizations of the organism’s perceptual experience of exemplar situations or entities (i.e., they have type-token structure). Importantly, iconic or imagistic schematizations are not uninterpreted “pictures” but, rather, amalgams of already understood (made relevant to existing cognitive models) exemplars. Thus, when Wittgenstein is searching for what could underlie our most basic processes of understanding, he speculates “imagining the fact”: representing to ourselves the more general and already meaningful cognitive model to which the current situation is best assimilated as exemplar. Such cognitive models are already meaningful because a schematized and generalized understanding of causality and/or intentionality is part and parcel of many of apes’ cognitive models of situations.
SITUATIONAL CONTENT. Great ape cognitive representations have as their most basic content situations, specifically, situations that are relevant to the individual’s goals and values (e.g., that food is present or that a predator is absent). Obviously, representational content structured as whole situations prefigures, in some important sense, human propositional content (though it is not there yet). Apes can also, for specific purposes, schematize their experience with components of situations such as objects and events (e.g., a category of “figs”).
Causal and Intentional Inferences
The second key component is the ability to make inferences from cognitive representations. Great apes use their cognitive categories, schemas, and models productively to imagine or infer nonactual situations. These inferences have two main characteristics.
CAUSALLY AND INTENTIONALLY LOGICAL. Great ape inferences are based on their general understanding of causality and intentionality; they are causal and intentional inferences. But importantly, they still have logical structure (they form paradigms) based on facility with a kind of protoconditional (inferences between cause and effect in both the physical world and social world) and a kind of protonegation (based on mutually exclusive polar opposites or contraries, such as presence-absence). Apes thus possess proto-versions of everything from modus tollens to disjunctive syllogism.
PRODUCTIVE. Great apes’ cognitive representations and inferences are productive or generative in that they can support off-line simulations in which the subject infers or imagines nonactual situations (Barsalou, 1999, 2008). Nevertheless, some theorists might still doubt whether great ape thinking meets Evans’s (1982) generality constraint. In this linguistically inspired account, each potential subject of a thought (or sentence) may be combined with multiple predicates, and each potential predicate may have multiple subjects. To do this nonlinguistically, an individual must be able not only to relate represented situations to one another but also to extract their components and use them in productive combinations to imagine novel situations.
With respect to a particular agent doing multiple things, great apes know, for example, that this leopard does lots of things like climb trees, eat chimpanzees, drink water, and so on. Indirect evidence for this is the fact that great apes pass object permanence tasks in which they must understand that the same object is going different places and doing different things (Call, 2001), and also that they can predict what particular individuals will do in situations based on their past experience with them (Hare et al., 2001). Further evidence is the fact that in experiments great apes individuate objects, so that if they see a particular object go behind a screen, they expect to find that particular object there, and if they see it leave and another replace it, they do not expect to find it there anymore—and if two identical objects go behind the screen they expect to find two objects. They are not “feature placing,” but rather, they are tracking the self-same object or objects engaging in different actions across time (Mendes et al., 2008).
With respect to different individuals doing “the same thing,” great apes know such individual things as leopards climb trees, snakes climb trees, monkeys climb trees—each in their own way. Here things are a bit more difficult evidentially because there are few if any nonverbal methods for investigating event schemas like climbing. But one hypothesis is that a nonverbal way of establishing an event schema is imitation. That is, an individual who imitates another knows at the very least that a demonstrator is doing X and then they themselves can do X—“the same thing”—as well (and perhaps other actors also). Although imitation is not their frontline strategy for social learning, great apes (at least those raised by humans) are nevertheless capable of reproducing the actions of others with some facility in some contexts (e.g., Tomasello et al., 1993; Custance et al., 1995; Buttelmann et al., 2007). Some apes also know when another individual is imitating them, again suggesting at least a rudimentary understanding of self-other equivalence (Haun and Call, 2008). But imitation involves just self and other. Since apes understand the goals of all agents, an alternative hypothesis might be that apes schematize acts of climbing based not on movements but on an understanding that the actor has the goal of getting up the tree—and that goal (not actions per se) provides the basis for an event schema across all individuals, with or without the self.
Great ape cognition thus goes at least some way toward meeting the generality constraint, although productivity may be limited. The claim would be that great ape productive thinking enables an individual to imagine, for example, that if I chase this novel animal it might climb a tree, even if I have never before seen this animal climb a tree. On the other hand, it may be that an ape could not imagine something contrary to fact (i.e., contrary to its causal understanding), such as a leopard flying, as humans are able to do with the aid of external communicative vehicles. Apes’ sense of self-other equivalence may also be limited by the fact that imitation takes place sequentially, whereas much better for establishing self-other equivalence are situations in which the equivalence is manifest in a single social interaction simultaneously (e.g., role reversal in the collaborative activities of humans).
Behavioral Self-Monitoring
The third key component of great ape thinking is the ability to self-monitor the decision-making process. Many animal species self-monitor, and even anticipate, the outcomes of their behavioral decisions in the world. But great apes do more than this simple behavioral self-monitoring.
COGNITIVE SELF-MONITORING. Great apes (and some other primate species) also know when they have insufficient information to make an informed behavioral decision. As noted, monitoring outcomes is a basic prerequisite of a self-regulating system, and monitoring simulated outcomes is a characteristic of a cognitive system capable of thinking before it chooses. But monitoring the elements of the decision-making process itself—one’s memory or powers of discrimination or the environmental information available—is a still further enrichment. Self-monitoring of this type implies some kind of “executive” oversight of the decision-making process itself.
And so we may imagine a common ancestor to humans and other great apes. Its daily life was like that of extant nonhuman apes: most waking hours spent in small bands foraging individually for fruit and other vegetation, with various kinds of social interactions, mostly competitive, interspersed. Our hypothesis is that this creature—and also probably australopithecines for the ensuing 4 million years of the human lineage—was individually intentional and instrumentally rational. It cognitively represented its physical and social experience categorically and schematically, and it made all kinds of productive and hypothetical inferences and chains of inferences about its experience as well—all with a modicum of cognitive self-monitoring. And so, the crucial point is that well before the emergence of uniquely human sociality, much less culture, language, and institutions, the foundations for human thinking were securely in place in humans’ last common ancestor with other apes.
Individual intentionality is what is needed for creatures whose social interactions are mainly competitive, that is, creatures that act on their own or, at most, join in with others to choose sides when there is a good fight going on. In virtually all theoretical accounts, great apes’ skills of social cognition evolved mainly for competing with others in the social group: being better or quicker than groupmates at anticipating what potential competitors might do, based on a kind of Machiavellian intelligence (Whiten and Byrne, 1988). And indeed a number of recent studies have found that great apes utilize their most sophisticated skills of social cognition in contexts involving competition or exploitation of others as opposed to contexts involving cooperation or communication with others (e.g., Hare and Tomasello, 2004; see Hare, 2001). Great apes are all about cognition for competition.
Human beings, in contrast, are all about (or mostly about) cooperation. Human social life is much more cooperatively organized than that of other primates, and so, in the current hypothesis, it was these more complex forms of cooperative sociality that acted as the selective pressures that transformed great ape individual intentionality and thinking into human shared intentionality and thinking. Our task now is thus to provide a plausible evolutionary narrative that can take us from humans’ great ape ancestors all the way to modern humans. The shared intentionality hypothesis is that this story comprises a two-step evolutionary sequence: joint intentionality followed by collective intentionality. At both of these transitions the overall process was, at a very general level, the same: a change of ecology led to some new forms of collaboration, which required for their coordination some new forms of cooperative communication, and then together these created the possibility that, during ontogeny, individuals could construct through their social interactions with others some new forms of cognitive representation, inference, and self-monitoring for use in their thinking.