If it is difficult to construct stimuli per se that can only be decoded within a pictorial mode while excluding the other modes, using pictures in a referential task is another way to investigate the ability. This does not eliminate the problem that pictures seen in reality mode, or even in surface mode if training is allowed, can also be used in such a task. But reference is an important piece of the puzzle, and if subjects can solve referential tasks using e.g. photographs while they cannot decode pictures that are low in realism or depict more dynamic content, we can deduce that the problem is not predominantly one of reference, but depends more on the properties of the medium.
Apes have been shown to have problems following indexical cues in a number of experiments (for a review, see e.g. Byrnit, 2005). An indexical cue is one where nearness to the target singles it out, of which close-range pointing is perhaps a paradigmatic example. In order to test whether iconic information would enhance the salience of indexical cues, as well as being informative as a cue on its own, Herrmann et al. (2005) conducted what is so far the largest pictorial study with apes in terms of participants. Twenty-seven chimpanzees, bonobos, gorillas, and orangutans at the Wolfgang Köhler Primate Research Center in the Leipzig Zoo, Germany, were subjected to several versions of an object-choice task.
The first task was simply to choose one of two cups of which the one baited with food was indicated by a photograph or a replica placed on top of it. The photograph (in colour) and the replica (of rubber) depicted the content of the cup. Nothing was placed on the cup that was empty in this experiment. The placement of a cue on one of the cups was made either in view, or occluded from the subject’s sight. Of the 27 subjects, 11 managed to choose above chance in at least one of the conditions. Of the 12 chimpanzees only one succeeded. Four of the six orangutans, two of the five gorillas and all four bonobos succeeded in the task. A clear species difference thus suggests itself in this study, where especially common chimpanzees fell short.
While above chance in three conditions, as a group the subjects did not perform above chance when the replica was used in hidden placement. This implies that the photograph afforded different information than the replica. When the actual placement (irrespective of medium) in full view of the subjects was enough to inform them of the correct choice, this may have been based on other information than the iconic one. In the hidden condition the replica lost its cue value while the photograph retained its value. (Or acquired new value.) This time the subjects probably used their recognition of food in the photograph and made their choices accordingly.
The authors suggest that the replica, due to its three-dimensionality, was too much an object in itself to serve as a representation, which is in accordance with the dual-representation hypothesis of DeLoache. Somewhat contrary to this explanation, I rather believe that the photograph was successful just because it was perceived as even more “real” than the rubber banana used in this experiment. The photograph used might have captured more of “banana-ness” than the rubber replica did. This effect would have been enhanced by some experience the subjects had of the replica, but not the photograph, before the experiment. Namely, to make sure that the subjects could tell the difference, they had been given a choice between a real banana and the rubber banana. If they chose the replica they were given this to smell and touch. The subjects thus knew beforehand that the rubber object on top of one of the cups was not remotely close to an edible banana. (Or any banana.) The perceived iconicity was probably lost with this information. This might be the reason why so many of the 27 subjects failed altogether. They might of course have failed simply because they could not understand indexical cueing throughout the experiment, which is not uncommon for novel cues (Tomasello et al., 1997), but to complicate the matter they might also have failed to recognise the value of the iconic information of both the replica and the photograph. This information might otherwise have helped them as it did the subjects who performed better with the photograph than the replica. However, this experiment cannot arbitrate between performing on indexical versus purely iconic grounds.
The second experiment in Herrmann et al. (2005) takes care of this problem by placing photographs or replicas on both cups for the 11 successful subjects from the previous experiment. The distractor items could depict either colourful non-edible objects or fruits other than the target ones. Also in this test the subjects got to inspect the replicas but not the photographs prior to testing. It was found that the subjects as a group chose above chance in all conditions except when replicas were used and the distractor item was another fruit. Herrmann and colleagues attribute the surprisingly good performance with object, as opposed to fruit, distractors in the replica condition to the individual scores of three orangutans and one bonobo who seemed to develop a fondness for fruit replicas. When another object was the distractor item they just went for the replica that they found most interesting. Thus they might have appreciated the likeness to fruit, but they did not necessarily use this information in relation to the baiting of the cups. With photographs, four subjects performed above chance with object distractors and two with fruit distractors. The task was in other words difficult.
Overall, the subjects in both experiments show great variation between conditions, which suggests that factors other than a general “iconic” ability, as defined by the experimenters who chose the stimuli, confounded the responses. A detailed analysis of the individual stimulus items is probably necessary in order to conclude what was going on. In the second experiment only one subject, a gorilla, managed all four conditions. Had only one subject of an original 27 gotten the hang of iconicity, irrespective of medium? It seems so. But Herrmann et al. (2005) were not satisfied. They also wanted to see if iconicity could work independently from indexicality. Instead of using pictures to label a content, they wanted to use iconicity to communicate the right choice.
In the third experiment very different looking cups and boxes were used to hide the food, and the correct choice was cued by holding up a photograph between the containers depicting the correct hiding place. The fourth experiment was similar, but this time the containers were transparent and both contained food. However, only one of them could be opened, and the photograph held up by the experimenter indicated which food it was possible to get hold of. In both these experiments none of the 11 subjects succeeded above chance. But perhaps with time they would, since there was a clear learning effect between the start and the end of testing.
Herrmann et al. (2005) conclude that when indexical information is removed, great apes cannot easily substitute this with the communicative intent of the experimenter. That is, the “reason” for using iconic information in the first experiments was the pictures’ or replicas’ closeness to the containers, while the “reason” in the last two experiments was the helpfulness of the experimenter. How to discover these connections between pictures and baited cups is thus not merely a pictorial problem, but depends on how pictures manifest themselves in relation to other things, such as cups, experimenters and experimenters’ minds. When no relations can be discovered, pictures seem to lose their meaning, i.e. usefulness. Pictures in a non-pictorial mode, it might be added. In a fully pictorial mode reference is part of the picture concept.
Even though one might not necessarily read others’ communicative intentions into a situation, relationships that are not physically salient can still follow from the mere fact that one views a picture as being about something other than itself. In this case the referential act is a private act.
Herrmann et al. (2007) repeated one part of the above study, the combining of iconic and indexical information in an object-choice task, as part of a large battery of tests administered to over a hundred children (2.5 years), chimpanzees and orangutans. Unfortunately they lump the results of the iconic test with two other tests (pointing and looking cues) in their report. For this combined group of trials the chimpanzees and orangutans were 63% and 65% correct respectively. The human children were significantly better at 84% correct.
In a study by Tomasello et al. (1997) only chimpanzees and orangutans that had experience with human pointing or the placement of a marker could solve an object-choice task that involved these cues. Showing a replica of the container that harboured a reward was not informative for the subjects, mirroring the findings in the third and fourth experiments in Herrmann et al. (2005). Not understanding the communicative intent of the experimenter was invoked as an explanation in both studies. It should be noted that in Tomasello et al. (1997) only seven out of forty-eight children were above chance when a replica of the correct target item was used as a communicative cue. Human children arguably had extensive experience of toys and replicas compared to apes. The dual-representation hypothesis of DeLoache is mentioned as an explanation, i.e. that the replicas were too interesting as objects in their own right to serve as signs for something else, but the lack of indexical information is also blamed. The conclusion is thus that “[…] any problems children had did not concern the comprehension of communicative intentions, but rather concerned their understanding of how the particular sign functioned in the context of this particular game” (Tomasello et al., 1997, p. 1078). Applying a double standard, the data for the apes, although “not definite on the issue,” was judged to be indicative of a lack of comprehension of communicative intention, while the data for the children was blamed on the stimuli.
Although framed as a test of the ability of language-trained and non-language-trained chimpanzees to delay gratification, Beran et al. (1999) is, like the above experiments, also a direct comparison between stimulus types, of which photographs are one, as well as a test of understanding the referential nature of these.
Subjects were two chimpanzees trained in the use of lexigrams (i.e. arbitrary graphic symbols), Lana (Rumbaugh, 1977) and Sherman (Savage-Rumbaugh, 1986), and a non-language-trained control subject, Mercury, from the Language Research Center of Georgia State University, USA. All had experience in cognitive tasks of various sorts. The present task was to say no to an immediate reward in favour of one that was given three minutes later. Following training, three conditions were given: the immediate and deferred food visually present, the respective foods being designated by laminated photographs, and the foods being represented only by their lexigrams.
In a control session aimed at making sure that the subjects understood the stimuli, they were given a choice of a photograph of their preferred food or a photograph of a less preferred food. They all chose the picture of the preferred food. Likewise, the two lexigram-competent subjects chose lexigrams that designated their preferred food before lexigrams designating their non-preferred food. Apprehending the motif of photographs was thus not a problem for any of the subjects. Likewise, when the preferred food, photograph or lexigram was put in the immediate reward position, which was a bowl by a bell-button that was to be pressed if one wanted the contents of the bowl, all subjects pressed the button to receive the immediate reward. When the preferred food instead was in the delay position, i.e. a bowl further away whose content was given to the subject only if it had refrained from ringing the bell for three minutes, the story was very different. All three subjects managed to inhibit the desire for the immediate food in order to receive the delayed food in about half of the trials or less. Likewise for the lexigrams, performance was low but significant, but only for the symbolically trained subjects Lana and Sherman. When it came to photographs, only Sherman reliably delayed gratification. In fact he was as good as when the reward bowls were baited with actual food. Lana and Mercury could not delay gratification when photographs were used but pressed the button before the three minutes were up.
It seems that the two subjects that did not delay gratification in the photograph condition not only failed to see the connection between the photographs and the foods that were given in reward, they also failed to see the photographs as real food. If they had performed in a total reality mode, they should have delayed a response to a photograph as easily as to a real food item. But they did not. They seem to have been stuck between two modes. On the one hand they differentiated the photographs from real food, while at the same time apprehending their content, but on the other they could not attribute a referential function to the pictures in the task at hand. It did not occur to them that the photographs designated foods that were to be given later, in place of the pictures. That the food was placed directly in the bowls, while the photographs and lexigrams were placed against the front of the bowls, might possibly have contributed to this effect. The correspondence might have been clearer if the photographs had also been placed in the bowls.
The authors explain the lack of pictorial performance with the subjects’ rearing histories. Both chimpanzees had extensive experience with pictures as enrichment items, but neither was “trained or exposed to photographs as representational symbols, and [they have] not used them as such during [their] lifetime” (Beran et al., 1999, p. 125). Sherman on the other hand, the only one who delayed in all three conditions, learned during his early training that “not only lexigrams but also photographs and labels could represent other things in the world. Therefore, for Sherman, a photograph or lexigram representing a food produced results the same as having the food itself present” (p. 125). This makes sense. A more puzzling finding in Beran et al. (1999) is that all three subjects pressed the bell when the preferred photograph or lexigram was in the immediate bowl, especially since Mercury did not know lexigrams. Perhaps it was simply due to the fact that they did not see a point in waiting for the delayed reward, because it was non-preferred, regardless of what was attributed to the immediate reward bowl. Thus, it need not mean that they understood the role of photographs or lexigrams in the immediate condition but not in the delayed condition.
Sherman’s linguistic as well as pictorial training is described in Savage-Rumbaugh (1986), and a crucial part of it in Savage-Rumbaugh et al. (1980). In this seminal paper, Reference: The Linguistic Essential, Sherman and his companion Austin, as well as Lana, are required to sort objects, photographs, and lexigrams as “food” or “tool” into two bins.
Lana, at the time 8 years old, and Sherman and Austin, 5 and 4 years old respectively, had very different language training. It is probably this diverse background that Beran et al. (1999) refer to. Pictures were also likely used differently in the two projects. Lana’s training with lexigrams focused on symbol sequencing and object naming (see Rumbaugh, 1977), while Sherman and Austin had been involved in the pragmatic use of symbols in communication (see Savage-Rumbaugh, 1986). Consequently the relationship between lexigrams and objects tended to look very different for the subjects in the two projects. Sherman and Austin had been required to ask for objects when they specifically needed them in a problem-solving situation. For this reason their vocabulary became heavily tied to the respective use of objects. If they were not allowed to manipulate objects they initially had problems naming them. Lana, on the other hand, could readily name objects that she had only visual access to, but it did not easily occur to her that she could use these names to request objects in other contexts. Her training had been to use specific sequences of lexigrams to request foods and favours of her trainers and her computer, but these interactions did not usually start with a problem to be solved.
The result of their diverse training manifested itself in the sorting study of Savage-Rumbaugh et al. (1980). Following extensive training, all three chimpanzees could sort six objects either as “food” or “tool,” physically into two bins, and also by naming them with the lexigrams “tool” and “food.” Given 10 novel objects, Sherman and Austin categorised them correctly on trial one. Lana, on the other hand, identified only three items, suggesting that she had not attributed the concepts “tool” and “food” (or something like “non-edible” and “edible”), but had learned to pair individual objects with lexigrams associatively. However, when required to sort the 10 novel objects into the food and tool bins, without labelling, she sorted them all correctly. Thus it seemed that the problem did not lie in conceptualising the two categories, but in encoding them in terms of lexigrams.
Sherman and Austin were given a further 28 items which they could sort without difficulties as “tools” and “food.” This ability generalised to photographs for Sherman but not for Austin.46 Despite training both of them to criterion by taping photographs to the respective objects, and then requiring them to label the photographs on their own, only Sherman continued doing so when 10 novel photographs were presented. Austin had for some reason not treated the novel photographs, and probably not the training photographs either, as “representations of real objects” (Savage-Rumbaugh et al., 1980). Austin seemed to have learned photographs as Lana had learned lexigrams.
However, the problem was found to be attributable to the medium. The pictures were enclosed in thick plastic casings which produced artefacts, such as reflecting light. While Sherman reduced these by moving his head to get a clear view, Austin never did so, suggesting that it had never occurred to him that something informative was lurking in there.47 When the experimenters encouraged Austin to look more carefully and slowly rotated the stimuli so as to give him the opportunity to catch the content, Austin correctly identified novel photographs. This is a simple but important finding that must be kept in mind in all analyses of negative results with pictorial stimuli. The medium can obscure the message.
The last step was the most critical of the study. Sherman and Austin were to label lexigrams with lexigrams, the first experimentally controlled display of completely detached symbolic manipulation in a nonhuman species. But before the crucial test, training was again employed. Lexigrams were taped to photographs of objects and both were classified as either “tools” or “foods.” Then lexigrams alone were labelled. When, in the crucial test, novel items were interspersed among the training items, Sherman and Austin correctly labelled these as “tools” and “foods” on their first attempts.
It is noteworthy that Savage-Rumbaugh and colleagues seem to have chosen photographs as a middle step, or a bridge, between objects and lexigrams. However, from a representational viewpoint, there is nothing middle about such a step. If the photographs were seen as icons they would be on a par, in terms of being signs, with symbols, i.e. the lexigrams. If, on the other hand, they were seen as iconicities devoid of sign function, performance would be in the realm of actions with real objects and the photographs would only be additional training items. Photographs used in e.g. a sorting or naming task need not be seen as representations at all. In fact, Austin’s confusion with the plastic medium strongly suggests that he did not intuitively ascribe useful iconic information to sheets of plastic. Not until he could recognise something familiar in them did they become useful, not the other way around.
46 Lana was dropped from the experiment after her failure to apply the “tool” and “food” lexigrams.
47 Three-year-old human children are quite poor at compensating for inadequate viewing angles of pictures (Olson et al., 1980).